SophieK: Understanding the basics behind a profiling report

Hi !

So in my last post about the VM profiler, I mostly described how to install the profiler in Pharo and roughly how to read the textual report.

This time, I will go through a report in detail and I will try to explain how does the profiling work and refer to relevant articles. The profiling report can be divided in 3 parts, and each part will be the subject of a blog post. I won't go into the details of the VMProfiler code ; if you're interested in that part or if you already are familiar with the VM profiler, I suggest you read this article from Eliot Miranda which gives you a more precise insight about how the profiler works.

So, let's begin :

Part 1 - General data about the profiling

Let's profile the following block: 1 to: 100000000 do: [ :i | Object new ]. What does the first part of the report say ?

/Users/sophiekaleba/Documents/WIP-part1/Pharo/Pharo.app/Contents/MacOS/Pharo 2017-07-10 15:35:15
eden size: 3,801,936 stack pages: 50 code size: 1,048,576
1 to: 100000000 do: [ :i | Object new ]
gc prior. clear prior.
1.409 seconds; sampling frequency 1326 hz
1838 samples in the VM (1868 samples in the entire program) 98.39% of total
1757 samples in generated vm code 95.59% of entire vm (94.06% of total)
81 samples in vanilla vm code 4.41% of entire vm ( 4.34% of total)

/Users/sophiekaleba/Documents/WIP-part1/Pharo/Pharo.app/Contents/MacOS/Pharo 2017-07-10 15:35:15

That's the path of the VM used, and the date and time of the profiling. Fairly easy for now.

eden size: 3,801,936 stack pages: 50 code size: 1,048,576

Ok. A bit more complex for the uninitiated.

The eden size (in bytes) refers to the size of the "young space" in the virtual machine (Cog) memory. This is the space where newly allocated objects are stored. If you want to learn more about how the Cog memory is divided and why, I deeply recommend you read this article.

This data is relevant in this report because, in some cases, the size of the eden can cause an increase of execution time : for instance, if the Eden is too small, and the running program is actually allocating a very large amount of objects, it can trigger a lot of scavenges (basically, moving the stored objects in a particular space in the Eden). The scavenges are performed once a specific threshold is reached in the eden.

Stack pages. The VM representation of the stack is a linked list of stack pages. By default, the number of stack pages is set to 50. Like the eden size, this piece of information is significant, as in some specific cases, 50 pages are not enough and could lead to an increase in execution time.

Code size. This one deals again with the Cog memory (you can refer to the same article). It refers to the size of the Cog method zone, i.e. the memory space containing trampolines/enilopmarts, Cog methods, CogFullBlocks, PICs, and a linked list of elements with young referents.

This zone, when full, can be compacted which can have an impact on execution time, especially if compactions occur frequently. It is possible to modify the Code size to reduce the amount of compactions and speed up the execution time.

1 to: 100000000 do: [ :i | Object new ]

The block you profiled.

gc prior. clear prior.

gc prior. a GC is performed just before the profiling starts, to make sure we start from a fresh environment (so we know that if a GC is performed during the profiling, it is actually caused by the execution of the block )

clear prior. Besides the GC, some other operations are performed before the profiling starts. It includes making sure that the buffer containing the samples is cleared, initializing the profiling time and the data structure (a Bag) that will later hold the samples.

1.409 seconds; sampling frequency 1326 hz

Total time of the profiling. And the sampling frequency, usually between 1,3 and 1,4 Khz.

1838 samples in the VM (1868 samples in the entire program) 98.39% of total

Total number of samples gathered during the profiling. A sample is basically an instruction pointer, pointing at the currently executed function. So we can simplistically say that a sample refers to a function. I will be more specific about that part in another post (soon).

A difference is made between the "samples in the VM" and the "samples in the entire program" : indeed, some samples are not referring to functions in the VM, or are simply not identified. These are the 30 samples difference you can notice above.

All the samples are then split between 3 categories :

the samples in generated vm code
the samples in vanilla vm code
the remaining samples

1757 samples in generated vm code 95.59% of entire vm (94.06% of total)
This is the total amount of samples in generated vm code, that is to say all the samples referring to a function / trampolines / enilopmarts... in the Cog method zone.

81 samples in vanilla vm code 4.41% of entire vm ( 4.34% of total)

This is the total amount of samples in vanilla vm code, that is to say all the samples referring to a function in the C code of the VM (typically, the interpreter and the garbage collector).

That's all for this part, if you spot any mistake or have any question, feel free to comment !

SophieK

jeudi 17 août 2017

Understanding the basics behind a profiling report - Part 1: general data

Part 1 - General data about the profiling

Aucun commentaire:

Enregistrer un commentaire