What's good and what's bad about profiling nowadays

Alexander Kitaev

Anton Katilin / Co-founder and CEO / YourKit

Anton Katilin was formerly a lead developer on the IntelliJ IDEA team. A year ago, he left JetBrains to start YourKit LLC. Co-founder and CEO of the new company, Anton is also one of the creators of the YourKit Java Profiler, an innovative new profiling tool for Java™.
Contact Anton at: anton at yourkit.com

There are lots of developers who already understand the necessity of performance tuning of Java applications. They know about profilers and most likely use them in their everyday work. These people reasonably ask: why should we use this tool or that one, why is one better than another, and, finally, which one is the best?

1. Approach to memory profiling

Let's start from the beginning. The first critical issue is memory profiling. A developer of a reasonably big project often faces memory-related performance problems. The program starts "eating" memory and/or does not reclaim it when it should. Surely, there are profilers to handle this. But the lion's share of them have one common drawback: first, it takes a very long time to start a program under the profiler, and then you have to wait a long time until it reaches the memory leak state. What makes it even worse, the leak may happen only under very rare and even unknown, obscure conditions, and often no one has any idea how to reproduce the problem. It would be great to be able to analyze the heap of any application as soon as a memory problem is found. Not long ago this was impossible, because one could not always run an application with a profiler - it would have led to unacceptable performance degradation. That was the case, at least, with the profilers that existed before. Then I thought: isn't there indeed a way to take all the necessary information from a running Java application when it is needed, without making the application slow as hell during its entire run time? After a series of experiments, my colleague and I managed to find the way out.

The key is not to rely on time-consuming processing of events such as object creation or destruction by the garbage collector, i.e. not to record object allocations. To find and fix a memory leak, it is not that interesting to know who created an object, but rather who holds it, so this approach works perfectly. Instead, there can be an optional ability to turn allocation recording on and off when needed, for those who need it (I'd better say 'want' or 'think they need' it). Frankly speaking, I personally do not find it useful to know where a particular live object was created. The useful aspect of allocation recording is finding the code that produces a lot of 'garbage', i.e. temporary objects. Modern garbage collectors work fast, so temporary objects are not that big a problem. Still, this optimization matters, because it always takes less time for a garbage collector not to do a job at all than to do it, even if it is done fast.
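To make the "who holds it" point concrete, here is a minimal, hypothetical sketch of a typical leak. The class and names are invented for illustration: the allocation site is perfectly ordinary code, and only the holder (a static cache that is never cleared) explains why the byte arrays are never reclaimed.

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical example of a leak where the holder, not the allocation
    // site, is what matters.
    public class SessionCache {
        // Entries are never removed, so the map grows without bound.
        private static final Map<String, byte[]> CACHE = new HashMap<String, byte[]>();

        public static void store(String sessionId) {
            // Allocation happens here, in unremarkable code...
            CACHE.put(sessionId, new byte[64 * 1024]);
        }

        public static void main(String[] args) {
            // ...but the question that solves the leak is "who holds all
            // these byte[] instances?" - the answer is the static CACHE map.
            // This loop eventually fails with OutOfMemoryError.
            for (int i = 0; ; i++) {
                store("session-" + i);
            }
        }
    }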

This deep analysis of the memory profiling problem produced an effective result: we have developed such a tool (namely YourKit Java Profiler), with which an application can run with NO overhead and is ready to tell everything about its memory state exactly WHEN it is needed. Currently YourKit is the only profiler that addresses the issue of memory profiling adequately and in a user-friendly way.

2. Approach to CPU profiling

An approach similar to the one discussed in the previous section can be applied to CPU profiling. An application starts and runs at full speed. When the need arises, the profiler is activated; it starts recording profiling data, which of course may cause a performance overhead. This lasts until the application finishes the particular task you are interested in, and then you capture a snapshot with all the collected information. The application goes its way, again at full speed. You go your way - open the collected information in the profiler UI and study it.
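As an illustration only, this workflow can be expressed as a tiny controller-style interface. The ProfilerControl name and its methods are hypothetical, invented for this sketch; they are not YourKit's actual API.

    // Hypothetical interface illustrating the "run at full speed, measure
    // only when needed" workflow. The names are invented for this sketch.
    public interface ProfilerControl {

        // Begin collecting CPU profiling data; until this call the
        // application runs without measurement overhead.
        void startCpuProfiling();

        // Stop collecting, so the overhead goes away again.
        void stopCpuProfiling();

        // Save everything collected so far into a snapshot file that can
        // be opened later in the profiler UI; returns the file path.
        String captureSnapshot();
    }

    // Typical usage around the task of interest:
    //
    //   control.startCpuProfiling();
    //   runTheTaskYouAreInterestedIn();
    //   String snapshotPath = control.captureSnapshot();
    //   control.stopCpuProfiling();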

The purpose of the tool matters. A useful profiler should not be just a tool primarily targeted at showing beautiful views that jump synchronously with the application being analyzed, with no regard to how much this "live show" slows the application down. Instead, it should help find the slow parts of a running application and show, or at least hint at, how to make them faster. This idea became the basis of the YourKit profiling ideology.

This may easily be taken for radicalism, and radicalism is not always good. That is why we are currently working on support for useful "telemetry" information displayed on the application's timeline. It will be an addition to the main "start - wait - capture snapshot" approach.

3. Filters

In general, measuring is intrusive. That's a law of nature: no measuring tool can measure without affecting the thing it measures, either in the software world or in the real physical world. Capturing a memory snapshot may pause the profiled application for a couple of seconds. CPU sampling adds a very small, practically imperceptible, but still non-zero constant overhead caused by periodically inquiring into the stacks of running threads. When the CPU is profiled with tracing (which handles method entries and exits), or when the option to record object allocations is used, the intrusion is more significant.
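To show why sampling is so cheap, here is a toy sketch (not YourKit's implementation) of the idea: a background thread periodically asks the JVM for the stacks of all running threads and records them. The profiled code itself is never instrumented, so the only cost is the periodic stack walk.

    import java.util.Map;

    // Toy CPU sampler: periodically collects the stacks of all threads.
    // A real profiler would aggregate whole stacks into a call tree; this
    // sketch just prints the topmost frame of each thread.
    public class NaiveSampler implements Runnable {
        private final long intervalMillis;

        public NaiveSampler(long intervalMillis) {
            this.intervalMillis = intervalMillis;
        }

        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Map<Thread, StackTraceElement[]> stacks = Thread.getAllStackTraces();
                    for (Map.Entry<Thread, StackTraceElement[]> entry : stacks.entrySet()) {
                        StackTraceElement[] frames = entry.getValue();
                        if (frames.length > 0) {
                            System.out.println(entry.getKey().getName() + ": " + frames[0]);
                        }
                    }
                    Thread.sleep(intervalMillis);
                }
            } catch (InterruptedException ignored) {
                // stop sampling when interrupted
            }
        }

        public static void main(String[] args) {
            Thread sampler = new Thread(new NaiveSampler(10), "sampler");
            sampler.setDaemon(true);
            sampler.start();

            // Some dummy work so there is something to sample.
            long sum = 0;
            for (long i = 0; i < 500000000L; i++) {
                sum += i;
            }
            System.out.println("done: " + sum);
        }
    }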

One approach to reducing this intrusion is to narrow the scope of code (methods, classes) that is measured. The code you measure still runs slower, but the rest of the code runs fast. So you can skip profiling the uninteresting parts of an application and profile only the interesting parts. This approach is widely used in different profilers.

It's all about saving your time. But... the saving is doubtful.

This approach works, but there's a problem with it. Which came first: the chicken or the egg? I should be interested in profiling the slow parts of my application, the particular code that causes performance problems. But how can I know which part of my code is "problematic" and which is not, i.e. which classes and entry points I should profile, other than by learning this from profiling results? Of course, sometimes it is known a priori, but sometimes it is not, because I cannot hold all the details of my project in my head. Even libraries, the obvious candidates for skipping "by default", should sometimes not be skipped. For example, in my own practice there were cases when, without analyzing the behavior of core Java library classes (thanks to Sun, the sources are available), it was very hard to understand the reason for a performance problem.

As a workaround, this approach can be applied iteratively: launch a profiling session, analyze the results, reduce the scope of profiled code, launch another session, and so on. But:

  • This takes human time and human attention, which are more valuable than computer time. Where's my time saving?
  • What if I realize I need to change the profiled code scope, but cannot re-launch the profiling session, e.g. because the problem is very hard to reproduce (or no one knows how to reproduce it)?

In YourKit Java Profiler, we decided to use the alternative approach: in short, no filtering at runtime. And I bet it does save time:

  • Profiling scope is reduced by reducing time periods of heavily intrusive profiling:
    • in most cases, the heavily intrusive profiling modes, such as allocation recording and CPU tracing, are not used at all: allocation recording is optional for memory profiling, and sampling as a CPU profiling method gives good results in most cases with very small overhead
    • the heavily intrusive profiling modes can be turned on only when needed, and for limited periods of time
  • A human needs to think less up front, and that's good. Snapshots contain all the data, and filters are only a UI option, so you can always change your mind without re-launching the measuring session (see the sketch below). A human has the right to make a mistake and to change his mind.
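Here is a minimal sketch of the "filter at analysis time" idea, under the assumption that a snapshot simply stores every captured frame. The SnapshotView class and its names are invented for this example; the point is that a filter is just a view over already-collected data, so changing it never requires another profiling session.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical snapshot view: the snapshot keeps every frame, and a
    // filter is applied only at analysis time, in the UI.
    public class SnapshotView {
        // Fully qualified method names captured in the snapshot.
        private final List<String> frames;

        public SnapshotView(List<String> frames) {
            this.frames = frames;
        }

        // Return only the frames from the given package. This is purely a
        // presentation-level filter; the underlying data stays complete,
        // so the filter can be changed at any time.
        public List<String> filtered(String packagePrefix) {
            List<String> result = new ArrayList<String>();
            for (String frame : frames) {
                if (frame.startsWith(packagePrefix)) {
                    result.add(frame);
                }
            }
            return result;
        }
    }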

4. IDE integration

Tools should be integrated; otherwise they are not easy to work with. A developer's tools should be integrated into the developer's IDE.

What tasks should a profiler's IDE integration cover? Basically, there are two:

Figure 1.
  • launching the profiled application from the IDE with minimal, or better yet, no inconvenience
  • providing easy access to the source code of the profiled application from profiling results
1. It is logical to seamlessly incorporate profiler-specific commands and settings into the native IDE user interface:
a) It is convenient to add the Profile action to the IDE menus, so that it can be invoked like any other command. For example, in IntelliJ IDEA, YourKit adds the Profile action where the similar Run and Debug actions are located (namely under the main Run menu, in context menus, and on the toolbar):
Figure 2.
b) Additional profiler-specific settings should be incorporated into the IDE Run Settings, as is done, for example, in the IntelliJ IDEA integration:
Figure 3.
c) The output console should not differ from the ones provided by the IDE. It should look familiar, in order not to cause any confusion. Taking the IntelliJ IDEA integration as an example, the profiler console is exactly the same as the one used by the Run and Debug commands: it looks and behaves the same way:
Figure 4.
2. Many profilers still have code access problems, as they can open source code only in their own editors, which is not convenient at all. The first thing that crosses one's mind when talking about quality integration is the ability to view the source code (starting from the profiling results) exactly where one usually works with code, i.e. in the native IDE editor. At the same time, there should be a dedicated standalone UI for analyzing profiling information, specifically designed for convenient output. The last two statements may sound mutually exclusive, but when this theory is turned into reality, it becomes simply the right tool.

So, these were the two major issues. And there are obviously more requirements for a good profiling tool.

To summarize the above, I would say that the following main features define an integrated profiler that covers the most important developer needs:

  • quick and transparent setup process
  • support for the most popular IDEs (e.g. IntelliJ IDEA, Eclipse, or Borland JBuilder)
  • use of IDE-specific UI, possibly along with own standalone UI
  • one-click ability (with a dedicated shortcut) to launch the profiled application from the IDE
  • use of the native IDE editor for showing the source code from profiling results

The only question that might still remain concerns the use of the standalone UI. This is, in some sense, a matter of preference, and many developers would prefer inlined views. We are already working on this feature, to provide the option for those YourKit users who do not want to leave their IDE at all.

5. Usability

Feature sets of products with the same purpose are often similar. But products are not equally easy and comfortable to use. Details make the difference.

The usability aspect of a profiling tool is worth mentioning. There are several major usability drawbacks often found in different profilers:

  • Many existing profilers are mouse-centric, while the keyboard is a developer's main working device. Vendors of big IDEs, having analyzed why many developers still prefer text editors, have started applying the keyboard-centric approach in their products. The same applies to other everyday tools that developers work with, and profilers are no exception.
  • Working with typical profilers, a user is forced to configure the existing project parameters once again: directories for compiled classes, sources, the jars used, and so on. Is it really a good idea to make a profiler a kind of "launching center" for the profiled application? It seems much easier to add a tiny bit of profiler-enabling configuration to the existing setup, rather than make the user repeat all the project configuration steps again just for the profiler. This can be achieved by adding just one parameter to the application/project.
  • Another inconvenience is found in profilers that open the "problem code" in their own windows, while it is obvious that a developer would prefer the ability to work on code in the familiar environment. Opening the native IDE editor provides access to all the advanced IDE features, which makes the fixing process really user-friendly.
  • When it comes to manual configuration for profiling an application (for whatever reason), correctly specifying all the necessary parameters takes much time and is not always an obvious procedure. For example, setting the bootclasspath to enable profiling in an application directly can sometimes be a rather painful and error-prone task. Why not make things easier, for instance by just adding a command line parameter that points to the profiler agent library? (A sketch of this follows the list.)
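As a hedged illustration of the "one parameter" idea: on a modern JVM, a native profiler agent is normally loaded with the standard -agentpath (or -agentlib) JVM option. The library path below is a placeholder; the actual file name and any agent options depend on the particular profiler, platform, and its documentation.

    java -agentpath:/path/to/profiler-agent-library MyApplication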

Having analyzed the existing drawbacks and possible ways to get rid of them, we realized that applying these solutions would definitely brighten the lives of those developers who value usability and comfort in their daily work.

All of these usability improvements, as well as many others, are already implemented in YourKit. And there are more to come.

These were just a few thoughts of mine. If you want to discuss them, you are welcome to our forums (http://forums.yourkit.com).

Anton Katilin / YourKit