Analyzing the Memory Consumption of Eclipse

During my talk at the JUG Karlsruhe about the Eclipse Memory Analyzer, I used the latest build of Eclipse 3.4 to show live that there is room for improvement in Eclipse's memory consumption.
Today I will show you how easy this kind of analysis is with the Eclipse Memory Analyzer.

I first started Eclipse 3.4 M7 (running on JDK 1.6_10) with a single project, "winstone", which contains the source of the winstone project (version 0.9.10).


 

Then I took a heap dump using the JDK 1.6 jmap command.
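
The jmap invocation itself is not reproduced here. As a different route to the same kind of HPROF file, a HotSpot-based JVM can also dump its own heap through the HotSpotDiagnosticMXBean. This is only a sketch and not what was used for this analysis; the class name HeapDumper is made up for illustration, and the lookup via getPlatformMXBean assumes Java 7 or later.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Eclipse or the Memory Analyzer.
final class HeapDumper {
    private HeapDumper() {}

    /**
     * Writes an HPROF heap dump of the current JVM to the given file.
     * The file must not exist yet; recent JDKs also expect the name
     * to end in ".hprof".
     */
    static void dumpHeap(File target, boolean liveObjectsOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.getAbsolutePath(), liveObjectsOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing true for liveObjectsOnly restricts the dump to reachable objects, which usually keeps the file smaller. The resulting file can be opened in the Memory Analyzer like any other HPROF dump.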

Since this was a relatively small dump (around 45 Mbyte), the Memory Analyzer parsed and loaded it in a couple of seconds.

On the "Overview" page I already found one suspect. The spellchecker (marked in red on the screenshot) takes 5.6 Mbyte (24.6%) out of 22.7 Mbyte of overall memory consumption!
That's certainly too much for a "non-core" feature.
Looking at the spellchecker in the dominator tree reveals that the implementation of the dictionary used by the spellchecker is rather simplistic.
No trie, no Bloom filter, just a simple HashMap mapping from a String to a List of spell-checked Strings.

There's certainly room for improvement here by using one of the advanced data structures mentioned above.
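
To give an idea of what such a structure could look like, here is a minimal Bloom filter sketch for word membership tests. It is not the Eclipse spellchecker's code and not a proposed patch; the class name WordBloomFilter, the sizing parameters, and the choice of hash functions are assumptions made for illustration.

```java
import java.util.BitSet;

// Hypothetical sketch: answers "might this word be in the dictionary?"
// using a fixed number of bits instead of storing the words themselves.
final class WordBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    WordBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(String word) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(word, i));
        }
    }

    /** Never returns false for an added word, but may return true for a word that was not added. */
    boolean mightContain(String word) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(word, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: derive the i-th bit position from two base hashes.
    private int index(String word, int i) {
        return Math.floorMod(word.hashCode() + i * secondHash(word), size);
    }

    // FNV-1a style hash over the characters, forced odd so the step is never zero.
    private static int secondHash(String word) {
        int h = 0x811C9DC5;
        for (int j = 0; j < word.length(); j++) {
            h ^= word.charAt(j);
            h *= 0x01000193;
        }
        return h | 1;
    }
}
```

The trade-off is that a Bloom filter can report false positives and cannot produce correction proposals on its own, so it would only cover the "is this word known?" part. A trie would be the better fit where prefix lookups or suggestions are needed.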

My favorite memory consumption analysis trick


Now comes my favorite trick, which almost always turns up some memory to optimize in a complex Java application.
I went to the histogram and checked how much memory is retained by String instances.


12 Mbyte (out of 22.7 Mbyte), quite a lot! Note that 4 Mbyte of that comes from the spellchecker above (how I computed that is not shown here), but that still leaves 8 Mbyte for Strings.
The next step was to call the "magic" "group by value" query on all those Strings, which showed how many duplicates of each String there are.

Duplicates of Strings everywhere


What does this table tell us? It tells us, for example, that there are 1988 duplicates of the same String "id" and 504 duplicates of the String "true". Yes, I'm serious. Before you laugh and comment on how silly this is, I recommend you take a look at your own Java application :] In my experience over the past few years, this is one of the most common memory consumption problems in complex Java applications.
"id" and "name", for example, are clearly constant unique identifiers (UIDs). There's simply no reason why you would want that many duplicates of UIDs. I don't even have to check the source code to claim that.

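The idea behind grouping by value can be sketched in a few lines of plain Java. This is not how the Memory Analyzer implements its query (it works on the heap dump, not on live objects); the class and method names below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "group by value": count how often each
// String content occurs and list the most duplicated ones first.
final class DuplicateStrings {
    private DuplicateStrings() {}

    static Map<String, Integer> countByValue(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    /** Returns only values that occur more than once, sorted by count, highest first. */
    static List<Map.Entry<String, Integer>> mostDuplicated(List<String> values) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(countByValue(values).entrySet());
        entries.removeIf(e -> e.getValue() < 2);
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return entries;
    }
}
```

Fed with three separate "id" instances, two "true" instances, and one "name", mostDuplicated would list "id" with a count of 3 first, then "true" with 2, and leave out "name".
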
Let's check which instances of which class are responsible for these Strings.
I called the immediate dominator function on the top 30 dominated Strings.

org.eclipse.core.internal.registry.ConfigurationElement seems to cause most of the duplicates: 13,242!

If you look at the instances of ConfigurationElement, it's pretty clear that there's a systematic problem in this class. So this should be easy to fix by using, for example, String.intern() or a Map to avoid the duplicates.
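
A minimal sketch of the Map-based variant is shown below. It is not taken from the Eclipse source; StringPool and canonical are hypothetical names, and where such a pool would be hooked into ConfigurationElement is left open.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical pool: hands out one shared instance per distinct String content.
final class StringPool {
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();

    /** Returns the pooled instance equal to value, adding value to the pool if it is new. */
    String canonical(String value) {
        if (value == null) {
            return null;
        }
        String existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    int size() {
        return pool.size();
    }
}
```

Two separately created "id" Strings passed through canonical come back as the same object, so only one copy needs to stay reachable. String.intern() gives the same effect with a JVM-wide table, while a pool like this one stays under the application's control and can be dropped once it is no longer needed.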

Bashing Eclipse?


Now you may think that this guy is bashing Eclipse, but that's really not the case.

If you beg enough, I might also take a closer look at NetBeans :]


(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

eckes replied on Tue, 2008/05/20 - 6:44pm

Thanks for the talk (in Karlsruhe) and the good follow-up article. BTW: the SDN version does not seem to have the Dashboard overview. So you already showed us the 3.4 Eclipse version in Karlsruhe? Missed some of the reports.

AndreasBuchen replied on Wed, 2008/05/21 - 2:13am

With the open sourcing, we deliver the Memory Analyzer only via Eclipse: http://www.eclipse.org/mat/downloads.php (via Update Site or as a stand-alone RCP application).

We also created an Update Site on SDN which serves the NetWeaver extensions (e.g. extract and analyze Server Sessions, Caches, etc.).

Markus Kohler replied on Wed, 2008/05/21 - 2:48am in response to: eckes

Hi Bernd, (I'm guessing ;) )

Thank you too for attending :)

Well, I have to admit that I was not aware that the Dashboard is only available in the unreleased version of the SAP Memory Analyzer that I used.

It does have the same features as the Eclipse Memory Analyzer plus some SAP-specific commands (which I didn't show). I used it because I trusted it more than the new Eclipse Memory Analyzer.

I just learned from Andreas that we now have an update site for those plugins.

Regards,

Markus

AndreasBuchen replied on Wed, 2008/05/21 - 4:04am in response to: mkohler

Just to clarify: the "SAP Memory Analyzer" ceased to exist. It was reborn at Eclipse. That's where you find the latest downloads and that version also includes the dashboard.
