LibreOffice crash reporting – An update

Nearly a year ago I wrote a blog post describing the LibreOffice crash reporting setup and how the crash reporting code works. Since then we have released two minor versions with the crash reporter enabled (5.2 and 5.3) including many bug fix releases and release candidates. According to the crash reporting server a total of 27 versions are recognized and it is time to list some of the statistics surrounding the crash reporter.

During the one year of operation we had about 900,000 uploaded crash reports with around 152,000 of them reported in the last 30 days. The number of crash reports might sound like a lot, but a huge amount (around 430,000) have been reported for the 5.2.4.2 release. We introduced a regression in the shutdown code during the 5.2.4.2 development window which resulted in a crash if the clipboard was not empty during shutdown. Due to the crash reporter we were able to fix the crash quickly and release a fixed 5.2.5 version ahead of schedule. Without the crash reporter we would not have noticed the problem that quickly.

During the early 5.3 releases we saw a spike of random crashes in the window generation code that we could not really explain. After some careful analysis, it was concluded that these crashes happen as a result of GDI resource leaks. Amazingly, on windows you are only able to hold references to 10,000 GDI handles (fonts, bitmaps, …). Some of the GDI leaks have been fixed ([1][2]) already but we are still chasing some more. At the time of writing, running out of GDI handles is still the number one source of crashes in the 5.3 releases. In an attempt to make it easier to spot these crashes the number of open GDI handles are now reported as part of the metadata by the crash reporter (example in a recent 5.3.2.2 crash). Starting with the 5.3 release the metadata also contains the localization of the installation as we found a few crashes that appear to be locale dependent.

As the transformation from a stack and memory dump to a human readable stacktrace happens on the server we need symbol information to map stack information to source code files, method names and line numbers. These symbol information currently occupy 97GB of data on the server with many of the symbol information being for system libraries. Despite this large amount of symbol information we have still recorded around 100,000 libraries without associated symbol information. These range from system libraries without corresponding debug information, libraries for multimedia codecs, OpenGL and OpenCL drivers to antivirus solutions that inject code into LibreOffice. As part of the symbolization we received around 11,000 different signatures for crash reports. However many of them are actually crashes caused by the same problem and just crash in different places. As an example the 5.2.4.2 crash during shutdown is known with many dozens if not even hundreds of different crash signatures.

Additionally, 179 bug reports have been connected to the crash reporter and 42 commits to the LibreOffice master branch reference the crash reporter. Most of these are attempts to fix reported crashes. Sadly we don’t have a common way to reference the crash reporter yet so there are surely more commits that I could not find with a quick search.

With the next minor release (5.4), it is expected that the Linux version of our signal handler allows crash report generation even when the LibreOffice signal handler overwrites our crash reporting signal handler. All in all the crash reporter works amazingly well and we have been able to fix a number of of serious crashes already. With more time we hope to be able to fix more of the reported crashes and therefore improve the quality of LibreOffice even more.

 

As always, help in fixing and researching some of the reported crashes is highly appreciated. Especially the more obscure crash reports require detective work to trace them back to their cause.

Additionally help with the crash reporting server is always welcome. The server is written in python with django and some javascript. No C++ or LibreOffice code knowledge required 😉

About Markus Mohrhard

Hacking at Libreoffice calc
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a comment