Anyone familiar with testing in LibreOffice knows that for a long time we have had one huge weak point in our testing infrastructure. Until now we were more or less unable to do any UI testing which means we had no chances to test a fairly big part of our code base. We have existing tests for import and export, for performance problems, for normal low level classes and even two frameworks for our UNO API. More or less for everything except for our UI we already have a solution that works quite well.
This post will outline why it took until now to add the UI testing framework and how the design tries to solve some of the weak points of existing UI testing approaches.
Difficulties of UI testing
I have been discussing different designs for an UI testing framework for a few years with other developers at various conferences but never felt like one of the designs would not cause more problems than it solves. I’ll outline below how the new UI testing framework solves some of the problems that I have with other concepts.
My list of requirements for a good (UI-)testing framework are:
- Easy to add a test: This is even more important in an UI testing framework where we ideally want the UX team to add test cases as well.
- Stable across unrelated changes: I don’t want to see failing tests because someone changed an unrelated part of the code. In the UI case this also includes that the test should not fail because someone renames a button or moves an UI element. This requirement is meant to reduce the maintenance costs of the tests.
- As close to the code paths triggered by an user as possible: We want to test the code that is used in production and not some random test code.
- Does not need random sleeps: Needing to add sleeps to deal with asynchronous events makes executing the tests brittle and includes random false positives.
- Low number of false positives: Obviously if the number of false positives is too high developers start to ignore test failures.
This is a quite complex list and even the proposed solution might not always be able to fulfill all the requirements but at least it should improve compared to the other considered solutions. The most common solutions for UI testing that people propose are using the accessibility framework or hard coding the path to UI objects. However in my opinion both solutions have some problems that make them unsuitable as testing concepts in LibreOffice.
The hard coded path approach obviously is not stable across UI changes and becomes brittle across different OS versions. The big advantage compared to many other approaches is that you can in theory just record an user interaction and replay it during the tests. Also it would work with any version of LibreOffice without any changes but you pay with increased maintenance costs.
The accessibility based approach suffers from our poor quality of the accessibility code (some people will argue that it is a chance to clean it up) and that the code is not that close to what happens if a non-accessibility user does an action. Sadly the code is not yet stable so at least for a long time until we manage to improve the accessibility code significantly we would suffer from regularly failing tests which in the long run means that developers start to ignore the tests.
Compared to the hard coded path approach the accessibility idea is already an improvement from the maintainability based approach and is used in quite a few programs. However it still does not completely solve the problem of changing dialogs and it is much mroe complicated to write a test.
To solve the problems outlined above the new framework trades upfront costs for easier maintenance later. From a mile away the UI testing approach is based on a bit of newly written introspection code interacting with a testing framework in python through a simple mostly un-typed UNO interface. To identify objects we use the ids that we introduced for loading dialogs from UI files.
The introspection code is a small bit of code around UI objects providing a simple standard interface to each UI object. The interface provides methods to get state and properties of the UI objects as well as sending commands to the objects. For a button a property could be a label and an action could be sending a click command. The introspection library would then call the method on the VCL button object that is called by a real button click.
Obviously we are still not directly taking the same code paths with this introspection code as a real mouse click but we can get as close as possible as any automation framework that is not sending real mouse events can get.
Introspection code for many commonly used UI objects is already available and adding new code for UI objects that are not yet covered is simple. More details on how to add such introspection objects will be added to the wiki page soon.
To solve the problem that many UI operations happen asynchronously we have added some events (for now only for opening dialogs) and will use events as a way to make the tests more reliable. Existing events that get reused are the events for signaling that a document is ready and that the document has been closed.
Finding an object is solved through the ID that we most of the time get from the ui file. Nearly all of them are locally unique so they make for some great identifiers that stay unchanged even when a dialog is reorganized or a new element is introduced. Obviously this would still cause issues when a button is moved from one container widget to another one and we ask the old one for a child with the ID. However as we assume locally unique IDs we just always walk up to the top level window (either of the dialog or the whole document) and search for there.
Python testing framwork
The actual tests are not written in C++ and while I’m normally not a fan of using different languages for writing tests it seems like the right decision here. The tests only interact with LibreOffice through the UNO interface and we want to lower the barrier for adding UI tests compared to our complicated internal C++ tests (which are still much more useful for anything that you need to debug regularly). Additionally debugging of the UI tests can normally happen by invoking the UI operations manually whereas debugging one of our other tests is bound to some function calls.
The testing framework is split into three parts: the test cases, the framework code specific to UI testing and useful UNO code. Splitting the last part out from the rest allows us to collect useful code snippets that can be used in other places or even by external developers interacting with LibreOffice. An example that is already in the framework code is the code used to start LibreOffice and create a socket connection.
Adding a new test mainly requires to create a class that derives from the main test class and through introspection each method whose names starts with test will be executed (an example; the sleeps are just for visual inspection).
To make the life easier there is a UITest object that provides access to some common operations like opening a dialog and waiting for the event that the dialog is ready.
The implemented test framework trades stability of the tests against less code. For now only the most common UI objects have introspection code, which still covers around 90% of the objects, but there are still corner cases where we need to add more code to the introspection library. Nevertheless you can already write tests for many of the dialogs and can already inspect some of the demo code that I have written.
I’m also still chasing two race conditions that make the tests deadlock or fail. The deadlock case is already handled by a patch that is in gerrit and just not yet pushed to the repository, but the case that Impress tests fail with the new template chooser is still under investigation(the time.sleep in  are currently mandatory).
Another open issue is still how to best organize running the uitests under gdb. There is some support already but sadly it is not working as well as it should.
I’m going to add more documentation in form of blog posts and wiki pages soon but you can already get an idea of how the tests work by inspecting the code in the uitest module.
Executing the tests should mainly work (except for the two problems mentioned above) with a make uicheck or make uitest.uicheck after a successful make.
Please report all problems and suggestions to me so that I can integrate them into the new framework.