Easier web app testing by mapping entities to UI objects

Automated, browser-based testing is a key element of web application development, benefiting both simple and complex applications. Writing effective tests for browser-based apps can be a complex, tedious and often repetitive task. In this post, I will discuss a general approach to writing meaningful, loosely-coupled UI tests for web applications by going beyond the Page Object Design Pattern into a more fine-grained approach I call ‘Logical entities to UI object mapping’. The examples are written in Java 8 and use the Selenium and Selenide frameworks.

Layers of web app testing responsibility

In web development, a common component used to perform browser-based testing is Selenium, a suite of frameworks and tools to automate browsers. All code in the Selenium projects is licensed under Apache 2.0. Selenium WebDriver is the most important component: it exposes an API to control several web browsers and browser engines, and includes several language bindings: Java, C#, Python, Ruby, Perl, PHP and JavaScript.

Page Object test design pattern

Selenium and the WebDriver API are very flexible and relatively easy to use. Make sure to check out the Test Design Considerations page of the Selenium documentation for tips on how to structure your code, especially the Page Object Design Pattern section. There is a great post by Martin Fowler on the subject, plus lots of resources and blog posts on the web.

The following example is from the Selenium documentation, showing how coding a test using the straightforward WebDriver API can result in complex code that mixes many concerns:

/**
 * Tests login feature
 */
public class Login {

        public void testLogin() {
                selenium.type("inputBox", "testUser");
                selenium.type("password", "my supersecret password");
                Assert.assertTrue(selenium.isElementPresent("compose button"),
                                "Login was unsuccessful");
        }
}

This kind of code, especially once more selectors and more test logic are added, can quickly turn into hard-to-maintain spaghetti code.

When using the Page Object pattern, the original test code can be refactored into a more object-oriented class with clearer semantics:

/**
 * Page Object encapsulates the Sign-in page.
 */
public class SignInPage {

        private Selenium selenium;

        public SignInPage(Selenium selenium) {
                this.selenium = selenium;
                if(!selenium.getTitle().equals("Sign in page")) {
                        throw new IllegalStateException("This is not sign in page, current page is: "
                                        + selenium.getLocation());
                }
        }

        /**
         * Login as valid user
         * @param userName
         * @param password
         * @return HomePage object
         */
        public HomePage loginValidUser(String userName, String password) {
                selenium.type("usernamefield", userName);
                selenium.type("passwordfield", password);

                return new HomePage(selenium);
        }
}

There are also fair warnings on the tradeoffs involved in applying the pattern, as described in this blog post.

In this case the warning focuses on the risk of grouping unrelated semantics and intent in aggregate ‘page’ classes.

In my case, I also tend to stay away from ‘page’ classes and write UI abstractions around the application’s logical entities. Logical entities are mapped to UI objects and then accessed naturally through their relationships, so user actions and navigation are kept intentional. To keep the code within the domain of UI testing, most UI class methods are explicitly named after common user browser actions like clicking, moving the mouse or entering text into forms. In practice I have found this to be a ‘sweet spot’ that combines useful abstraction, loose coupling, testability and practicality.

The following diagram illustrates the concept for a typical three-level master-detail content management system:

Grouping logical entities in semantic web UI testing, showcasing method names

Next, I will show some specific examples of applying the ‘Logical entities to UI object mapping’ pattern in Java 8, using Selenide on top of Selenium to automate the browser.

Introducing Selenide

Less well-known than it should be, Selenide is a great Selenium WebDriver abstraction layer written in Java that makes it easier to manipulate browsers and write tests than using Selenium alone. It offers a surprisingly intuitive interface. To demonstrate the simplicity, here’s an example from the getting started guide:

public void userCanLoginByUsername() {
  $(".loading_progress").should(disappear); // Waits until element disappears
  $("#username").shouldHave(text("Hello, Johny!")); // Waits until element gets text
}

The interface offered is easy to use and you usually don’t even need to look at the documentation. Waiting for elements to appear is implicit in many parts of the code and Selenide generally works as you would expect. I encourage you to try it, write a few tests and you will be surprised at the simplicity.

Regardless of the added convenience and abstraction of the Selenide library, we can still end up with spaghetti code if we are not careful. For instance, if we use CSS selectors to access a UI element with $(org.openqa.selenium.By seleniumSelector) and later want to change the related classes on the client side, we need to go through all of our testing code and change every CSS selector reference.

Separating UI handling from tests

In an OSS project I’ve been recently working on (Morfeu), I am doing extensive testing of the web UI and I’m using the Logical entities to UI object mapping approach all over the place. For instance, in the project there is a list of catalogues (like a master-detail listing) as a logical application entity. Each catalogue has a list of different catalogue entries, which in turn contain documents (not shown in the example), completing a simple three-layer hierarchy of logical app entities. The code to operate and access the master catalogue list is as follows:

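import static com.codeborne.selenide.Condition.appear;
import static com.codeborne.selenide.Condition.visible;
import static com.codeborne.selenide.Selenide.$;
import static com.codeborne.selenide.Selenide.$$;

import java.util.List;
import java.util.stream.Collectors;

public class UICatalogues {

private static final String CATALOGUE_LIST = "#catalogue-list";
private static final String CATALOGUE_LIST_ENTRY = ".catalogue-list-entry";

public static UICatalogues openCatalogues() {
	return new UICatalogues();
}

public UICatalogues shouldAppear() {
	$(CATALOGUE_LIST).should(appear);	// assumed check: waits for the list to show up
	return this;
}

public UICatalogues shouldBeVisible() {
	$(CATALOGUE_LIST).shouldBe(visible);	// assumed check: the list is displayed
	return this;
}

public List<UICatalogueEntry> allCatalogueEntries() {
	return $$(CATALOGUE_LIST_ENTRY).stream().map(e -> new UICatalogueEntry(e)).collect(Collectors.toList());
}

public UICatalogue clickOn(int i) {
	List<UICatalogueEntry> catalogueEntries = this.allCatalogueEntries();
	int count = catalogueEntries.size();
	if (i>=count) {
		throw new IndexOutOfBoundsException("Could not click on catalogue entry "+i+" as there are only "+count);
	}
	catalogueEntries.get(i).click();	// assumed: the entry exposes a click() method
	return UICatalogue.openCatalogue();
}

}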


To start with, public static UICatalogues openCatalogues() is used to open the catalogues list. It is a static method because one of the constraints is that there is only one catalogue list and it appears as soon as the user loads the application. A static method is a convenient way to access the ‘catalogues’ object instance as opposed to a public constructor, which maps to the application semantics. If there was a need to change the behaviour into requiring user action to load the catalogues (such as a mouse click), the implementation could be changed without changing the caller code. If the change were to be quite radical and significantly change the logical entity relationships, like allowing multiple catalogue lists, the API and caller code could (and should) be refactored to reflect the big change in application behaviour.

The methods UICatalogues shouldAppear() and shouldBeVisible() are pretty self explanatory, and can be used to ensure the catalogue list loads and displays properly.

The methods are quite simple themselves and the usage of Selenide is pretty much self-explanatory.

The next method, List<UICatalogueEntry> allCatalogueEntries(), is used to obtain the list of catalogue entries. It takes advantage of Java 8 streams, mapping the list of found elements into new UICatalogueEntry instances:

public List<UICatalogueEntry> allCatalogueEntries() {
	return $$(CATALOGUE_LIST_ENTRY).stream().map(e -> new UICatalogueEntry(e)).collect(Collectors.toList());
}

A stream of low-level catalogue entry elements found by Selenide is mapped to new catalogue entry instances and then collected into a list using the Java 8 list collector.

UICatalogueEntry’s constructor accepts a SelenideElement to provide each catalogue entry instance with its local context. SelenideElement instances are a wrapper of Selenium elements and are the basic building blocks of testing using Selenide.

The last method UICatalogue clickOn(int i) is a convenience method to click on a specific catalogue entry and is also quite straightforward:

public UICatalogue clickOn(int i) {
	List<UICatalogueEntry> catalogueEntries = this.allCatalogueEntries();
	int count = catalogueEntries.size();
	if (i>=count) {
		throw new IndexOutOfBoundsException("Could not click on catalogue entry "+i+" as there are only "+count);
	}
	catalogueEntries.get(i).click();	// assumed: the entry exposes a click() method
	return UICatalogue.openCatalogue();
}

UICatalogue is also a semantic web UI class that includes the click() method that loads a catalogue, which is returned (in this case also without any parameters in the current implementation).

It should be noted that there are other valid ways to design this. The clickOn method only uses public methods so it could rightly be perceived as ‘client’ code, but as clicking on a catalogue is done very often, this is offered as a way to avoid repetition. It’s important to spend some time thinking about this design, giving consideration to style, operation frequency, potential for repetition, amount of convenience methods and so forth.

As a matter of personal code style, I stay away from get/set semantics to better distinguish UI test logic from typical application code (which commonly employs get/set prefixes). A backend code getter could potentially perform a complex operation, while the UI code just selects a CSS class and reads a value displayed on the page, hence the shorter method name. Following this logic, user action methods like clickOnXXX are the ones performing complex operations like navigating to a different page, so they have a more explicit verb prefix. Of course, getters and setters can be used if that suits your style better.

Using the abstraction UI classes

Usage is also straightforward, like this method:

public void catalogueListTest() throws Exception {
	List<UICatalogueEntry> catalogueEntries = UICatalogues.openCatalogues()
													.shouldAppear()
													.allCatalogueEntries();
	assertEquals(EXPECTED_CATALOGUES_COUNT, catalogueEntries.size());
	assertEquals("Wrong catalogue content", "Catalogue 1", catalogueEntries.get(0).name());
	assertEquals("Wrong catalogue content", "Catalogue 2", catalogueEntries.get(1).name());
	assertEquals("Wrong catalogue content", "Catalogue 1 yaml", catalogueEntries.get(2).name());
}

This code is easy to read, and shows exactly what kind of behaviour is being tested. In this case, it’s testing that once the application is loaded, the catalogues should appear, there should be EXPECTED_CATALOGUES_COUNT entries, and the catalogues should have a particular order and names. Finally, no error should be shown. Also note the readability of catalogueEntries.get(0).name() vs a more verbose catalogueEntries.get(0).getName().

More complex behaviour is also modelled quite well:

UICellEditor stuffEditor = stuff.edit().shouldAppear();
Optional<String> value = stuffEditor.getValue();
assertEquals("Stuff content", value.get());
assertFalse("Should not be able to create a value for this cell",  stuffEditor.isCreateValueVisible());
assertTrue("Should be able to remove value for this cell",  stuffEditor.isRemoveValueVisible());

value = stuffEditor.getValue();

In this case the code is testing a form that contains a value and can be removed entirely (as opposed to cleared) by clicking on a specific widget or button. Once the widget is activated, the form value disappears and is not there anymore.

More examples like these can be found in the Morfeu project tests.


It is definitely useful to abstract web UI testing using techniques like the ‘Logical entities to UI object mapping’ described in this post or the Page Object design pattern. If the right abstraction level is applied correctly, tests are more meaningful, their intent is more obvious and the actual web app implementation can be changed more easily. Tools like Selenide make writing the semantic UI code even easier and, combined with Java 8 stream support, testing code ends up being super-fun to write.


For more examples, please take a look at the Selenide documentation, which shows the direct approach of mixing UI handling and test code:

  public void search_selenide_in_google() {
    $$("#ires .g").shouldHave(sizeGreaterThan(1));
    $("#ires .g").shouldBe(visible).shouldHave(
        text("Selenide: concise UI tests in Java"));
  }

(Note that the example is intended for simplicity and separation of concerns is not the aim of the code). The code is pretty readable and concise, but could be improved by applying the Page Object test design pattern or the one described in this blog post. Alternatively, for smaller tests having the classes in constants or in a test configuration could also be practical.

The Selenide project also has some examples of the more semantic approach using Page Objects here.

provashell – testing shell scripts the easy way

In this post I will describe the provashell project, an Apache 2.0 licensed bash and shell unit testing library using annotations. The library is a self-contained script file with no dependencies that can be embedded or used easily in your project. Extra care has been taken to use POSIX shell features and the code has been tested in the popular shells bash, dash and zsh (latest versions at the time of writing this article). I will add some detail on what drove me to do it, what it does, how it works and some examples.

Unit tests should be everywhere there is code. Tests materialise our expectations of the expected behaviour, prevent obscure bugs, generally induce more elegant designs and also serve as very effective up-to-date documentation. The oft-touted benefits are well worth it.

An area commonly overlooked in unit testing is when the code is “just a script”. This is unfortunately a common misconception: code is just code and should therefore be treated as such, with solid engineering principles and rigorous testing. Here’s a thought experiment: try to tell yourself what the differences between a ‘script’ and ‘proper code’ really are. Is it length? There might be a short piece of code that configures all your company’s firewall rules or backups, and that is quite an important piece of logic, isn’t it? Is it criticality? Working non-critical code tends to end up included in critical systems and by extension becomes ‘mission-critical’ as well. Is it the language it is coded in? Most languages are pretty much logically equivalent (and Turing complete), so if something coded in Python is translated into Java to do the same thing, its essential nature has not changed at all. Is it necessity? We could go on. Code is just code, and it can and should be tested.

Unit testing for bash and shell scripts

Testing bash and shell scripts is unfortunately not that common. As discussed, shell scripts should still be tested thoroughly. There are plenty of shell testing libraries out there, usually bash-specific implementations, with different licenses. They are mature and well-tested implementations. However, I was on the lookout for an Apache 2.0 licensed one that was simple, with no dependencies (such as the latest C++ compiler!), and I could not find one that suited my taste. One never knows anywhere near all there is to know about shell programming (trust me, you do not, especially when taking into account different implementations and so on), so I set about writing one myself that had the outlined characteristics and would also help me learn more about shell coding.

Main features

The specific characteristics of provashell are as follows:

* Be unit-tested itself – Using plain shell test structures to do the assertion tests themselves. One of course can test the basic assert function and then leverage that tested function to check the other assertions, but I wanted to avoid false positives and keep concerns separate. Using provashell’s own test assert functions to test itself results in more elegant code but is potentially confusing when failures occur due to cascading effects. 

* Be as POSIX-compliant as possible – To that effect, the library has been tested in the latest (as of writing) versions of bash, zsh and dash, the latter being quite POSIX-strict. While bash-isms can be very practical when coding and in interactive shell sessions, cross-shell testing is a good code quality exercise which forces engineers to double check lots of assumptions and shortcuts, generally leading to better scripts. Even though I much prefer to use zsh for interactive sessions (especially when paired with toolchains such as the brilliant oh my zsh), once you have the mindset of shell implementations being real programming languages, it is fairly easy to mentally switch to bash or, even better, POSIX shell ‘mode’ when writing persistent scripts. Such mental gymnastics will greatly help if you find yourself working on an old or unfamiliar system with only ksh installed, or something like that.

* Run no user-supplied code – This is an important security characteristic. The very first version of provashell used eval to run assertion checks in a flexible way. This resulted in elegant code, but it also meant that test data could include shell code that would be run by the test framework. This is insecure and should be avoided if there are other solutions at hand, even if they are a bit more complex or less elegant. The latest version does not use eval anywhere in the code, so it will not execute any user or library-user code, except for running the configured setup and teardown functions, of course. Whatever happens in those functions is up to the test developer. In any case, automated unit tests should never include user-supplied or variable data, to significantly lower the risk of attacks through Continuous Integration systems or any automated test routines. Following that strategy, provashell does not run any user-supplied code other than the configured setup and teardown functions and the declared tests, and the assertion functions do not execute any external code to the best of my knowledge (you should check the code yourself anyway, grep -R eval src is your friend here). It goes without saying that test shells should pretty much never be run as root, unless there is a very good reason (which there isn’t).

* Do not do (too much) funky stuff – Try to be as simple as possible and reuse code wherever feasible, so there is little repetition in the test library. It should also be easy to read and understand by any reasonably experienced shell coder. It is worth stressing again that shell scripts are real code and should be treated as such at all times.

* Use annotations to identify tests – Tests can be named any way the developer wants. I like annotations for tests because even though appending ‘Test’ to tests is a really simple convention, sometimes it adds extra cruft to test names in a context where it is not actually needed. For instance, the test called ‘errorIsReturnedOnNetworkTimeout’ is quite self-explanatory and easily understood in a test class context. Having ‘errorIsReturnedOnNetworkTimeoutTest’ does not add much to the definition and extends the name needlessly. It is of course a matter of style and it could be argued that adding annotations to tests adds the same cruft, just in a different place. In any case, provashell uses annotations to identify tests and related functions, which works well and is simple to use. Here’s a summary diagram of all the supported annotations (they need to go in a bash comment line):

provashell annotations
Function annotations defined by provashell
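
To make the ‘run no user-supplied code’ point more concrete, here is a minimal sketch of an equality assertion that works without eval. The name sketchAssertEq is hypothetical and this is not provashell’s actual implementation; it only mirrors the message, expected value, actual value argument order that assertEq takes.

```shell
# Hypothetical eval-free assertion; NOT provashell's real assertEq.
# Arguments: message, expected value, actual value.
sketchAssertEq() {
	# The quoted operands are compared as plain strings and never executed,
	# so test data that happens to look like shell code stays inert.
	if [ "$2" = "$3" ]; then
		return 0
	fi
	printf 'Assertion failed: %s (expected "%s", got "%s")\n' "$1" "$2" "$3" >&2
	return 1
}
```

Since the values only ever appear as quoted arguments to the test builtin, there is no point at which they are parsed as commands, which is the property an eval-based check cannot guarantee.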

Yeah! Give me some shell test examples

Usage is pretty straightforward and can be demonstrated with an example. Imagine we have a function we want to test that counts the number of a’s in a text passed as a parameter. It is probably not a very good implementation but is a good enough example:

countAs() {
	c=$(printf %s "$1" | sed 's/[^a]*//g' | tr -d '\n' | wc -m)
	return "$c"
}

We then have two tests like this:

countAsNormally() {

	countAs 'a'
	assertEq "Does not count one 'a' in a string with only one 'a'" 1 $?

	countAs 'b'
	assertEq "Does not count zero a's in a string with no a's" 0 $?

	countAs 'aaa'
	assertEq "Does not count three straight a's when they are there" 3 $?

	countAs 'abab'
	assertEq "Does not count two a's in a string when they are there" 2 $?

}

countAsEdgeCases() {
	countAs ''
	assertEq 'Does not count empty string correctly' 0 $?
}
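
One caveat worth keeping in mind: countAs reports its result through the function’s exit status, which can only hold values from 0 to 255, so longer inputs cannot be counted reliably this way. A possible alternative, sketched here under the hypothetical name countAsOut (it is not part of provashell or of the original example), prints the count to standard output instead:

```shell
# Hypothetical variant of countAs: prints the count instead of returning it,
# so the result is not limited to the 0-255 range of an exit status.
countAsOut() {
	# Delete every character that is not an 'a', then count what is left.
	# The arithmetic expansion strips the padding some wc implementations add.
	printf '%s\n' "$(( $(printf %s "$1" | tr -cd 'a' | wc -c) ))"
}
```

A test can then capture the output with command substitution, for instance by passing "$(countAsOut 'abab')" as the actual value of assertEq rather than $?.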

Once we have defined the tests we need to source the library like this (using whatever path we have for the provashell file):

. provashell

This will run the provashell code within the current context, causing the tests to be executed as expected, including the annotated pre and post functions.
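
Because the tests run as soon as the library is sourced, checking a script across shells boils down to executing the same test file with each interpreter. A minimal sketch, assuming a hypothetical helper called runInShells and a test file whose path is supplied by the caller (neither is part of provashell):

```shell
# Hypothetical helper: run one test script under every listed shell
# that is installed. Usage: runInShells path/to/tests.sh
runInShells() {
	rc=0
	for sh in bash dash zsh; do
		if command -v "$sh" >/dev/null 2>&1; then
			# A non-zero exit from the script marks the whole run as failed.
			"$sh" "$1" || { printf 'FAILED under %s\n' "$sh" >&2; rc=1; }
		else
			printf 'Skipping %s (not installed)\n' "$sh" >&2
		fi
	done
	return "$rc"
}
```

The helper sticks to POSIX constructs such as command -v, so it should itself run under any of the shells it exercises.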

Complete documentation

Extensive docs and examples can be found at the GitHub provashell project page. In any case, the source is quite short and readable.


The provashell library uses the Apache 2.0 license. Pull requests are welcome so fork away!