The post Improving Test Efficiency with AWS Lambda appeared first on Duda Blog.
In end-to-end (E2E) automated testing systems, there are several challenges that keep automation teams busy. One major challenge is integrating the tests with a continuous integration platform. E2E tests can be slow, difficult to maintain, and easily broken on the long path through the twists and bends of the tested system (front-end, back-end, database, inter-service communication, etc.).
It’s no surprise, then, that many teams struggle to keep their test suite execution time in the minutes zone, often drifting into hours. Delaying the development process until all the tests end is counterproductive. So, how do companies solve this problem? They split the tests into a “sanity check” that includes minimal coverage, and push most tests to a once-a-day execution.
At Duda, we’ve developed an infrastructure that enables us to execute the full automation suite in 10 to 15 minutes (and even faster). We achieved this by carrying out tests on AWS Lambda.
Before we dive into how we execute tests inside a Lambda, let’s address the foundation that makes this feasible. If your tests are not rock-solid, this solution will not help. In order to go quickly, you have to go well. See this post about how to get your tests super stable and unbreakable with the WWHB design pattern. Only strong tests can withstand the speed of Lambda.
First, a few words about Lambdas
As defined in Wikipedia, “AWS Lambda is an event-driven, serverless computing platform provided by Amazon as a part of the Amazon Web Services.”
Let’s break this down bit by bit.
Serverless: A Lambda is a container running Amazon Linux that can execute code that we write. The advantage over traditional servers is that we don’t maintain the server. We just ask it to execute the code and Amazon takes care of the rest. It handles all the administration and scalability. From our point of view, we treat a Lambda as a snippet of code that we can call as part of our system.
Event-driven: As Lambdas are part of AWS, they connect with Amazon S3 buckets, Amazon DynamoDB tables, and other Amazon cloud services. This enables a Lambda to trigger after a database update or a file uploaded to S3.
The code running in the Lambda is referred to as a Lambda Function, a term taken from programming languages that can treat a function as a first-class object. This architectural approach can create a more decoupled system and thus one that is more modular and robust.
AWS Lambdas can act in a similar way by extracting your functionality to an external computing platform with minimal connections to your core product. Once your code is extracted, Amazon can invoke the Lambda, return its result to your code, and shut the Lambda down. If you need to scale this code, Amazon takes care of it by invoking more and more Lambdas simultaneously, like an army of working ants.
A Lambda does have its limitations. Its disk space is limited, and so is the time it is allowed to spend executing your code. So, we have to keep the code small and quick.
Now, let’s talk about Selenium in Lambdas
At Duda, we write Selenium tests in Java, with TestNG as the test execution framework. We have more than 1,000 tests in our suite, and each test can be executed autonomously (as discussed in the post I mentioned above).
To execute a test inside a Lambda, we have to get the test code into the Lambda, make Chrome and Chromedriver available there, and finally run the test. Let’s split this into two parts: first Chrome and Chromedriver, then the test code itself.
The easy part is getting the Chrome browser and the Chromedriver executable into the Lambda before execution. As mentioned above, the code you write to be executed in a Lambda is the Lambda function. You’ll need to upload the latest version of Chromedriver for Linux and a special build of headless Chrome for AWS Lambda (available in a dedicated GitHub project) to a location in AWS S3 storage. Your Lambda function will begin by downloading these artifacts to its local storage.
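The staging step can be sketched roughly as follows. This is a minimal illustration rather than Duda’s actual code: the `ArtifactStager` class and its `stage` method are hypothetical names, and the `InputStream` stands in for the body of an S3 download, which is left out to keep the example dependency-free. In a Lambda, the writable location is `/tmp`, so that is where the target directory would normally live.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: places a downloaded artifact on local disk.
class ArtifactStager {

    /**
     * Copies one artifact into targetDir and optionally marks it executable.
     * The source stream is assumed to be the content fetched from S3.
     */
    static Path stage(InputStream source, Path targetDir, String fileName, boolean executable) {
        try {
            Files.createDirectories(targetDir);
            Path target = targetDir.resolve(fileName);
            // Overwrite a stale copy left by an earlier run in a reused container.
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            if (executable && !target.toFile().setExecutable(true)) {
                throw new IOException("Could not mark as executable: " + target);
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A Lambda handler would call `stage` once for Chromedriver and once for the headless Chrome binary, both with `executable` set to `true`, before starting the test.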
To get the test code into the Lambda, it also has to be in S3 storage. You can achieve this by packaging all the test code and its infrastructure into a single executable (at Duda, a single JAR (Java ARchive)). The entry point must receive the name of the test to execute as an argument. In our case, we built a Main class as the entry point; it takes the test name and runs that test with the TestNG framework. That is the only test executed in the Lambda. In other words, the JAR contains all the tests, but each Lambda executes just one. By utilizing the power of parallel execution with Lambdas, we can decrease the total suite time dramatically.
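To make the idea of a single-test entry point concrete, here is a small sketch. It is not our production Main class: to stay dependency-free it runs the named test with plain reflection instead of TestNG, and the `ClassName#methodName` argument format, the `SingleTestEntryPoint` class, and the `SampleTests` class are all assumptions made for the example.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical entry point: runs exactly one test, named as "ClassName#methodName".
class SingleTestEntryPoint {

    /** Runs one no-argument test method; returns true if it finished without throwing. */
    static boolean runSingleTest(String testName) {
        String[] parts = testName.split("#");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected ClassName#methodName but got: " + testName);
        }
        try {
            Class<?> testClass = Class.forName(parts[0]);
            Constructor<?> constructor = testClass.getDeclaredConstructor();
            constructor.setAccessible(true);
            Object instance = constructor.newInstance();
            Method method = testClass.getDeclaredMethod(parts[1]);
            method.setAccessible(true);
            method.invoke(instance);
            return true;
        } catch (InvocationTargetException e) {
            // The test (or its constructor) threw: report it as a failed test.
            System.err.println("FAILED " + testName + ": " + e.getCause());
            return false;
        } catch (ReflectiveOperationException e) {
            // The class or method could not be found or created: a bad test name.
            throw new IllegalArgumentException("Cannot run test: " + testName, e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: <ClassName#methodName>");
            System.exit(2);
        }
        System.exit(runSingleTest(args[0]) ? 0 : 1);
    }
}

// Hypothetical tests, standing in for the real suite packaged in the JAR.
class SampleTests {
    void passingTest() { }
    void failingTest() { throw new AssertionError("expected failure"); }
}
```

The point of the design is that the same JAR serves every Lambda; only the argument changes from one invocation to the next.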
Now we need to invoke the Lambda function, which in turn pulls Chrome, Chromedriver, and the test JAR from S3 and finally executes the test. We implement this in a class we call ExecuteLambdaTest. It uses the AWS SDK for Java to invoke the Lambda with a JSON object containing the single test name to execute, then waits until the Lambda finishes and returns a JSON object as its response. We can run this from multiple threads, thus executing multiple Lambdas at once.
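The fan-out can be sketched like this. It is an illustration under assumptions, not the real ExecuteLambdaTest: the `ParallelLambdaRunner` class, the `testName` JSON key, and the `ERROR` response are invented for the example, and the actual Lambda call is hidden behind a plain `Function<String, String>` (request JSON in, response JSON out) so the sketch runs without the AWS SDK.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical runner: one blocking Lambda invocation per test, spread over a thread pool.
class ParallelLambdaRunner {

    /** Builds the JSON request for one test. The "testName" key is an assumed field name. */
    static String buildPayload(String testName) {
        String escaped = testName.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"testName\":\"" + escaped + "\"}";
    }

    /**
     * Invokes every test through the given invoker and returns test name -> response.
     * In a real setup the invoker would wrap the AWS SDK call that invokes the Lambda.
     */
    static Map<String, String> runAll(List<String> testNames, int threads,
                                      Function<String, String> invoker) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit everything first so the invocations overlap.
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (String name : testNames) {
                pending.put(name, pool.submit(() -> invoker.apply(buildPayload(name))));
            }
            // Then wait for each response in submission order.
            Map<String, String> responses = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> entry : pending.entrySet()) {
                try {
                    responses.put(entry.getKey(), entry.getValue().get());
                } catch (ExecutionException e) {
                    // The invocation itself failed; record an assumed error response.
                    responses.put(entry.getKey(), "{\"status\":\"ERROR\"}");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted waiting for " + entry.getKey(), e);
                }
            }
            return responses;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each thread only waits on a remote call, the pool can be far larger than the number of CPU cores on the machine that drives the suite.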
All that is left is to collect the test reports from all the Lambdas in S3 storage and show the final status of the suite.
The three components for orchestrating all this: S3 storage, AWS Lambda and Jenkins
After implementing the basic mechanism, you can add many more improvements, such as a test retry after failure, unified reporting system, reuse of the Lambda container, and more.
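As one example of such an improvement, a retry after failure can be as small as the sketch below. The `RetryingTest` class and `runWithRetry` method are hypothetical names, and the `BooleanSupplier` stands in for "invoke the Lambda for this test and report whether it passed".

```java
import java.util.function.BooleanSupplier;

// Hypothetical retry helper for a single test invocation.
class RetryingTest {

    /**
     * Runs the attempt up to maxAttempts times.
     * Returns the 1-based number of the attempt that passed, or -1 if none did.
     */
    static int runWithRetry(BooleanSupplier attempt, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return i;
            }
        }
        return -1;
    }
}
```

Returning the attempt number rather than a plain boolean lets the report flag tests that passed only on a retry, which is worth tracking since they point at stability problems.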
Highlight of the testing improvements
This system resolves several issues, and leads to many testing improvements, including:
1. Short distance between Selenium & Browser
With Selenium Grid, the test code runs on a CI server or a similar machine. It sends its actions through RemoteWebDriver to the Hub server, which tries to share the load across the nodes holding the browsers. That is a very long distance for each web action to travel.
In traditional testing, the distance each web action must cover is long and takes time.
With our system, the test is executed inside the Lambda, and the browser sits on the same machine, giving a much better response for each action and speeding up each test.
By using the Lambda, both the time and the distance are much shorter.
2. Parallel scalability
With Selenium Grid, you are limited to the number of nodes you manage. With Lambda, you can invoke as many tests as you want at once. The bottleneck becomes the server you are testing, which must be able to handle that number of simultaneous users.
3. Parallel precision
With Selenium Grid, the hub manages a queue of tests and you cannot intervene. After some testing, we found that the Grid does not cope with high loads, and we don’t get the parallel tests we expect. Because we’ve implemented our own parallel test executor, we know exactly what holds us back, and how to break barriers.
At Duda, we implemented our system by writing all the code ourselves for better control of execution speed. Our implementation of Selenium in Lambda significantly improved execution time without damaging stability, which is exactly what we intended.
I hope this blog post inspires your team to try it out and helps your automation play a more significant role in the dev process.