Continuous Deployment Archives - Automated Visual Testing | Applitools

What is Jenkins? How to Use Jenkins for CI/CD and Testing

Applitools Team — Fri, 04 Jun 2021 16:54:49 +0000

Jenkins is a popular open source automation server. It’s used to implement Continuous Integration (CI) and Continuous Delivery (CD) for any development project.

CI/CD, a key component of a DevOps strategy, allows you to shorten the development lifecycle while maintaining quality by automating tasks like testing. In short, a successful implementation will help you test more often and deploy faster with high quality.

Jenkins, as a Java-based CI server with strong community support and a huge plugin ecosystem, is a powerful tool for anyone looking to add CI/CD to their project.

History of Jenkins

Jenkins as a project dates to 2011, but its roots extend back to 2004. That year Kohsuke Kawaguchi, a Java developer at Sun Microsystems, built an automation tool to tell him whether his code would break the build before he committed it. The open source tool he built, called Hudson, automated tests and builds and helped him find out much sooner whether his code would work.

In 2011, a dispute between Oracle (which had acquired Sun Microsystems) and the open source project’s community led the community to create a fork named Jenkins. For a while both projects were updated independently, but Jenkins proved to be more popular and Hudson is no longer developed.

What Does Jenkins Do?

Jenkins helps you automate tasks for CI/CD. Once the Jenkins server is configured, you will be able to execute a series of automated tests and builds so that your code is always up to date and valid. Implementing CI/CD through a tool like Jenkins can greatly simplify the process of ensuring a high level of code quality and successful builds. It’s particularly powerful when large development teams are working on a single project, as traditional methods can result in a lot of conflicting code commits that require complex troubleshooting.

Before we dive further into the functionality of Jenkins, let’s take a moment to define continuous integration/delivery. Then we’ll discuss how Jenkins helps you achieve it.

What is Continuous Integration and Continuous Delivery (CI/CD)?

CI is a process that enables you to integrate code changes from multiple developers working on a single project quickly and repeatedly. When a developer commits code, it can immediately be tested. If the tests pass, the code can be integrated into the build, which can similarly be tested and immediately verified. Continuous delivery extends this: every build that passes its tests is kept in a releasable state, so it can be deployed at any time.

Automated tests are typically used alongside automated builds so that the process can be fast, effective and easily repeatable.

How Does Jenkins Help with Automated Testing?

Using Jenkins to implement a CI/CD process can dramatically improve your ability to test and ship a high quality product.

One of the chief benefits is the ability to rapidly discover and fix bugs. If newly committed code introduces a defect into the build, not only is it caught immediately, but it is easy to know whose code caused the error. The problematic code can then be isolated, updated and recommitted quickly.

Getting Started with Jenkins and Jenkins Pipeline

The Jenkins CI server can be run as a standalone application with a built-in Java servlet container, or as a servlet in other Java servlet containers such as Apache Tomcat.

The most popular way to manage CI/CD with Jenkins is through Jenkins Pipeline, which is a suite of plugins that help you integrate continuous delivery pipelines in Jenkins. Essentially, a Pipeline is a linked series of automated steps to be executed by Jenkins. A Pipeline is implemented as code that can be committed to source control, so it can be versioned and reviewed along with the rest of your code.

[Figure: a CD scenario modeled in Jenkins Pipeline, from the Jenkins docs.]

Pipelines can be written manually or through the newer Blue Ocean GUI. There is also an older “classic UI”, but if you want to use a UI, Blue Ocean is recommended.
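To make the idea of a "linked series of automated steps" concrete, here is a small conceptual sketch in Java. It is not Jenkins Pipeline syntax (real Pipelines are defined in a Jenkinsfile), and the class name, stage names and simulated results are all assumptions made for illustration. It only shows the fail-fast behavior you would expect: stages run in order, and the first failure stops everything after it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Conceptual model only: real Jenkins Pipelines are defined in a Jenkinsfile,
// not in Java. Stage names and results here are hypothetical.
class PipelineSketch {

    // Runs stages in order and returns the name of the first stage that fails,
    // or null if every stage succeeds.
    static String firstFailedStage(Map<String, BooleanSupplier> stages) {
        for (Map.Entry<String, BooleanSupplier> stage : stages.entrySet()) {
            boolean passed = stage.getValue().getAsBoolean();
            System.out.println(stage.getKey() + ": " + (passed ? "passed" : "FAILED"));
            if (!passed) {
                return stage.getKey(); // later stages are skipped
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the stages in the order they were added.
        Map<String, BooleanSupplier> stages = new LinkedHashMap<>();
        stages.put("Build", () -> true);
        stages.put("Test", () -> false);  // simulate a failing test run
        stages.put("Deploy", () -> true); // never reached in this run

        String failed = firstFailedStage(stages);
        System.out.println(failed == null ? "Pipeline succeeded" : "Pipeline stopped at " + failed);
    }
}
```

In this run the Deploy stage never executes because Test fails first, which is the property that keeps a broken build from moving further along.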

Extending Jenkins with Plugins

Jenkins is designed to be easily extended via plugins – and over the years a thriving community has created a huge plugin ecosystem. The strength of this community, and subsequently the size of the plugin library, is one of the best things about Jenkins.

You can find a plugin for just about anything you need, as there are already 1500+ plugins contributed by the community (and counting). This is a good place to note that Jenkins is very actively updated and open to community participation.

Many testing tools like Applitools have a plugin that can easily help get your testing and CI/CD working together. Check out the Applitools plugin for Jenkins to see how this works.

Should You Use Jenkins?

Jenkins is a popular open source tool for CI/CD that is free to use. While you may need some server administration skills to configure and monitor Jenkins, there are many advantages to consider. The Jenkins project includes a large plugin ecosystem, the community around it is thriving and it is actively developed. If that sounds appealing to you, then give Jenkins a look for your CI/CD needs.

The post What is Jenkins? How to Use Jenkins for CI/CD and Testing appeared first on Automated Visual Testing | Applitools.

Test Management within Continuous Integration Pipelines

Angie Jones — Fri, 23 Apr 2021 15:35:30 +0000

Once upon a time there were several software development teams that worked on a fairly mature product. Because the product was so mature, there were thousands of automated tests – a mix of unit and web UI tests – that covered the entire product. Any time they wanted to release a new version of the product, which was only a few times a year, they’d run the full suite of automated tests and several of the web UI tests would fail. This was because those tests were only executed during the specified regression testing period, and of course a lot had changed within the product over the course of several months.

The company wanted to be more competitive and release more often, so the various development teams began looking into continuous integration (CI) where the automated tests would run as part of every code check-in. But…there were thousands of tests. And although there were so many tests, the teams were really careful about choosing which tests to automate, so they were fairly confident that all of their tests provided value. So, they ran all of them – as part of every single build.

It didn’t take long for the team to complain about how much time it took the builds to execute. And rightfully so, as one of the benefits they were hoping to realize from CI was fast feedback. They were sold a promise that they’d be able to check in their code and within only a few minutes they’d receive feedback as to whether their check-in contained breaking changes. However, each build took hours to complete. And once the build was finally done, they’d also need to spend additional time investigating any test failures. This became especially annoying when the failures were in areas of the application that different teams worked on. This didn’t seem like the glorious Continuous Integration that they heard such great things about.

Divide and Conquer

Having a comprehensive test suite is good for covering the entire product. However, it posed quite the challenge for continuous integration builds. The engineers looked at how they themselves were divided up into smaller agile teams and decided to break their test suite up to better reflect this division.

Each area of the product was known as a work zone, and if anyone was working on a particular part of the application, that was considered an active work zone. Areas that were not under active development were considered dormant work zones.

The builds were broken up for the respective work zones. Each build would share common tests such as build verification tests and smoke tests, but the other tests in a given build would be only the ones related to that work zone. For example, the Registration feature of the application was considered a work zone, and therefore there was a build that would only run the tests that were related to Registration. This provided nicely scoped builds with relevant tests and reduced execution time.

In addition to the various work zone builds, there was still the main build with all of the tests, but this build was not used for continuous integration. Instead, this build would run periodically throughout the day. This provided information about how changes may have impacted dormant work zones, which did not have active builds running.

Assigning tests to work zones

All web UI tests lived in a common repository, regardless of the specific functional area. This allowed tests to share common utilities. The teams decided to keep this approach and use tagging to indicate which functional area(s) a given test covered. For example, a test that verified a product listing would be tagged for the “Shopping” work zone. A test that added a product to a cart spanned multiple work zones and was therefore tagged as both “Shopping” and “Cart”. Tests that were tagged for multiple work zones would run as part of multiple builds.

To tag the tests, the teams used the annotation feature of their test runner, such as TestNG or JUnit:

// Tagged for two work zones, so this test runs in both the Shopping and Cart builds.
@Test(groups={WorkArea.SHOPPING, WorkArea.CART})
public void testAddProductToCart()
{
    ...
}

Test runners also typically allow a means to configure which tests run. The team decided not to create these configuration files within the code repository because it did not allow for quick changes, as they’d need to check the change in, have it reviewed, etc. So, instead the configuration was done at the CI job level.

mvn test -Dgroups=cart

With this, if someone was checking in a feature that touched multiple work zones, they could quickly configure the build to pull in tests from all relevant zones. Also, it allowed for teams to change their build’s needs as their sprints changed. For example, the Shopping area may be an active work zone one sprint but a dormant work zone in the next. So, while the builds were focused on a specific work zone, they really were more aligned with the sprint team and their current needs at any given time.
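The selection logic behind those work zone builds can be sketched in a few lines of plain Java. This is only a model of what the test runner's group filter does for you, not the team's actual setup: the class name, the lowercase tag strings and every test name other than testAddProductToCart are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical model of tag-based selection; in practice the test runner
// (for example TestNG groups) does this filtering for you.
class WorkZoneSelector {

    // Returns the tests that carry at least one of the requested tags,
    // in the order they were registered.
    static List<String> select(Map<String, Set<String>> tagsByTest, Set<String> requested) {
        return tagsByTest.entrySet().stream()
                .filter(e -> e.getValue().stream().anyMatch(requested::contains))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // A tiny, made-up suite: one shared smoke test plus tests for three work zones.
    static Map<String, Set<String>> sampleTests() {
        Map<String, Set<String>> tests = new LinkedHashMap<>();
        tests.put("testSiteLoads", Set.of("smoke"));
        tests.put("testProductListing", Set.of("shopping"));
        tests.put("testAddProductToCart", Set.of("shopping", "cart"));
        tests.put("testRegisterNewUser", Set.of("registration"));
        return tests;
    }

    public static void main(String[] args) {
        // The Cart build: shared smoke tests plus everything tagged "cart".
        System.out.println(select(sampleTests(), Set.of("smoke", "cart")));
        // A check-in touching two zones pulls in tests from both.
        System.out.println(select(sampleTests(), Set.of("smoke", "cart", "registration")));
    }
}
```

Because testAddProductToCart carries two tags, it is selected by both the Shopping and the Cart builds, which matches the multi-zone behavior described above.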

Limitations

While this approach eliminated the complaints of the build being too slow or containing unrelated test failures, there were still limitations.

Bugs can be missed

By reducing the scope of the build, the team was not testing everything. This means that unintentional bugs in other work zones could creep in with a check-in. To mitigate this risk, the teams kept the main build, which ran all tests several times a day. Initially they set this to run only once a day but found that wasn’t often enough, so they increased it to run every 6 hours. If this build failed, the cause would be a check-in made within the last 6 hours, which helped narrow down the problem area.
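The narrowing effect of the 6-hour schedule can be illustrated with a short Java sketch. It assumes the previous full run passed, so only check-ins made since then are suspects. The class name, check-in ids and timestamps are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: lists the check-ins that could have broken a failed full build,
// assuming the previous scheduled run (one interval earlier) passed.
class SuspectWindow {

    static final class CheckIn {
        final String id;
        final Instant at;

        CheckIn(String id, Instant at) {
            this.id = id;
            this.at = at;
        }
    }

    // Keeps check-ins made after the previous run and no later than the failed run.
    static List<String> suspects(List<CheckIn> checkIns, Instant failedRunStart, Duration interval) {
        Instant windowStart = failedRunStart.minus(interval);
        return checkIns.stream()
                .filter(c -> c.at.isAfter(windowStart) && !c.at.isAfter(failedRunStart))
                .map(c -> c.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant failedRun = Instant.parse("2021-04-23T12:00:00Z");
        List<CheckIn> checkIns = List.of(
                new CheckIn("c101", Instant.parse("2021-04-23T05:30:00Z")),  // before the window
                new CheckIn("c102", Instant.parse("2021-04-23T07:15:00Z")),
                new CheckIn("c103", Instant.parse("2021-04-23T11:40:00Z")));

        // With a 6-hour interval, only c102 and c103 need to be investigated.
        System.out.println(suspects(checkIns, failedRun, Duration.ofHours(6)));
    }
}
```

Running the full build more often shrinks the window, and with it the list of check-ins someone has to investigate.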

Also, this system relied heavily on the tests being tagged properly. If someone forgot to tag a test or mis-tagged it, that test would not run as part of the appropriate work zone build. Usually these cases were caught by the main build, which gave the team an opportunity to fix the tagging.

Tests must be reliable

The web UI tests were not originally part of continuous integration. Instead they were run periodically throughout the year (during the dedicated regression testing time) on someone’s local machine. That person would then investigate the failures and could easily dismiss flaky tests that failed with unjust cause, unbeknownst to the rest of the team.

However, this sort of immaturity is unacceptable when a test needs to run as part of continuous integration. It has to be reliable. So before this new CI process could work flawlessly, the team had to invest time into enhancing the quality of their tests so that they only failed when they were supposed to.

Not every test failure is a show-stopper

The teams went through the very important process of identifying the most valuable tests to automate, which would make you think that if any of them failed, the integration should be canceled. This sounds right in theory, but was different in practice.

Sometimes tests would fail, the team would investigate, then determine they still wanted to integrate the feature. So, they opened a bug, disabled the test, and integrated the feature.

Is this wrong? Why have the test if you’re going to still integrate in the event of a failure?

The team decided that the information was still valuable to them. Knowing this gave them information about the risks they were taking, and they could discuss as a team if they were willing to take the risk of introducing this new feature knowing that it breaks an existing feature. In some cases, it was worth it, and they opened bugs to eventually address those failures.

That’s the role of tests: to provide the team with fast feedback so that they can make informed decisions.

Happily Ever After

Preparing for continuous integration certainly took a fair amount of investment. The team learned a valuable lesson: you don’t just jump into CI. A proper testing strategy is needed to ensure you’re able to realize the benefits of CI, namely fast and reliable feedback. After experiencing bumps and bruises along the way, the team finally figured out a system that worked for them.

The post Test Management within Continuous Integration Pipelines appeared first on Automated Visual Testing | Applitools.

Production Deploy with Every Check-In? You Gotta Go TWO Low!

Paul Grizzaffi — Mon, 11 Nov 2019 20:11:37 +0000

Much is made of the dream flow from continuous integration to continuous testing to continuous delivery to production. Conceptually this sounds great! A developer checks in their code, the code is built, unit tests are executed, the code is deployed to a testing environment, and then the automated test scripts are executed. If nothing fails in the preceding steps, the code is then automatically deployed to the production environment. The users/clients/customers potentially get multiple releases per day, but also potentially get multiple features or fixes per day. We get all of this with no human delay or interaction once the code is checked in. Also, we get all of this with no human oversight and little human insight.

What could possibly go wrong? What, indeed!

Capturing Visual Issues

Let’s assume we have a production system with which humans will interact. One thing that could go wrong is the user interface could be all “messed up”. Perhaps all the text and the background are now royal blue. Blue on blue is difficult to read, to say the least; our users would not be able to use the site. It could be worse, though: what if our #1 competitor’s well-known logo and product color are royal blue? This would be awkward at best. At worst? Perhaps we are infringing on a competitor’s trademark and are now subject to legal action.

Oh, wait, automated visual testing would have caught that. Correct! Automated visual testing would have caught that plus a slew of other visual and formatting issues. But, like most testing and automation techniques, visual testing can’t catch everything a human would because automated visual testing isn’t, well, human…

Capturing Timing Issues

What if a transaction takes longer than it should or than it did last time? Oh, well, we can check for that in our scripts. Cool, cool. Do we have test scripts that pause mid-flow for 30 minutes because one of our kids threw a handful of Legos at the other and it was the end of the world as we know it? Uh…oh…no…, we can add a script for that. Can we add scripts to pause at every possible place in the flow? Probably not; we probably can’t even think of every possible place in the flow. Spoiler alert: our users do these things, so do their kids, pets, and roommates.

Can we think of everything? Likely not. We’re human.
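The part we can script, checking whether a transaction took longer than it did last time, might look something like the Java sketch below. The class and method names, the baseline and the 20% tolerance are all assumptions chosen for illustration. A real check would also need a trustworthy way to record the baseline.

```java
import java.time.Duration;

// Hypothetical timing check: flags a transaction that is slower than a recorded baseline
// by more than an allowed fraction.
class TimingCheck {

    // Returns true when the measured duration exceeds baseline * (1 + tolerance).
    static boolean isSlower(Duration baseline, Duration measured, double tolerance) {
        double limitMillis = baseline.toMillis() * (1.0 + tolerance);
        return measured.toMillis() > limitMillis;
    }

    public static void main(String[] args) {
        Duration baseline = Duration.ofMillis(800); // made-up duration from the last run
        double tolerance = 0.20;                    // allow up to 20% slower

        System.out.println(isSlower(baseline, Duration.ofMillis(850), tolerance));  // within tolerance
        System.out.println(isSlower(baseline, Duration.ofMillis(1200), tolerance)); // flagged as slower
    }
}
```

A check like this covers the slowdowns we thought to measure. It does nothing for the 30-minute Lego pause, which is exactly the point being made here.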

Cost of Change and Cost of Failure

Now, there’s nothing inherently wrong with the process I outlined above. As always, it’s about being responsible. If developer-check-in-to-prod works for your organization, then you should do it…provided you understand the risk. This risk manifests in two big buckets: cost of change and cost of failure.

The cost of change is something that’s discussed a lot. What does it take to debug an issue, fix it, then run the fix through the pipeline to get it to production, and therefore, to the users? That’s a very basic cost of change calculation: the cost of the effort required to determine a fix and get that fix out of the door to the user. Oh wait, while we’re addressing this issue, we’re not addressing other issues and we’re not working on new capabilities. These are costs as well; they’re referred to as opportunity costs, i.e. the cost of not addressing Thing B because we are working on Thing A. The math on the cost of change starts to become more involved because we’re now accounting for the impact of addressing an issue.

Let’s take the story further; consider a retail company. Most retail companies that have a brick-and-mortar storefront also have an eCommerce capability via a website and a mobile app. The cost of change on the website is generally low. When a customer encounters an issue, the retailer may lose that customer or perhaps it will give a discount to that customer to make up for the poor experience, but the retailer can generally minimize the number of lost customers by quickly deploying a fix precisely because the cost of change is low.

Contrast that cost with the cost of change for this retailer’s point of sale (POS) system. Changes to a POS system have downstream impacts that aren’t immediately obvious. Consider a retailer that has a large number of stores but that has a very small per-store number of employees, say, ten employees per store. Each employee must be trained on new or changed POS features; the number of training and ramp-up hours per change may not be insignificant.

Now suppose we found an issue with the POS release and the resolution for this issue requires a change in the workflow. Though the training cost will likely not be as large as the original cost, there is still a non-zero training cost to deploying the new software for use. Of course, when the number of employees that use the POS system is larger, the training and retraining costs are larger as well. As we can see, the actual cost of change can be higher than it appears on the surface.
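To see how quickly training pushes up the cost of change, here is a rough back-of-the-envelope calculation in Java. Only the ten employees per store comes from the example above. The number of stores, training hours, hourly rate, fix effort and opportunity cost are made-up figures, and the formula is one simple way to add these costs up rather than an established model.

```java
// Hypothetical cost-of-change arithmetic for the POS example.
class CostOfChange {

    // Every employee who uses the system spends some hours learning the change.
    static double trainingCost(int stores, int employeesPerStore, double hoursPerEmployee, double hourlyRate) {
        return stores * employeesPerStore * hoursPerEmployee * hourlyRate;
    }

    // Total cost: effort to fix and ship, plus the work not done meanwhile, plus training.
    static double totalCost(double fixEffortCost, double opportunityCost, double trainingCost) {
        return fixEffortCost + opportunityCost + trainingCost;
    }

    public static void main(String[] args) {
        int stores = 500;              // assumed
        int employeesPerStore = 10;    // from the example in the text
        double hoursPerEmployee = 2.0; // assumed
        double hourlyRate = 15.0;      // assumed

        double training = trainingCost(stores, employeesPerStore, hoursPerEmployee, hourlyRate);
        double total = totalCost(8_000.0, 12_000.0, training); // assumed fix and opportunity costs

        System.out.println("Training cost: " + training);
        System.out.println("Total cost of change: " + total);
    }
}
```

With these assumed numbers, training dwarfs the engineering effort, which is why the cost of change for the POS system is so much higher than it first appears.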

Imagine further that this POS software has an outage…on Black Friday. There will possibly be tens or even hundreds of people in a store, all of whom are inconvenienced by the outage, and many of whom will be expressing their frustration on social media. Here, we see that there is a cost involved that’s in addition to the cost of change; issues, problems, and failures have a cost as well.

Framing Costs of Failure Issues

The cost of these issues, problems, and failures is something that seems to be discussed less frequently. What is the cost of failure? It’s the cost to the business of an issue, i.e. failure, in the system. In some cases, it’s probably rather small. If we’re producing a 99-cent mobile game and a user encounters an issue, we may lose that user but, in the grand scheme of things, we may not really care because we didn’t lose appreciable money. It’s ugly to say, but when discussing business, it’s generally accurate. We may lose that customer and perhaps some of their friends, but we’re probably not hinging our house payment on that one game sale.

Contrast the 99-cent game example with a package delivery company. This company will likely have a website for its customers to arrange pickups and deliveries. The cost of change is probably relatively low if there’s an issue on the website; the cost of failure, however, could be catastrophic. What if legal documents for a corporate sale were delayed, or worse, sent to a competitor by mistake? What if an undiscovered error in our system causes the company to fail to deliver life-critical medical supplies, or, say, a donor organ? Catastrophic, to be sure. The cost of this failure is unacceptably high.

Even in the cases where there is no loss of life, there can be tangible financial costs to failure as well. Companies often have contracts with their clients that contain service level agreements (SLAs). Violation of the SLAs can require penalty payments or a return of previously paid fees when our systems are “insufficiently performant”.

Conclusion – Assess Your Costs and Risks

As with most concepts in technology, there’s nothing inherently wrong with “check-in and deploy to production”. Companies are doing this and seem to be content-to-happy with the business results. That said, in all circumstances, we must be responsible with our automated approaches. We must understand the risks we are undertaking, and the risk tolerance we have. Without these self-assessments, we won’t be prepared for the consequences of the risks we unknowingly undertake.

The post Production Deploy with Every Check-In? You Gotta Go TWO Low! appeared first on Automated Visual Testing | Applitools.