Fedora’s openQA Cloud Deployment

Photo by Bofu Shaw on Unsplash (cropped, recolored)

Fedora benefits from a long-standing, comprehensive, and robust openQA deployment used in release validation testing and testing updates. A recent project extended Fedora’s openQA infrastructure to use containers and cloud resources. Deploying openQA in the cloud protects against physical hardware failures and simplifies the addition of resources to scale up future testing. Another benefit of the cloud deployment is that it makes it easier for anyone to run, test, and experiment with Fedora’s openQA. Are you already familiar with openQA? Try out the cloud deployment examples at: https://pagure.io/ansible-openqa-cloud.

openQA runs comprehensive end-to-end tests on operating system images. It loads an image into a virtual machine, then sends commands, clicks buttons, moves the cursor, types input, and visually compares the results with expected images. Anyone who has installed Fedora Linux can appreciate that the process takes a little while and needs a bit of manual attention. Now multiply this by the many images generated by release engineers each day, by the different flavours of each image, and by the different architectures, then add new updates arriving roughly every hour, and you can understand openQA’s tagline: “Life is too short for manual testing!” Here is just one example of an openQA worker testing Fedora Linux 41:

Fedora’s openQA versus upstream openQA

A good point of entry for developers wishing to extend, debug or otherwise experiment with Fedora’s openQA is to understand how Fedora’s deployment relates to the upstream project. There are two main upstream repositories:

  1. openQA, which handles the web user interface, websockets, livehandler, scheduler, a PostgreSQL database and various background tasks like asset cleanup; and
  2. os-autoinst, the backend responsible for running the tests and reporting their results.

Fedora relies on these upstream repositories and includes them in the openqa and os-autoinst packages respectively. When you install these packages, you can find the upstream libraries at: /usr/share/openqa/lib/OpenQA and the upstream scripts at: /usr/share/openqa/script. The new cloud deployment provides access to these libraries and scripts inside the openqa-webserver container.

Similarly, find the backend code of os-autoinst inside any instance of the openqa-worker container at: /usr/lib/os-autoinst. Here you can see the inner workings of how os-autoinst starts the QEMU virtual machines for running tests.

The new openqa-database container provides access to the PostgreSQL database. Use psql to easily inspect and modify your local version of the database; for example:

# psql -U postgres -d openqa
psql (15.1)
Type "help" for help.

openqa=# \dt
                         List of relations
 Schema |                 Name                  | Type  |   Owner
--------+---------------------------------------+-------+-----------
 public | api_keys                              | table | geekotest
 public | assets                                | table | geekotest
 public | audit_events                          | table | geekotest
...

Because the cloud deployment lets you bring unique worker containers up and down frequently and essentially without limit, the database can accumulate far more short-lived workers than a non-containerized deployment would. Deleting those workers directly from the database is sometimes the quickest way to clear them out.
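As a minimal sketch of that cleanup, the helper below only prints a DELETE statement so you can review it before piping it to psql. It assumes the workers table has a t_seen timestamp column, which is not confirmed by this article; check with \d workers first. The function name stale_worker_sql is made up for illustration.

```shell
# Print SQL that removes workers not seen for more than HOURS hours.
# Assumption: the "workers" table has a "t_seen" timestamp column;
# confirm with "\d workers" in psql before running anything.
stale_worker_sql() (
    hours=$1
    # Accept digits only so nothing else can end up inside the SQL text.
    case $hours in
        ''|*[!0-9]*)
            echo "usage: stale_worker_sql HOURS" >&2
            exit 2
            ;;
    esac
    printf "DELETE FROM workers WHERE t_seen < now() - interval '%s hours';\n" "$hours"
)

# Review the statement first, then pipe it to psql when satisfied:
#   stale_worker_sql 24
#   stale_worker_sql 24 | psql -U postgres -d openqa
```

Printing rather than executing keeps the destructive step explicit. Foreign keys from other tables may still block the deletion of workers that are assigned to jobs, so treat this as a starting point on a local database only.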

Fedora-specific tests

What makes Fedora’s deployment unique, as with every distribution that uses openQA, is its repository of tests to run on operating system images. Fedora’s tests are located in the os-autoinst-distri-fedora repository. Although the content is unique, openQA requires that tests be located in a specific “tests” directory: /var/lib/openqa/share/tests/fedora.

Find these test repositories in both the new openqa-webserver and openqa-worker containers. In the cloud deployment a small service, openqa-test-update, runs every six hours to pull any changes to the tests. If you would rather test your own fork of os-autoinst-distri-fedora, stop the test update service and pull changes from a forked repository instead.
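Switching the tests checkout to a fork could look roughly like the sketch below. It prints the git commands by default and only runs them when DRY_RUN=0 is set. The remote name fork, the helper names, and the example URL are placeholders, and how you stop openqa-test-update depends on how your deployment manages that service.

```shell
# Print (default) or run the git commands that point the tests checkout at a fork.
# Set DRY_RUN=0 to execute the commands instead of printing them.
TESTS_DIR=${TESTS_DIR:-/var/lib/openqa/share/tests/fedora}

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

use_fork() {
    fork_url=$1   # URL of your fork (placeholder in the usage below)
    branch=$2     # branch in the fork that holds your changes
    run git -C "$TESTS_DIR" remote add fork "$fork_url"
    run git -C "$TESTS_DIR" fetch fork
    # -B creates the local branch, or resets it if it already exists.
    run git -C "$TESTS_DIR" checkout -B "$branch" "fork/$branch"
}

# Usage (hypothetical URL and branch name):
#   use_fork https://example.org/you/os-autoinst-distri-fedora.git my-branch
```

Stop the update service before running this for real; otherwise the next scheduled pull may move the checkout back to the upstream tests.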

Fetching and scheduling images to test

Another unique feature of Fedora’s openQA deployment is how it schedules operating system images for testing. At the most rudimentary level, it’s possible to manually add images to the directory where openQA and the job settings expect to find them: either /var/lib/openqa/share/factory/iso or /var/lib/openqa/share/factory/hdd. For example, inside the openqa-webserver container, try fetching an ISO image by running:

curl https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20240612.n.0/compose/Server/x86_64/iso/Fedora-Server-netinst-x86_64-Rawhide-20240612.n.0.iso --output /var/lib/openqa/share/factory/iso/Fedora-Server-netinst-x86_64-Rawhide-20240612.n.0.iso

If this particular ISO isn’t available, get a new link from a current test in production by clicking through the colored dot to its Settings tab.
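When you swap in a newer compose, the file name and the URL have to change together. The sketch below derives both from a compose ID. It assumes other Rawhide composes follow the same layout as the single URL shown above, which may not always hold; the helper names are made up for illustration.

```shell
# Derive the Server netinst ISO name and download URL from a Rawhide compose ID.
# Assumption: other composes use the same layout as the one URL in this article.
BASE_URL=https://kojipkgs.fedoraproject.org/compose/rawhide
ISO_DIR=/var/lib/openqa/share/factory/iso

iso_name() {
    # Fedora-Rawhide-20240612.n.0 -> Fedora-Server-netinst-x86_64-Rawhide-20240612.n.0.iso
    printf 'Fedora-Server-netinst-x86_64-%s.iso\n' "${1#Fedora-}"
}

iso_url() {
    printf '%s/%s/compose/Server/x86_64/iso/%s\n' "$BASE_URL" "$1" "$(iso_name "$1")"
}

# Usage:
#   compose=Fedora-Rawhide-20240612.n.0
#   curl --fail --location "$(iso_url "$compose")" \
#        --output "$ISO_DIR/$(iso_name "$compose")"
```

Passing --fail makes curl exit with an error on an HTTP 404 instead of saving the error page as an .iso file, which is useful because older composes are removed over time.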

A successful openQA test appears as a single row with two columns. The first column shows the test’s name, “base_package_install_remove”, and the second column shows a green dot.

Once you have downloaded an ISO and placed it in its correct directory, schedule a test with the general upstream tool openqa-cli:

openqa-cli api -X POST isos \
ISO=Fedora-Server-netinst-x86_64-Rawhide-20240612.n.0.iso \
DISTRI=fedora \
VERSION=Rawhide \
CURRREL=40 \
FLAVOR=Server-boot-iso \
ARCH=x86_64 \
BUILD=Fedora-Rawhide-20240612.n.0 \
--apikey 1234567890ABCDEF --apisecret 1234567890ABCDEF \
TEST=install_default
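Several of these settings repeat information that is already in the ISO file name, so a small helper can keep them consistent. The sketch below only prints the command; it assumes a Rawhide netinst ISO named like the one above, takes CURRREL as an argument because it cannot be derived from the name, and leaves out the API key options. The name schedule_cmd is hypothetical.

```shell
# Print an openqa-cli command for a Server netinst ISO, deriving VERSION and
# BUILD from the file name. Assumption: the name follows the Rawhide example.
schedule_cmd() (
    iso=$1
    currrel=$2
    test_name=${3:-install_default}
    stem=${iso%.iso}
    suffix=${stem#Fedora-Server-netinst-x86_64-}
    # If the prefix did not match, the name is not one this sketch understands.
    if [ "$suffix" = "$stem" ]; then
        echo "unexpected ISO name: $iso" >&2
        exit 1
    fi
    version=${suffix%%-*}
    printf 'openqa-cli api -X POST isos ISO=%s DISTRI=fedora VERSION=%s CURRREL=%s FLAVOR=Server-boot-iso ARCH=x86_64 BUILD=Fedora-%s TEST=%s\n' \
        "$iso" "$version" "$currrel" "$suffix" "$test_name"
)

# Usage:
#   schedule_cmd Fedora-Server-netinst-x86_64-Rawhide-20240612.n.0.iso 40
```

Add --apikey and --apisecret to the printed line, as in the command above, before running it.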

Luckily, we don’t have to do this hard work to test every image. Instead, Fedora uses its own customized command line tool, fedora_openqa, to figure out all the necessary settings before scheduling a test. Even better, fedora_openqa listens to messages from release engineering and automatically schedules tests in response. Since the messages are public, no particular access keys are required for testing.

In the cloud deployment, a special container, openqa-dispatcher, runs fedora_openqa and listens for these messages. Detailed logs of all the messages consumed are available inside this container in the /fedora-messaging-logs directory.

Testing Fedora Updates

A more complex scenario to handle is testing updates which need to be applied to existing operating system images. Fedora handles this challenge with its customized createhdds application that generates base disk images using the host’s kernel image. Save the disk images in /var/lib/openqa/share/factory/hdd/fixed to prevent openQA’s asset cleanup minions from deleting the images. openQA then applies incoming updates to these base disk images for testing.

One difficulty of the cloud deployment project was that some cloud instances, being virtual machines themselves, could not create further “nested” virtual machines using /dev/kvm. Without nested virtualization, it’s not possible to run the openqa-worker or createhdds efficiently. Make sure that /dev/kvm is available on any cloud instance you are using. For AWS this meant we used EC2 instances in the “metal” family (e.g. c5n.metal and c6in.metal) for workers and the createhdds application.
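A quick way to check an instance before starting workers is to test the device node directly. This is a minimal sketch; the function name kvm_status and its status words are made up for illustration, and it only checks that the node exists and is accessible, not that virtualization performs well.

```shell
# Report whether a KVM device node exists and is usable by the current user.
kvm_status() {
    dev=${1:-/dev/kvm}
    if [ ! -e "$dev" ]; then
        echo "missing"          # no nested virtualization exposed
        return 1
    elif [ ! -c "$dev" ]; then
        echo "not-a-device"     # path exists but is not a character device
        return 1
    elif [ ! -r "$dev" ] || [ ! -w "$dev" ]; then
        echo "no-access"        # present, but this user cannot open it
        return 1
    fi
    echo "ok"
}

# Usage:
#   kvm_status            # checks /dev/kvm
#   kvm_status /dev/kvm && echo "ready for openqa-worker"
```

Run the same check inside the worker container as well, since the device also has to be passed through from the host.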

The cloud deployment provides a service openqa-createhdds to build these disk images once per week, in a process that may take several hours and use around 60 GB of the host’s disk space. As a more casual approach, you can disable the openqa-createhdds service and many of the openQA tests will continue to pass.
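Since the build needs around 60 GB, it can help to check free space before enabling the service. The sketch below uses df for that; the helper names are hypothetical, and the path in the usage note is an assumption about where the images end up on your host.

```shell
# Print the free space, in whole GiB, of the filesystem that holds PATH.
free_gb() {
    # -P gives stable POSIX columns; -k reports 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeed only if the filesystem holding PATH has at least MIN_GB free.
has_room() {
    [ "$(free_gb "$1")" -ge "$2" ]
}

# Usage (the path is an assumption about your host layout):
#   has_room /var/lib/openqa/share/factory/hdd/fixed 60 \
#       || echo "less than 60 GB free; createhdds may fail"
```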

Conclusion

Before this project, we could only run openQA using farms of dedicated physical machines in a Fedora data center. Those machines have to be maintained and kept up to date, and increasing our capacity means acquiring new machines and finding the space to house them. This project has proved that we can potentially break the tight link to specific clusters of hardware and make our openQA deployments much more flexible — allowing us to scale up or down and maybe add new instances for different purposes much more easily than we could before. It also makes it much easier for individuals to deploy their own openQA instance for their own purposes.

In the future, we will consider migrating the official Fedora openQA instances to use this deployment method, starting with the staging instance. Eventually, we may retire the dedicated openQA hardware clusters entirely. Special thanks to Meta for sponsoring this work and to Collabora for implementing it. If you’re interested in helping out, you can drop by the Fedora Quality room on Fedora Chat and say hi.


The opinions expressed on this website are those of each author, not of the author's employer or of Red Hat. Fedora Magazine aspires to publish all content under a Creative Commons license but may not be able to do so in all cases. You are responsible for ensuring that you have the necessary permission to reuse any work on this site. The Fedora logo is a trademark of Red Hat, Inc. Terms and Conditions