Podman is a tool that runs, manages, and deploys containers under the OCI standard. It also supports rootless containers and pods (a pod is a group of containers). Note, however, that checkpointing only works for containers run by root users because it requires root privileges. This article explains how to use checkpointing in Podman to save the state of a running container for later use.
Checkpointing Containers
Checkpoint/Restore In Userspace, or CRIU, is Linux software available in the Fedora Linux repository as the “criu” package. It can freeze a running container (or an individual application) and checkpoint its state to disk (reference: https://criu.org/Main_Page). The saved data can be used to restore the container and run it exactly as it was at the time of the freeze. This makes live migration, snapshots, and remote debugging of applications or containers possible. Checkpointing with Podman requires CRIU 3.11 or later installed on the system.
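The two requirements mentioned so far (root privileges and CRIU 3.11 or later) can be checked before attempting a checkpoint. The following is a minimal sketch, not an official tool: the function names are made up for illustration, it assumes that `criu --version` prints a line of the form "Version: 3.17.1", and it relies on `sort -V` from GNU coreutils.

```shell
#!/bin/sh
# Sketch: preflight checks before checkpointing.
# MIN_CRIU is taken from the article; all function names are illustrative.

MIN_CRIU="3.11"

# version_ge A B: succeed if version A is greater than or equal to version B.
# Uses `sort -V` (version sort); the smaller of the two must be B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# criu_version: print the installed CRIU version number, or nothing.
# Assumption: `criu --version` prints a line such as "Version: 3.17.1".
criu_version() {
    criu --version 2>/dev/null | awk '/^Version:/ { print $2; exit }'
}

# preflight: report whether this host looks ready for checkpointing.
preflight() {
    if [ "$(id -u)" -ne 0 ]; then
        echo "checkpointing requires root privileges" >&2
        return 1
    fi
    v="$(criu_version)"
    if [ -z "$v" ] || ! version_ge "$v" "$MIN_CRIU"; then
        echo "CRIU $MIN_CRIU or later is required (found: ${v:-none})" >&2
        return 1
    fi
    echo "ok: CRIU $v"
}
```

After sourcing the file, running `preflight` as root prints the detected CRIU version or explains which requirement is not met.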
# podman container checkpoint <containername>
This command creates a checkpoint of the container and freezes its state. Checkpointing also stops the running container: if you run podman ps afterwards, <containername> no longer appears in the list.
You can export the checkpoint to a file at a specific location and copy that file to a different server:
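To confirm from a script that the container is no longer running after the checkpoint, the list of running container names can be searched for an exact match. This is a small sketch; `container_is_running` is a hypothetical helper name, and it uses only `podman ps` with a Go-template `--format` string.

```shell
# container_is_running NAME: succeed if NAME is listed by `podman ps`.
# grep -x matches the whole line and -F treats NAME as a literal string,
# so "demo1" does not match a container called "demo10".
container_is_running() {
    podman ps --format '{{.Names}}' | grep -qxF -- "$1"
}
```

For example, `container_is_running demo1 || echo "demo1 is not running"` prints the message once the container has been checkpointed.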
# podman container checkpoint <containername> -e /tmp/mycheckpoint.tar.gz
# podman container restore --keep <containername>
The --keep option restores the container while keeping all the temporary files.
To import the container checkpoint you can use:
# podman container restore -i /tmp/mycheckpoint.tar.gz
Live Migration using Podman Checkpoint
This section describes how to migrate a container from client1 to client2 using the podman checkpoint feature. This example uses the https://lab.redhat.com/tracks/rhel-system-roles playground provided by Red Hat because it offers multiple hosts with SSH keys already configured.
The example will run a container with some process on client1, create a checkpoint, and migrate it to client2. First run a container on the client1 machine with the commands below:
# podman run --name=demo1 -d docker.io/httpd
# podman exec -it demo1 bash
# sleep 600 &     (run a process for verification)
# exit
The above snippet runs a container named demo1 with the httpd process, then starts a sleep process inside it that runs for 600 seconds (10 minutes) in the background. You can verify this with:
# podman top demo1
USER       PID   PPID   %CPU    ELAPSED           TTY   TIME   COMMAND
root       1     0      0.000   5m40.61208846s    ?     0s     httpd -DFOREGROUND
www-data   3     1      0.000   5m40.613179941s   ?     0s     httpd -DFOREGROUND
www-data   4     1      0.000   5m40.613258012s   ?     0s     httpd -DFOREGROUND
www-data   5     1      0.000   5m40.613312515s   ?     0s     httpd -DFOREGROUND
root       88    1      0.000   16.613370018s     ?     0s     sleep 600
Now create a container checkpoint and export it to a specific file:
# podman container checkpoint demo1 -e /tmp/mycheckpoint.tar.gz
# scp /tmp/mycheckpoint.tar.gz client2:/tmp/
Then on client2:
# cd /tmp
# podman container restore -i mycheckpoint.tar.gz
# podman top demo1
You should see the output as follows:
USER       PID   PPID   %CPU    ELAPSED           TTY   TIME   COMMAND
root       1     0      0.000   5m40.61208846s    ?     0s     httpd -DFOREGROUND
www-data   3     1      0.000   5m40.613179941s   ?     0s     httpd -DFOREGROUND
www-data   4     1      0.000   5m40.613258012s   ?     0s     httpd -DFOREGROUND
www-data   5     1      0.000   5m40.613312515s   ?     0s     httpd -DFOREGROUND
root       88    1      0.000   16.613370018s     ?     0s     sleep 600
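One way to check that the restored container carries the same processes is to compare only the PID and COMMAND columns of the two podman top listings. The sketch below assumes the default eight-column layout shown above (USER, PID, PPID, %CPU, ELAPSED, TTY, TIME, COMMAND); `top_signature` is a hypothetical helper name.

```shell
# top_signature: read default `podman top` output on stdin and print
# "PID COMMAND" for each process, skipping the header line.
# Field 2 is the PID; fields 8 and later form the command with its arguments.
top_signature() {
    awk 'NR > 1 {
        out = $2
        for (i = 8; i <= NF; i++) out = out " " $i
        print out
    }'
}
```

Running `podman top demo1 | top_signature > /tmp/before.txt` on client1 before the checkpoint and the same command with `/tmp/after.txt` on client2 after the restore gives two files that can be compared with `diff`. The ELAPSED and TIME columns are left out on purpose because they change over time.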
In this way you can achieve a live migration using the podman checkpoint feature.
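The three steps of this section (checkpoint and export, copy, restore) can be wrapped in one function. This is a sketch under stated assumptions, not a Podman feature: `migrate_container`, `run`, and the `DRY_RUN` variable are illustrative names, and it assumes root SSH access from the source host to the destination host, as in the playground used here.

```shell
#!/bin/sh
# Sketch: the migration steps from this section as a single function.
# migrate_container, run, and DRY_RUN are illustrative, not part of Podman.

# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# migrate_container NAME DEST_HOST [ARCHIVE]
migrate_container() {
    name="$1"
    dest="$2"
    archive="${3:-/tmp/mycheckpoint.tar.gz}"

    # 1. Checkpoint the container and export its state to a file.
    run podman container checkpoint "$name" -e "$archive" || return 1
    # 2. Copy the exported file to the destination host.
    run scp "$archive" "$dest:$archive" || return 1
    # 3. Restore the container on the destination host from that file.
    run ssh "$dest" podman container restore -i "$archive" || return 1
}
```

Setting `DRY_RUN=1` before calling `migrate_container demo1 client2` prints the three commands without executing them, which is a cheap way to review them first.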
I think your definition of Live Migration needs to be revised. There’s no transfer of state or any other indication that the container is handed off live to the destination host.
This is cold migration and not live migration.
podman-commit followed by any safe copy method and podman run can be used as well.
https://podman.io/getting-started/checkpoint – as per the official docs this seems ok
Well, the process state is preserved, including the operating system timer that is supposed to wake up the sleeping thread.
The wording “live migration” stems from podman’s own documentation – so it’s not Ashutosh’s definition that needs to be revised. It’s podman’s docs, which happen to be a bit sloppy here.
I agree – this is not a live migration. This is a freeze, store, restore, unfreeze – it’s a necessary part of a migration, but calling it a live migration without demonstrating the transfer of IP address ownership, sockets and thus TCP connections preservation, file system persistence and shared memory is maaaaybe a bit proud of the podman documentation. There’s significant parts of the puzzle missing, like setting up a shared storage, the software-defined IP networking and so on. It’s a bit like “here, we draw a circle (show that sleep works). Now, just draw the rest of the hooting owl!”.
600 seconds is 10 minutes, not 6, which would be 360 seconds.
Yep. Good catch
Great Article! Had no idea this was even possible. I would expect the ‘ELAPSED’ values would be different when running top on the 2nd client. Thanks 🙂
Thanks for the article. It brings me back to the days of OpenVZ when checkpoint, live migration and restore were new. That was 2006. Good times.
Thanks for your kind appreciation !!
It should be mentioned that this only works for containers run by root users as checkpointing requires root privileges.
So, rootless containers can not be checkpointed.
Yes Alexander, agreed, +1. Not sure if I can edit this article now. Thanks
@Ashutosh: You won’t be able to edit it directly now that the article has gone live, but the Fedora Magazine editors can. If you want to provide some text and instructions for where to place it, I can get it updated.
@Gregory: Could you add a small note as suggested by Alexander: checkpointing only works for containers run by root users because it requires root privileges.
I’ve added the following sentence to the first paragraph right after rootless containers are mentioned.