MLCube and Podman

MLCube is a new open source, container-based infrastructure specification introduced to enable reproducibility in Python-based machine learning workflows. It can use container tools such as Podman, Singularity and Docker, and it also supports execution on remote platforms. Diane Feddema from Red Hat is one of the chairs of the MLCommons Best Practices working group that is developing MLCube. This introductory article explains how to run the hello world MLCube example using Podman on Fedora Linux.

Yazan Monshed has written a very helpful introduction to Podman on Fedora which gives more details on some of the steps used here.

First install the necessary dependencies.

sudo dnf -y update
sudo dnf -y install podman git virtualenv \
                    policycoreutils-python-utils

Then, following the documentation, set up a virtual environment and get the example code. Because the project is being actively improved, check out a specific commit to ensure reproducibility.

virtualenv -p python3 ./env_mlcube 
source ./env_mlcube/bin/activate
git clone https://github.com/mlcommons/mlcube_examples.git 
cd ./mlcube_examples/hello_world
git checkout 5fe69bd
pip install mlcube mlcube-docker
mlcube describe

Now change the runner command from docker to podman by editing the file $HOME/mlcube.yaml so that the line

docker: docker

becomes

docker: podman
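
If you prefer to script this edit, the following is a minimal sketch rather than an official MLCube mechanism. The function name set_container_runner is hypothetical, and the sketch assumes the settings file contains a line of the form "docker: docker" exactly as shown above. It keeps a .bak copy of the original file.

```shell
# Hypothetical helper (not part of MLCube): point the runner entry at another executable.
# Assumes the settings file contains a line of the form "docker: docker".
set_container_runner() {
    settings_file="$1"
    runner="$2"
    # Save the original as <file>.bak, then rewrite only the matching line,
    # preserving its indentation.
    sed -i.bak -E "s/^([[:space:]]*)docker:[[:space:]]*docker[[:space:]]*\$/\1docker: ${runner}/" "$settings_file"
}

# Example: set_container_runner "$HOME/mlcube.yaml" podman
```

Lines that do not match, such as a bare "docker:" section header, are left untouched.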

If you are on a computer with x86_64 architecture, you can get the container using

mlcube configure --mlcube=. --platform=docker

You will be prompted to choose among several registries:

? Please select an image: 
  ▸ registry.fedoraproject.org/mlcommons/hello_world:0.0.1
    registry.access.redhat.com/mlcommons/hello_world:0.0.1
    docker.io/mlcommons/hello_world:0.0.1
    quay.io/mlcommons/hello_world:0.0.1

Choose docker.io/mlcommons/hello_world:0.0.1 to obtain the container.

If you are not on a computer with x86_64 architecture, you will need to build the container. Change the file $HOME/mlcube.yaml so that the line

build_strategy: pull

becomes

build_strategy: auto

and then build the container using

mlcube configure --mlcube=. --platform=docker
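
The choice between pulling and building depends only on the CPU architecture, so it can be sketched as a small shell function. This is an illustration of the rule described above, not something MLCube provides. The name pick_build_strategy is hypothetical, and the function only prints the value you would put in the build_strategy line.

```shell
# Hypothetical helper: choose a build_strategy value from the CPU architecture.
# Follows the rule in this article: x86_64 pulls the published image,
# any other architecture builds the container locally.
pick_build_strategy() {
    # Use the first argument if given, otherwise ask the kernel.
    arch="${1:-$(uname -m)}"
    case "$arch" in
        x86_64) echo "pull" ;;
        *)      echo "auto" ;;
    esac
}

# Example: echo "build_strategy: $(pick_build_strategy)"
```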

Before running the example, you may need to set appropriate SELinux labels on the directories it uses. You can check whether SELinux is enabled by typing

sudo sestatus

which should give you output similar to

SELinux status:                 enabled
...
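
In a script, the same check can be made by matching the first line of that output. The sketch below assumes the "SELinux status:" line has the form shown above. The function name selinux_enabled is hypothetical.

```shell
# Hypothetical helper: succeed if sestatus-style text on stdin
# reports SELinux as enabled.
selinux_enabled() {
    grep -Eq '^SELinux status:[[:space:]]+enabled'
}

# Example: sestatus | selinux_enabled && echo "SELinux is enabled"
```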

Josphat Mutai, Christopher Smart and Daniel Walsh explain that you need to be careful in setting appropriate SELinux policies for files used by containers. Here, you will allow the container to read and write to the workspace directory.

sudo semanage fcontext -a -t container_file_t "$PWD/workspace(/.*)?"
sudo restorecon -Rv "$PWD/workspace"

Now verify the labels by checking that

ls -Z

gives output similar to

unconfined_u:object_r:user_home_t:s0 Dockerfile
unconfined_u:object_r:user_home_t:s0 README.md
unconfined_u:object_r:user_home_t:s0 mlcube.yaml
unconfined_u:object_r:user_home_t:s0 requirements.txt
unconfined_u:object_r:container_file_t:s0 workspace
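
The third colon-separated field of each label is the SELinux type, which is the part that matters here. As a rough sketch, the hypothetical function below reduces ls -Z output to name and type pairs. It assumes the user:role:type:level layout shown above and file names without spaces.

```shell
# Hypothetical helper: print "<name> <type>" for each line of `ls -Z` output on stdin.
# Assumes contexts look like user:role:type:level and names contain no spaces.
selinux_types() {
    awk '{ split($1, ctx, ":"); print $2, ctx[3] }'
}

# Example: ls -Z | selinux_types
```

For the listing above, the workspace entry should be reported with type container_file_t and the other files with user_home_t.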

Now run the example

mlcube run --mlcube=. --task=hello --platform=docker
mlcube run --mlcube=. --task=bye --platform=docker

Finally, check that the output

cat workspace/chats/chat_with_alice.txt

has text similar to

Hi, Alice! Nice to meet you.
Bye, Alice! It was great talking to you.
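
To make this check scriptable, a minimal sketch follows. The function name check_chat is hypothetical, and it assumes the greeting and farewell lines start with "Hi, <name>!" and "Bye, <name>!" as in the sample output above.

```shell
# Hypothetical helper: succeed only if the chat log holds both a greeting
# and a farewell for the given name.
check_chat() {
    chat_file="$1"
    name="$2"
    grep -q "^Hi, ${name}!" "$chat_file" && grep -q "^Bye, ${name}!" "$chat_file"
}

# Example: check_chat workspace/chats/chat_with_alice.txt Alice && echo "both tasks ran"
```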

You can create your own MLCube as described here. Contributions to the MLCube examples repository are welcome. Udica is a new project that promises more fine-grained SELinux policy controls for containers that are easy for system administrators to apply. Both projects are under active development. Testing them and providing feedback would help make secure data management on systems with SELinux easier and more effective.

7 Comments

Angel Yocupicio

Thank you very much for creating this tool, the MLCube open source container. It is very interesting. I think I will include it in a tutorial on tuning up Fedora 36, coming soon.

April 18, 2022
- Benson Muite
  
  Thanks for your feedback. Further tutorials would be great. I am not a developer of MLCube, just a new user.
  
  April 19, 2022
Michael Rivard

In order for rootless podman containers to access the host’s NVIDIA GPU, I had to run (once, on the host):

sudo chcon -t container_file_t /dev/nvidia*

(Search here for "Unknown Error".)

I have never seen this mentioned anywhere else, not even in NVIDIA’s own documentation or user forums.

April 18, 2022
- Benson Muite
  
  Thanks for this. Finding good desktop defaults is helpful, but SELinux also offers many opportunities for customization which are useful when working with sensitive and/or valuable data for which access needs to be controlled.
  
  April 19, 2022
Jonatas Esteves

On the part about setting SELinux labels for the workspace directory, the commands semanage and restorecon should not need sudo.

Also, I think this should be reported as an issue to MLCube. It could have done this automatically for you (or as an option) by just passing a :z or :Z flag when mounting the directory.

April 19, 2022
- Benson Muite
  
  Thanks for your feedback. Passing :Z is one of the suggested options that could be implemented (https://github.com/mlcommons/mlcube/issues/205). Suggestions on how this would work best in your workflows would be greatly appreciated.
  
  April 19, 2022
sid

Thanks for the info

April 21, 2022

Benson Muite
