
Running podman-compose with Nvidia GPU on Gentoo (systemd)
Running podman-compose with NVIDIA GPU support on Gentoo can be a powerful way to manage containerized applications while leveraging GPU acceleration for tasks like machine learning, video transcoding or graphics rendering. However, setting this up requires careful configuration of both the GPU driver and podman-compose services to ensure seamless integration. This guide will walk you through the process of enabling GPU support in podman-compose on Gentoo, step by step, so you can fully utilize your system's capabilities.
In no event, unless required by applicable law or agreed to in writing will I be liable to you for damages, including any general, special, incidental, or consequential damages arising out of the use or inability to use the information, commands, scripts and snippets provided here (including but not limited to loss of data or data being rendered inaccurate, or losses sustained by you or third parties, or a failure of the command/script/snippets to operate with any other programs), even if such holder or other party has been advised of the possibility of such damages.
Writing articles like this one requires time and resources. If you found it helpful or even if you didn't, I'd love to hear from you—whether you have feedback, suggestions, or spotted any bugs or typos. Your input would mean the world to me! You can reach out using the email address listed in the imprint.
Installation
make.conf
Configure your system for use with an NVIDIA GPU in /etc/portage/make.conf:
VIDEO_CARDS="nvidia"

Accept keywords
ACCEPT_KEYWORDS in Gentoo specifies whether the package manager is allowed to accept testing versions or if only stable versions should be used.
Configure the following keywords as needed in /etc/portage/package.accept_keywords. This depends on the current state of the repository as well as whether you want to stay stable or run bleeding edge.
app-containers/nvidia-container-toolkit
app-containers/podman-compose

USE flags
USE flags in Gentoo are keywords that define support, features and dependencies. They allow you to configure how packages are built and installed by Portage.
At the time of writing, these are the recommended USE flags for the most recent version of x11-drivers/nvidia-drivers. Edit your /etc/portage/package.use file accordingly:
*/* nvidia
x11-drivers/nvidia-drivers kernel-open

Emerging the packages
Compile and install the following packages:
sudo emerge -a \
x11-drivers/nvidia-drivers \
app-containers/nvidia-container-toolkit \
app-containers/podman \
app-containers/podman-compose

nvtop can be a good tool to check your GPUs. NVIDIA CUDA images for podman can also be quite useful for testing whether everything works as intended.
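Once the packages are emerged and the kernel modules are loaded, a quick sanity check of the driver might look like this; nvidia-smi ships with x11-drivers/nvidia-drivers (the sys-process/nvtop package name is the usual Gentoo location, adjust if your repository differs):

```shell
# Verify the NVIDIA kernel modules are loaded
lsmod | grep nvidia

# Query the driver: lists each GPU, the driver version and the CUDA version
nvidia-smi

# Optional: interactive GPU monitor (emerge sys-process/nvtop first)
nvtop
```

If nvidia-smi reports your GPUs here, the driver side is done and any remaining problems are on the container side.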
nvidia-container-toolkit
Container Device Interface (CDI) is a specification for container runtimes like podman. It standardizes access to devices like the GPU by the container runtime.
If you list the NVIDIA devices in /dev directly after the install, you will see something similar to this (the output will vary depending on your GPU count):
> ls -l /dev/nv*
/dev/nvidia0
/dev/nvidia1
/dev/nvidia2
/dev/nvidia3
/dev/nvidia4
/dev/nvidia5
/dev/nvidia6
/dev/nvidia7
/dev/nvidiactl
/dev/nvidia-uvm
/dev/nvidia-uvm-tools
/dev/nvram

If you reboot and list them again, you will find that most of these devices are not there anymore. This is a problem, because these CDI devices are needed for the nvidia-container-toolkit to function with the container runtime (podman).
Directly after the install, the setup script will run the necessary command to generate the CDI devices, but those are not persistent across reboots, so we need to make sure that they are generated on boot.
I opted to set up a systemd service unit for this:
sudo nvim /etc/systemd/system/nvidia-cdi.service

Add the following service unit:
[Unit]
Description=Generate NVIDIA CDI configuration
[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-ctk cdi generate --output="/etc/cdi/nvidia.yaml"
[Install]
WantedBy=multi-user.target

Enable the service and start the generation:
sudo systemctl daemon-reload
sudo systemctl enable --now nvidia-cdi.service

After a restart you will see that the necessary devices were created.
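To confirm that the unit actually produced a usable CDI spec, nvidia-ctk can list the device names it generated (the exact GPU indices shown depend on your hardware):

```shell
# List all devices known from the generated CDI spec in /etc/cdi/
nvidia-ctk cdi list

# Expect entries such as:
#   nvidia.com/gpu=0
#   nvidia.com/gpu=1
#   nvidia.com/gpu=all
```

The nvidia.com/gpu=all entry is the one we will hand to podman-compose below.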
podman-compose
podman-compose.yaml
The next step varies depending on the service/container you want to provide, so I will give a basic example.
First you need to log in as the user that should run the container (depending on whether you go rootless or root). I will go rootless here and use a dummy user called dummy.
mkdir -p ~/pods/podA
nvim ~/pods/podA/podman-compose.yaml

Next you have to define the service as well as allow it access to the GPU:
---
services:
  containerA:
    image: some/image
    restart: unless-stopped
    networks:
      - backend
    volumes:
      - volume:/some/path
    ############################# RELEVANT START
    devices:
      - nvidia.com/gpu=all
    ############################# RELEVANT STOP
...

The relevant part that you can spot here is listed under devices. Notice the difference to a corresponding docker-compose.yaml file, in which you would provide something along these lines (also see the docker docs):
---
services:
  containerA:
    image: some/image
    restart: unless-stopped
    networks:
      - backend
    volumes:
      - volume:/some/path
    ############################# RELEVANT START
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 8
              capabilities: [gpu]
    ############################# RELEVANT STOP
...

NVIDIA GPU in rootless container
To use an NVIDIA GPU in a rootless container on Gentoo we need an extra step. Edit the file ~/.config/containers/containers.conf and add the following setting to enable sharing user groups inside containers (from the Gentoo Wiki):
[containers]
annotations=["run.oci.keep_original_groups=1",]

If you want more background information, check RedHat and man crun.
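With the annotation in place, a rootless smoke test can be run against one of the NVIDIA CUDA images mentioned earlier. The image tag below is just an example; pick one that matches your driver's CUDA version:

```shell
# As the rootless user: pass all GPUs into the container via CDI and
# run nvidia-smi inside it. Success means the full chain
# (driver -> CDI spec -> podman -> container) works.
podman run --rm --device nvidia.com/gpu=all \
    docker.io/nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```

If this prints the same GPU table as on the host, the rootless setup is complete and we can move on to starting the stack at boot.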
Automatically start container on boot
Setting up systemd units for user
To automatically start the pod (containers) on boot we need to set up the podman-compose unit files first:
sudo podman-compose systemd -a create-unit

You can check if the service podman-compose@.service now exists for your user after reloading the systemd daemon:
systemctl --user daemon-reload
systemctl --user list-unit-files | grep podman-compose

Register compose stack for user
Next we register the compose stack for the user. Go to your project directory:
cd ~/pods/podA

Now register the compose stack with podman-compose:
podman-compose systemd -a register

You can check how it was set up and whether it was successful:
> ls ~/.config/containers/compose/projects/podA.env
> bat ~/.config/containers/compose/projects/podA.env
───────┬──────────────────────────────────────────────────────────────────
       │ File: /home/dummy/.config/containers/compose/projects/podA.env
───────┼──────────────────────────────────────────────────────────────────
   1   │ COMPOSE_PROJECT_DIR=/home/dummy/pods/podA
   2   │ COMPOSE_FILE=podman-compose.yaml
   3   │ COMPOSE_PATH_SEPARATOR=:
   4   │ COMPOSE_PROJECT_NAME=podA
───────┴──────────────────────────────────────────────────────────────────

Customize the service
Now you could already manage your pod with systemd:
systemctl --user daemon-reload
systemctl --user start 'podman-compose@podA.service'
systemctl --user stop 'podman-compose@podA.service'

But to prevent a possible race condition on computer start, we want to make sure that the service only starts after the CDI files have already been generated by the service we created above. So we will modify the systemd service unit with an override:
systemctl --user edit 'podman-compose@podA.service'

We will make the following override:
### Editing /home/dummy/.config/systemd/user/podman-compose@podA.service.d/override.conf
### Anything between here and the comment below will become the contents of the drop-in file
[Unit]
After=nvidia-cdi.service
### Edits below this comment will be discarded
### /etc/xdg/systemd/user/podman-compose@.service
# # /etc/systemd/user/podman-compose@.service
#
# [Unit]
# Description=%i rootless pod (podman-compose)
#
# [Service]
# Type=simple
# EnvironmentFile=%h/.config/containers/compose/projects/%i.env
# ExecStartPre=-/usr/lib/python-exec/python3.12/podman-compose up --no-start
# ExecStartPre=/usr/bin/podman pod start pod_%i
# ExecStart=/usr/lib/python-exec/python3.12/podman-compose wait
# ExecStop=/usr/bin/podman pod stop pod_%i
#
# [Install]
# WantedBy=default.target
This makes sure that the service is only started after nvidia-cdi.service has already run. If you want, you can also add a Before= line to nvidia-cdi.service under its description, but this should not be necessary.
So we now have the system-wide service unit for generating the CDI devices on startup. Once that has finished, our user service unit will start the pod (the containers) rootless.
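You can verify that systemd picked up the drop-in by printing the merged unit; this assumes the instance name matches the project name podA from the env file above:

```shell
# Show the template unit plus our drop-in (override.conf) for the podA instance;
# the [Unit] After=nvidia-cdi.service line should be listed below the original file
systemctl --user cat 'podman-compose@podA.service'
```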
Enable lingering
Because we want to start the containers when the computer boots, even if the user hasn't logged in yet, we enable lingering for our dummy user:
loginctl enable-linger dummy

You can check if it is enabled by issuing:
loginctl list-users

Enable the user service
Now we can enable the user service and test our setup:
systemctl --user daemon-reload
systemctl --user enable --now 'podman-compose@podA.service'
systemctl --user status 'podman-compose@podA.service'

After a restart you can check the status of your pod with one of the following possibilities:
podman pod ls
podman pod stats 'pod_podA'
podman pod logs --tail=10 -f 'pod_podA'
cd ~/pods/podA &&\
podman-compose ps
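Finally, to confirm the GPU is actually visible inside the running container, you can exec into it. The container name below assumes podman-compose's usual <project>_<service>_1 naming scheme; adjust it to whatever podman ps reports on your system:

```shell
# Look up the exact container name first
podman ps --format '{{.Names}}'

# Run nvidia-smi inside the container (assuming the name podA_containerA_1)
podman exec -it podA_containerA_1 nvidia-smi
```

If nvidia-smi works here after a clean reboot, the whole chain — CDI generation, lingering, and the user service — is set up correctly.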