
Gotchas

1. A note about performance

By abstracting away hardware concerns, containers offer an easy, friendly user experience. The cost of this ease of use is that containerized applications often will not match the performance of natively built applications, which can, for example, call libraries optimized for the specific hardware and underlying architecture.

2. Competing for --userns: Multi-node allocations

The --userns flag confers the privileges needed to build a container to only one node in an allocation. If you use salloc --userns ... to request a multi-node allocation, you'll find that you are only able to build containers on one of the nodes in that allocation.

For example, on pascal I requested an allocation with two nodes and was granted pascal7 and pascal8:

```
janeh@pascal83:~$ salloc -N 2 -t 10 --userns
salloc: Pending job allocation 1535493
salloc: job 1535493 queued and waiting for resources
salloc: job 1535493 has been allocated resources
salloc: Granted job allocation 1535493
salloc: Waiting for resource configuration
salloc: Nodes pascal[7-8] are ready for job
```

While I was able to run podman build on pascal8 successfully, I hit the following error messages on pascal7:

```
janeh@pascal7:~$ podman build -f Dockerfile.ubuntu -t ubuntu_test
cannot clone: Invalid argument
user namespaces are not enabled in /proc/sys/user/max_user_namespaces
Error: could not get runtime: cannot re-exec process
```
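
To tell which node in an allocation received the user-namespace privileges, you can inspect the kernel setting that the error above points at. This is a minimal sketch assuming only standard Linux behavior (a value of 0 means user namespaces cannot be created there); the single-node workaround simply follows from --userns covering only one node:

```
# Run on each node of the allocation: a value of 0 means user
# namespaces are disabled and `podman build` will fail as above.
cat /proc/sys/user/max_user_namespaces

# If you only need to build, a single-node allocation avoids the issue:
salloc -N 1 -t 10 --userns
```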

3. Competing for --userns: Serially scheduled nodes

Avoid building containers on serially scheduled nodes, like those on oslic or boraxo. Only one user per node can successfully use --userns to build containers at a time, so you may run into issues if you are competing for resources on a shared node.

For example, I saw the following error messages while trying to build a container in an allocation on boraxo8, after another allocation had already been started on boraxo8 via salloc --userns ... under a second username:

```
janeh@boraxo3:~$ salloc -n1 -ppdebug -t 20 --userns
salloc: Granted job allocation 3387202
salloc: Waiting for resource configuration
salloc: Nodes boraxo8 are ready for job
janeh@boraxo8:~$ ./enable-podman.sh
janeh@boraxo8:~$ podman build -f Dockerfile.ubuntu -t ubuntuimage
ERRO[0000] cannot find mappings for user janeh: No subuid ranges found for user "janeh" in /etc/subuid
ERRO[0000] cannot find mappings for user janeh: No subuid ranges found for user "janeh" in /etc/subuid
STEP 1: FROM ubuntu:18.04
Error: error creating build container: error creating container: error creating read-write layer with ID "9e337546787915ffac51bbb2bd0e2e244b21baf26ca10a6ed33949eb54137e60": there might not be enough IDs available in the namespace (requested 0:65534 for /var/tmp/janeh/config/containers/storage/vfs/dir/9e337546787915ffac51bbb2bd0e2e244b21baf26ca10a6ed33949eb54137e60/etc/gshadow): lchown /var/tmp/janeh/config/containers/storage/vfs/dir/9e337546787915ffac51bbb2bd0e2e244b21baf26ca10a6ed33949eb54137e60/etc/gshadow: invalid argument
```
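
The ERRO lines indicate that no subordinate UID ranges are currently assigned to the user, because the other allocation already holds the node's --userns mappings. Assuming the site populates /etc/subuid (and its companion /etc/subgid) per user, as those messages suggest, a quick hypothetical check before building might look like:

```
# No output here means your user currently has no subordinate ID
# mappings on this node, and `podman build` will fail as above.
grep "^${USER}:" /etc/subuid
grep "^${USER}:" /etc/subgid
```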

4. Memory issues with large container images

When working with sufficiently large container images, you can run into memory issues. For example, a process may be terminated by the Out of Memory (OOM) killer, or you may see "FATAL" error messages as shown here:

```
janeh@pascal32:/p/lustre1/janeh$ singularity build oom-build-030122.img docker://ecpe4s/e4s-gpu
INFO: Starting build...
Getting image source signatures

(...)

INFO: Creating SIF file...
FATAL: While performing build: while creating squashfs: create command failed: exit status 1: Write failed because No space left on device

FATAL ERROR:Failed to write to output filesystem
```

Note that this example uses singularity rather than podman, but memory issues with large container images are possible with both. The threshold for encountering memory issues is lower when using podman with vfs than when using either singularity or podman with overlayfs.
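
The squashfs step in the build above writes to singularity's temporary directory, which on compute nodes is often a RAM-backed tmpfs; that is one way a large image can turn into a memory problem or a "No space left on device" failure. As a sketch of one possible mitigation, using singularity's standard SINGULARITY_TMPDIR and SINGULARITY_CACHEDIR environment variables (the Lustre paths here are illustrative, not a site recommendation):

```
# Redirect singularity's temp and cache directories to a filesystem
# large enough for the unpacked image (paths are illustrative):
export SINGULARITY_TMPDIR=/p/lustre1/janeh/tmp
export SINGULARITY_CACHEDIR=/p/lustre1/janeh/cache
mkdir -p "$SINGULARITY_TMPDIR" "$SINGULARITY_CACHEDIR"

singularity build oom-build-030122.img docker://ecpe4s/e4s-gpu
```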