Problem description

tl;dr

Files created in a host volume by the root user in a Docker container are also owned by root on the host. Use Docker user namespaces to map your host user to the container root.

Docker host volumes

In local development with languages and frameworks that support hot code reloading, it is common to share a directory on your host system with the Docker container. This can be achieved using a built-in Docker feature called host volumes. This way you don’t have to rebuild the image every time you make a change.

Docker is a Linux-only solution based on Linux containers. It is possible to run Docker on macOS or Windows hosts, but underneath it uses a virtual machine running Linux with the Docker daemon. On these systems Docker host volumes are implemented using NFS or some other custom solution.

When Docker runs natively on a Linux host then it can directly access files on the host filesystem.

Host volumes on Linux hosts

By default there’s no mapping of permissions and ownership between the Linux host and the Docker container. In a Docker container you see the same file permissions on host volumes as on the host. What’s more interesting, and problematic, is that it works the other way too.

Example

Let’s create a directory with 2 files. All commands in this example are executed on the Linux host.

$ mkdir example
$ cd example
$ touch file1 file2
$ chmod 777 file2
$ ls -l
total 0
-rw-rw-r-- 1 jan jan 0 Jun 10 10:22 file1
-rwxrwxrwx 1 jan jan 0 Jun 10 10:22 file2

Now check how a Docker container sees them.

$ docker run --rm -v $PWD:/example -w /example alpine /bin/ls -l
total 0
-rw-rw-r--    1 1000     1000             0 Jun 10 17:22 file1
-rwxrwxrwx    1 1000     1000             0 Jun 10 17:22 file2

You can see that while the permissions (-rw-rw-r-- and -rwxrwxrwx) are unchanged, the ownership columns have changed from jan to 1000.

Although the output is different, the actual ownership hasn’t changed: 1000 is my UID and GID.

This can be verified using the id command:

$ id
uid=1000(jan) gid=1000(jan) groups=1000(jan),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),113(lpadmin),128(sambashare),999(docker)

In the Docker container my user and group jan don’t exist, so ls is not able to map the file ownership IDs to user and group names.
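One way to confirm that only the labels differ is to compare numeric IDs on the host. The sketch below assumes GNU coreutils stat (its -c format flag); owner_ids and current_ids are hypothetical helper names, not part of the original post.

```shell
#!/bin/sh
# Print the numeric owner of a file as UID:GID (assumes GNU coreutils stat).
owner_ids() {
  stat -c '%u:%g' "$1"
}

# Print the numeric IDs of the current user in the same UID:GID form.
current_ids() {
  printf '%s:%s\n' "$(id -u)" "$(id -g)"
}

# Example: compare a freshly created file with the current user.
tmpfile=$(mktemp)
owner_ids "$tmpfile"
current_ids
rm -f "$tmpfile"
```

Running ls -ln in the example directory shows the same numeric IDs without resolving them to names, which is what the container effectively displays.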

Most Docker images use root as the default user. Alpine is no exception here:

$ docker run --rm alpine whoami
root

So what happens if we create a file in the host volume?

$ docker run --rm -v $PWD:/example -w /example alpine touch file3
$ ls -l
total 0
-rw-rw-r-- 1 jan  jan  0 Jun 10 10:22 file1
-rwxrwxrwx 1 jan  jan  0 Jun 10 10:22 file2
-rw-r--r-- 1 root root 0 Jun 10 10:40 file3

The file is owned by root:root!

This behaviour is usually unexpected. For a practical example please see the previous post about bootstrapping Rails with Docker.

Solution

Thanks to user namespaces it is possible to remap UIDs and GIDs in containers.

Namespacing isolates Docker files such as images and container files in /var/lib/docker/.

Existing images and containers will not be available after enabling the user namespace.

From the docs:

Enabling userns-remap effectively masks existing image and container layers, as well as other Docker objects within /var/lib/docker/. This is because Docker needs to adjust the ownership of these resources and actually stores them in a subdirectory within /var/lib/docker/. It is best to enable this feature on a new Docker installation rather than an existing one.

User namespace and all the other namespaces were created to protect hosts, limiting processes in containers from accessing host resources.

It is usually done by assigning some very high UID and GID to be sure that they don’t interfere with IDs on the host.

In this scenario we want to use (abuse?) this system to map root (with ID 0) to our local Linux user.

Step 1 - create user namespace

Use id to find out your UID and your effective (primary) GID. On popular distributions like Ubuntu both are usually equal to 1000.

$ id
uid=1000(jan) gid=1000(jan) groups=1000(jan),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),113(lpadmin),128(sambashare),999(docker)
$ id -u
1000
$ id -g
1000

Edit /etc/subuid and /etc/subgid to set the user and group namespaces respectively. On my Ubuntu setup these files exist by default and define a namespace named after my username.

The namespace definition format is described in more detail in the docs. It is a colon-separated triplet describing:

  • namespace name
  • first ID in the namespace
  • number of possible IDs in the namespace

In my case both /etc/subuid and /etc/subgid have the same content:

jan:1000:65536

The count of 65536 has no special significance here; it is simply a large enough range.
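To make the triplet concrete, here is a minimal sketch of how a container ID translates to a host ID under one such entry: container ID n lands on first ID + n, as long as n is below the count. The function name map_container_id is a hypothetical helper for illustration, and the sketch handles only a single entry.

```shell
#!/bin/sh
# Translate a container UID/GID to a host ID for one /etc/subuid-style entry.
# Usage: map_container_id ENTRY CONTAINER_ID
# ENTRY has the form name:first_id:count, e.g. jan:1000:65536.
map_container_id() {
  entry=$1
  container_id=$2
  first_id=$(printf '%s' "$entry" | cut -d: -f2)
  count=$(printf '%s' "$entry" | cut -d: -f3)
  # IDs outside the range have no mapping at all.
  if [ "$container_id" -lt 0 ] || [ "$container_id" -ge "$count" ]; then
    echo "unmapped"
    return 1
  fi
  echo $((first_id + container_id))
}

map_container_id "jan:1000:65536" 0     # container root -> 1000
map_container_id "jan:1000:65536" 100   # another container user -> 1100
```

With the entry above, container root (ID 0) lands exactly on host ID 1000, which is the effect this post is after.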

Step 2 - enable namespace in Docker

Edit /etc/docker/daemon.json using the following template (you might need to create this file if it doesn’t exist):

{
  "userns-remap": "NAMESPACE_NAME"
}

In my case it is:

{
  "userns-remap": "jan"
}

Now restart Docker. On Ubuntu by default it can be done using service docker restart.
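The configuration and a follow-up check can be sketched as below. The helper names daemon_json and userns_enabled are hypothetical. The check assumes that docker info lists name=userns among its security options once remapping is active, so treat it as a sketch rather than a guaranteed test.

```shell
#!/bin/sh
# Print a minimal daemon.json that enables userns-remap for the given name.
daemon_json() {
  printf '{\n  "userns-remap": "%s"\n}\n' "$1"
}

# Report whether the running daemon lists userns among its security options.
# Needs a running Docker daemon, so it is only defined here, not called.
userns_enabled() {
  docker info --format '{{json .SecurityOptions}}' | grep -q 'name=userns'
}

# Preview the file content. Write it to /etc/docker/daemon.json as root
# only if that file holds no other settings you want to keep.
daemon_json "jan"
```

After restarting Docker, userns_enabled && echo "remapping active" gives a quick confirmation before moving on to the file tests.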

As I mentioned earlier, all your previous containers and images are not available after enabling the namespace. They are masked, not deleted. You can go back to the default namespace by reverting this change.

Test!

Given the file structure from the previous section:

$ ls -l 
total 0
-rw-rw-r-- 1 jan  jan  0 Jun 10 10:22 file1
-rwxrwxrwx 1 jan  jan  0 Jun 10 10:22 file2
-rw-r--r-- 1 root root 0 Jun 10 10:40 file3

Let’s check how Docker perceives the files now:

$ docker run --rm -v $PWD:/example -w /example alpine /bin/ls -l
total 0
-rw-rw-r--    1 root     root             0 Jun 10 17:22 file1
-rwxrwxrwx    1 root     root             0 Jun 10 17:22 file2
-rw-r--r--    1 nobody   nobody           0 Jun 10 17:40 file3

Great! Now my user and group are properly mapped inside the container to root.

Notice that file3, which is owned by root on the host, now shows up as nobody’s: host root has no mapping inside the container’s namespace.
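The kernel exposes the active mapping of a process in /proc/self/uid_map, one line per range in the form "first container ID, first host ID, count". Host IDs outside every range are shown as the overflow ID, 65534 by default, which Alpine names nobody. The sketch below translates a host ID into the container’s view for a single map line; host_to_container_id is a hypothetical helper, and the map line used is an assumption matching the jan:1000:65536 entry.

```shell
#!/bin/sh
# Translate a host UID to what a container sees, given one uid_map line
# of the form "<first container ID> <first host ID> <count>".
# Unmapped IDs are reported as 65534, the kernel's default overflow ID.
host_to_container_id() {
  map_line=$1
  host_id=$2
  # Split the map line into its three fields.
  set -- $map_line
  inside_start=$1
  outside_start=$2
  count=$3
  if [ "$host_id" -ge "$outside_start" ] &&
     [ "$host_id" -lt $((outside_start + count)) ]; then
    echo $((inside_start + host_id - outside_start))
  else
    echo 65534
  fi
}

host_to_container_id "0 1000 65536" 1000   # jan on the host -> 0 (root)
host_to_container_id "0 1000 65536" 0      # host root -> 65534 (nobody)
```

The real map for a container can be read with docker run --rm alpine cat /proc/self/uid_map.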

Final check - write inside the container.

$ docker run --rm -v $PWD:/example -w /example alpine touch file4
$ ls -l
total 0
-rw-rw-r-- 1 jan  jan  0 Jun 10 10:22 file1
-rwxrwxrwx 1 jan  jan  0 Jun 10 10:22 file2
-rw-r--r-- 1 root root 0 Jun 10 10:40 file3
-rw-r--r-- 1 jan  jan  0 Jun 10 11:52 file4

file4 is owned by jan on the host, as expected.

Resources