Caching Docker Rails images
This post shows a simple way of improving the Docker image build process. The technique reduces both required storage and build time.
Problem
Docker images are built from layers: each command in a Dockerfile creates a layer. Read more about images, layers and storage drivers in the Docker docs.
Usually all Rails Dockerfiles have 3 common parts as layers:
- installing gems (RUN bundle install)
- copying all files
- precompiling assets
Example Dockerfile (GitHub):
FROM ruby:2.5.1-alpine
WORKDIR /app
COPY Gemfile Gemfile.lock ./
RUN bundle --deployment --without development test
COPY . .
ENV RAILS_ENV production
RUN bundle exec rake assets:precompile
In this setup, whenever a source file changes, COPY . . and all higher layers, including rake assets:precompile, have to be rebuilt from scratch.
It is even worse when Gemfile or Gemfile.lock is changed, as bundle install starts from scratch with no cache.
It is unacceptable that adding a blank line to the code produces a big artifact and restarts time-consuming tasks from scratch.
What we want is a way to add incremental changes as small diffs: patches, not replacements.
Master image
The simplest solution is to create a master (bootstrap) image that incremental images are built on.
First, create and tag the master build. In my repository I use the Dockerfile Dockerfile.production.simplified and tag the image as rails-example-app-master:
$ docker build -f Dockerfile.production.simplified . -t rails-example-app-master
Then create a copy of your Dockerfile and change the first line, the FROM command, to:
FROM rails-example-app-master
I saved it as Dockerfile.production.simplified.cached:
FROM rails-example-app-master
WORKDIR /app
COPY Gemfile Gemfile.lock ./
RUN bundle --deployment --without development test
COPY . .
ENV RAILS_ENV production
RUN bundle exec rake assets:precompile
Let’s try it by building a new image with a newly added asset. I use a 1.4MB picture from NASA:
$ wget https://apod.nasa.gov/apod/image/1808/heic1404b1920.jpg -P app/assets/images
In my case I build the image using:
$ docker build -f Dockerfile.production.simplified.cached . -t rails-example-app-cached-image
What happens?
All commands from the Dockerfile are executed again. However, this time a lot of data is already loaded in the image. Let’s look at some of the commands in detail:
- RUN bundle --deployment --without development test - all gems are already installed in the parent image. Because there’s nothing to do, bundle quickly exits with no changes.
- COPY . . - internally Docker is able to store only the difference when calling COPY. All files, except for the new asset, are already in the image. Only the difference (the new image file) is stored in this layer.
- RUN bundle exec rake assets:precompile - old assets are already precompiled, so there’s no need to do that again. The image that was added is precompiled and the result is stored.
We can inspect the layers using the docker history command.
In my case it looks like this:
$ docker history rails-example-app-cached-image
IMAGE CREATED CREATED BY SIZE
6450cf8b0 18 seconds ago /bin/sh -c #(nop) ENV RAILS_LOG_TO_STDO 0B
d0f791c49 20 seconds ago /bin/sh -c #(nop) ENV RAILS_SERVE_STATI 0B
a4a7779e4 22 seconds ago /bin/sh -c bundle exec rake assets:preco 1.74MB
2b77d3b1a 28 seconds ago /bin/sh -c #(nop) ENV RAILS_ENV=product 0B
6d9f380d3 30 seconds ago /bin/sh -c #(nop) COPY dir:370f291eb553b 1.51MB
6892c696a 32 seconds ago /bin/sh -c bundle --deployment --without 90B
24d2f3c52 35 seconds ago /bin/sh -c #(nop) COPY multi:5f12d01bf90 0B
e5a2749c0 37 seconds ago /bin/sh -c #(nop) WORKDIR /app 0B
3bf1c6640 39 seconds ago /bin/sh -c apk add --no-cache --update b 853kB
869f40cb4 4 minutes ago /bin/sh -c #(nop) ENV RAILS_LOG_TO_STDO 0B
947876927 4 minutes ago /bin/sh -c #(nop) ENV RAILS_SERVE_STATI 0B
3c9a21fc8 4 minutes ago /bin/sh -c bundle exec rake assets:preco 26.3MB
e4248596b 5 minutes ago /bin/sh -c #(nop) ENV RAILS_ENV=product 0B
32639ec16 5 minutes ago /bin/sh -c #(nop) COPY dir:bcef82e4178ba 52.8MB
f4a48e1f7 About an hour ago /bin/sh -c bundle --deployment --without 121MB
993849839 About an hour ago /bin/sh -c #(nop) COPY multi:5f12d01bf90 6.93kB
1b0cd9df5 About an hour ago /bin/sh -c #(nop) WORKDIR /app 0B
7e368ce8c About an hour ago /bin/sh -c apk add --no-cache --update b 202MB
d82225343 3 days ago /bin/sh -c #(nop) CMD ["irb"] 0B
<missing> 3 days ago /bin/sh -c mkdir -p "$GEM_HOME" && chmod 0B
<missing> 3 days ago /bin/sh -c #(nop) ENV PATH=/usr/local/b 0B
<missing> 3 days ago /bin/sh -c #(nop) ENV BUNDLE_PATH=/usr/ 0B
<missing> 3 days ago /bin/sh -c #(nop) ENV GEM_HOME=/usr/loc 0B
<missing> 3 days ago /bin/sh -c set -ex && apk add --no-cac 57.8MB
<missing> 3 days ago /bin/sh -c #(nop) ENV BUNDLER_VERSION=1 0B
<missing> 7 weeks ago /bin/sh -c #(nop) ENV RUBYGEMS_VERSION= 0B
<missing> 7 weeks ago /bin/sh -c #(nop) ENV RUBY_DOWNLOAD_SHA 0B
<missing> 7 weeks ago /bin/sh -c #(nop) ENV RUBY_VERSION=2.5. 0B
<missing> 7 weeks ago /bin/sh -c #(nop) ENV RUBY_MAJOR=2.5 0B
<missing> 7 weeks ago /bin/sh -c mkdir -p /usr/local/etc && { 45B
<missing> 7 weeks ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 7 weeks ago /bin/sh -c #(nop) ADD file:6ee19b92d5cb1 4.2MB
Layer 869f40cb4 and all the lower layers (including the <missing> layers from the ruby:2.5.1-alpine image) come from the master build.
Layers 3bf1c664 and higher were created during the child image build.
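The boundary between master and child layers can also be found without reading the listing by eye. The sketch below is a minimal illustration, not part of the original setup: child_layers is a hypothetical helper name, and it assumes the child image was built on the master image that is currently tagged, so the master's top layer ID appears in the child's history.

```shell
# child_layers TOP_ID: read layer IDs (newest first, one per line) from stdin
# and print only the ones stacked above TOP_ID, the top layer of the master.
# If TOP_ID never appears, every ID is printed.
child_layers() {
  awk -v top="$1" '$1 == top { exit } { print $1 }'
}

# Usage, assuming the image names used in this post:
#   master_top="$(docker history -q rails-example-app-master | head -n 1)"
#   docker history -q rails-example-app-cached-image | child_layers "$master_top"
```

docker history -q prints only the layer IDs, which keeps the parsing trivial.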
To verify that only the necessary diff was stored in the layers, take a look at the last column of the listing, which shows the layer size:
6450cf8b 18 seconds ago /bin/sh -c #(nop) ENV RAILS_LOG_TO_STDOUT=t… 0B
d0f791c4 20 seconds ago /bin/sh -c #(nop) ENV RAILS_SERVE_STATIC_FI… 0B
a4a7779e 22 seconds ago /bin/sh -c bundle exec rake assets:precompile 1.74MB
2b77d3b1 28 seconds ago /bin/sh -c #(nop) ENV RAILS_ENV=production 0B
6d9f380d 30 seconds ago /bin/sh -c #(nop) COPY dir:370f291eb553b0623… 1.51MB
6892c696 32 seconds ago /bin/sh -c bundle --deployment --without dev… 90B
24d2f3c5 35 seconds ago /bin/sh -c #(nop) COPY multi:5f12d01bf9056d0… 0B
e5a2749c 37 seconds ago /bin/sh -c #(nop) WORKDIR /app 0B
3bf1c664 39 seconds ago /bin/sh -c apk add --no-cache --update build… 853kB
Some operations that had nothing to do, like bundle or apk add, are not 0B in size and leave a small trace, but it is negligible.
The most important part is that the COPY . . (6d9f380d, 1.51MB) and rake assets:precompile (a4a7779e, 1.74MB) layers are small, reflecting the size of the newly added files.
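To put a single number on the child image's overhead, the sizes in the last column can be added up. This is a rough sketch: sum_sizes is a hypothetical helper, it assumes the decimal units docker prints (1kB = 1000B), and because the printed sizes are already rounded the total is approximate.

```shell
# sum_sizes: read docker-style size strings (e.g. 90B, 853kB, 1.74MB) from
# stdin, one per line, and print their total in bytes.
sum_sizes() {
  awk '
    {
      n = $1 + 0                       # leading numeric part, e.g. 1.74
      u = $1; sub(/^[0-9.]+/, "", u)   # remaining unit suffix, e.g. MB
      m = 1
      if (u == "kB") m = 1000
      else if (u == "MB") m = 1000000
      else if (u == "GB") m = 1000000000
      total += n * m
    }
    END { printf "%.0f\n", total }
  '
}

# Usage: sum the 9 layers added by the child build (newest first):
#   docker history --format '{{.Size}}' rails-example-app-cached-image \
#     | head -n 9 | sum_sizes
```

For the nine child layers listed above this comes to roughly 4.1MB, against 26.3MB for the asset layer alone in the master build.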
Issue with incremental COPY
At the moment there’s an open bug, #21950, in Docker that describes a problem with incremental COPY on some storage drivers.
It affects drivers such as aufs and overlay2. The old overlay driver works fine.
I suggest using the overlay storage driver on the build server until this gets fixed.
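A build script can check which storage driver the daemon uses before relying on small COPY layers. The sketch below only encodes this post's claim that overlay is unaffected; the function names are hypothetical, and treating every other driver as possibly affected is a conservative assumption rather than something verified against the bug report.

```shell
# copy_is_incremental DRIVER: succeed only for the driver this post reports
# as storing incremental COPY diffs correctly (the old "overlay").
copy_is_incremental() {
  case "$1" in
    overlay) return 0 ;;
    *)       return 1 ;;
  esac
}

# warn_if_affected DRIVER: print a warning on stderr for any other driver.
warn_if_affected() {
  if ! copy_is_incremental "$1"; then
    echo "warning: storage driver '$1' may store full copies in COPY layers" >&2
  fi
}

# Usage on the build server:
#   warn_if_affected "$(docker info --format '{{.Driver}}')"
```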
Summary
Using a master image improves both build speed and storage usage.
The only challenge left is to keep the master build up to date, so that the cached images built on top of it stay small. It’s a good idea to automate this process.
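One possible way to automate this is to rebuild the master image whenever Gemfile.lock changes, since gems make up the largest layer in the master build. This is only a sketch of one policy: the stamp file name and the function names are hypothetical, and a scheduled rebuild would work just as well.

```shell
# Hypothetical stamp file holding the Gemfile.lock checksum that the current
# master image was built from.
STAMP_FILE="${STAMP_FILE:-.master-image.stamp}"

# lock_checksum FILE: print a checksum of FILE (POSIX cksum: CRC and size).
lock_checksum() {
  cksum < "$1"
}

# master_is_stale LOCKFILE: succeed when there is no stamp yet or the
# lockfile has changed since the stamp was written.
master_is_stale() {
  [ ! -f "$STAMP_FILE" ] || [ "$(lock_checksum "$1")" != "$(cat "$STAMP_FILE")" ]
}

# rebuild_master_if_stale LOCKFILE: rebuild and re-stamp only when needed.
rebuild_master_if_stale() {
  if master_is_stale "$1"; then
    docker build -f Dockerfile.production.simplified . -t rails-example-app-master &&
      lock_checksum "$1" > "$STAMP_FILE"
  fi
}

# Usage, before building the cached image:
#   rebuild_master_if_stale Gemfile.lock
```

The stamp is written only after a successful build, so a failed build is retried on the next run.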
EDIT 1/11/2018
In the next blog post I wrote about an improved solution using the new, experimental RUN --mount feature.