Editing your app’s Dockerfile
To set up custom programs for your app, such as third-party tools, edit your Dockerfile, which contains the setup instructions used to build your container.
A helpful resource is the Dockerfile Best Practices guide.
The first two lines of the Dockerfile look something like:
FROM kbase/sdkbase2:latest
MAINTAINER KBase Developer
Change KBase Developer to the email address of the person who will be responsible for the upkeep of the module.
The base KBase Docker image is an Ubuntu-based image with the dependencies necessary for the JSON-RPC server in the supported language, as well as a core set of KBase API clients. You will need to add any further dependencies, including the installation of whatever tool you are wrapping, to build a custom Docker image that can run your module.
For example:
RUN \
git clone https://github.com/voutcn/megahit.git && \
cd megahit && \
git checkout tags/v1.0.3 && \
make
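System-level dependencies can be added in the same way. The following is a minimal sketch, assuming that apt-get is available in the base image (it is described above as Ubuntu-based) and that the tool needs a compiler toolchain and zlib headers. The package list is purely illustrative, so substitute whatever your tool actually requires.

```dockerfile
# Assumption: apt-get is available because the base image is Ubuntu-based.
# The package names below are examples only.
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential zlib1g-dev && \
    rm -rf /var/lib/apt/lists/*
```

Removing the apt package lists in the same RUN command keeps them out of the resulting image.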
Note: you do not have to add any lines to the Dockerfile for installing your SDK Module code, as this will happen automatically. The contents of your SDK Module git repo will be added at /kb/module.
It is also important to note that each command in the Dockerfile generates a layer in the Docker image. To produce a streamlined image that deploys to a worker node more quickly and shortens the build cycle, remove any large, extraneous files, such as a tarball from which you installed a tool, in the same command that created them.
To accomplish this, commands in your Dockerfile should look like this:
RUN \
wget blah.com/tarball.tgz && \
tar xzf tarball.tgz && \
cd tarball && \
make && \
cd .. && \
rm -rf tarball.tgz
Here, each && chains shell commands together so that the next one runs only if the previous one succeeded, and the backslash continues the same, single-line command over multiple lines.
Avoid this:
RUN wget blah.com/tarball.tgz
RUN tar xzf tarball.tgz
RUN cd tarball
RUN make
RUN cd ..
RUN rm -rf tarball.tgz
Each call to RUN creates a separate Docker image layer that sits on top of the previous one. Earlier layers are read-only, so deleting a file in a later layer does not remove it from the layers beneath or shrink the image. In addition, each RUN starts in a fresh shell, so a cd in one RUN does not carry over to the next. In general, you will want one RUN command for each discrete service that you set up in your container.
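As a rough sketch of the one-RUN-per-service guideline, the fragment below installs two tools, each in its own RUN command with its own cleanup. The tool names and URLs are hypothetical placeholders, and the sketch assumes that wget and make are already present in the image.

```dockerfile
# Hypothetical tools and URLs, for illustration only.

# Tool A: download, build, and delete the tarball within a single layer.
RUN wget https://example.com/tool-a.tgz && \
    tar xzf tool-a.tgz && \
    cd tool-a && \
    make && \
    cd .. && \
    rm -f tool-a.tgz

# Tool B: a separate RUN, so changing this block does not rebuild Tool A.
RUN wget https://example.com/tool-b.tgz && \
    tar xzf tool-b.tgz && \
    cd tool-b && \
    make && \
    cd .. && \
    rm -f tool-b.tgz
```

Keeping each tool in its own RUN means an edit to one block leaves the cached layer for the other intact, while the cleanup inside each block keeps the tarballs out of the image.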
Final note: Docker rebuilds everything from the first detected change in a Dockerfile onward, but pulls everything before that point from its cache. If you pull in external data using RUN and a command like git clone or wget, changes in those sources will not automatically be reflected in a rebuilt Docker image unless the Dockerfile changes at or before that import.
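One way to force such a refresh is to make the Dockerfile itself change just before the import. The sketch below is one possible approach, not a KBase convention: it introduces a hypothetical MEGAHIT_VERSION variable ahead of the clone, reusing the repository and tag from the earlier example.

```dockerfile
# Hypothetical version marker. Editing this value changes the Dockerfile at
# this line, so Docker re-runs this instruction and everything after it
# instead of reusing the cached layers.
ENV MEGAHIT_VERSION=v1.0.3

# The shell expands $MEGAHIT_VERSION at build time because ENV values are
# exported into the environment of later RUN commands.
RUN git clone https://github.com/voutcn/megahit.git && \
    cd megahit && \
    git checkout tags/$MEGAHIT_VERSION && \
    make
```

Pinning to a tag also keeps builds reproducible. If you only need a one-off refresh when building the image directly with Docker, docker build --no-cache ignores the cache entirely.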