This chapter differs from the earlier ones: it describes how Docker images are built by using a Dockerfile.
Contents:
• Docker's integrated image building system
• Quick overview of the Dockerfile's syntax
• Dockerfile build instructions
• How Docker stores images
(1) Docker's integrated image building system
A Dockerfile is a text-based build script that contains a sequence of special instructions for building the required images from base images. The instructions inside the Dockerfile can select the base image, install the required application, add configuration and data files, and automatically run services as well as expose those services to the external world. The Dockerfile also offers a great deal of flexibility in how the build instructions are organized and in how the complete build process is presented.
The Docker engine tightly integrates this build process with the help of the docker build subcommand. In the client-server paradigm of Docker, the Docker server (or daemon) is responsible for the complete build process and the Docker command line interface is responsible for transferring the build context, including transferring Dockerfile to the daemon.
Our Dockerfile is made up of two instructions, as shown here:
$ cat Dockerfile
FROM busybox:latest
CMD echo Hello World!!
Let's look at these two instructions:
• The first instruction selects the base image. In this example, we select the busybox:latest image.
• The second instruction, CMD, instructs the container to echo Hello World!!.
Now, let's generate a Docker image from the preceding Dockerfile by calling docker build along with the path of the directory that holds the Dockerfile. We will invoke the docker build subcommand from the directory where we have stored the Dockerfile, so the path is the current directory (.):
$ sudo docker build .
When the build process completes, it displays the following:
Successfully built ************
Let's use this image to launch a container by using the docker run subcommand as follows:
$ sudo docker run ************
Hello World!!
Now let's look at the image details by using the docker images subcommand, as shown here:
$ sudo docker images
The IMAGE (REPOSITORY) and TAG name have been listed as <none>. This is because we did not specify any image or any TAG name when we built this image. You could specify an IMAGE name and optionally a TAG name by using the docker tag subcommand, as shown here:
$ sudo docker tag ************ busyboxplus
The alternative approach is to build the image with an image name during the build time by using the -t option for the docker build subcommand, as shown here:
$ sudo docker build -t busyboxplus .
Since there is no change in the instructions in the Dockerfile, the Docker engine will efficiently reuse the old image that has ID ************ and update the image name to busyboxplus. By default, the build system applies latest as the TAG name. To change this, specify the TAG name after the IMAGE name, with a : separator placed between them. That is, the syntax is <image name>:<tag name>, where <image name> is the name of the image and <tag name> is the name of the tag. For example:
$ sudo docker build -t busyboxplus:mytest .
(2) A quick overview of the Dockerfile's syntax
A Dockerfile is made up of instructions, comments, and empty lines, as shown here:
# Comment
INSTRUCTION arguments
An instruction line in a Dockerfile is made up of two components: the instruction itself, followed by the arguments for the instruction. The instruction is case-insensitive, so it can be written in any case. However, the convention is to use uppercase in order to differentiate it from the arguments. Let's take another look at the content of the Dockerfile in our previous example:
FROM busybox:latest
CMD echo Hello World!!
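A comment line in a Dockerfile must begin with the # symbol. A # symbol that appears after an instruction is treated as part of the argument. In the Docker release described here, if the # symbol is preceded by whitespace, the docker build system treats the line as an unknown instruction and skips it. Later releases are more lenient about leading whitespace, but starting comments in the first column remains the safest choice.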
The docker build system ignores any empty line in the Dockerfile, and so the author of Dockerfile is encouraged to add comments and empty lines to substantially improve the readability of Dockerfile.
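As a minimal sketch of these rules, the Dockerfile from the earlier example could be annotated as follows. The comment wording is only illustrative; the two instructions are the ones used above.

```dockerfile
# Select the base image (a comment line starts with # in the first column)
FROM busybox:latest

# The blank line above is ignored by the build system
CMD echo Hello World!!
```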
(3) The Dockerfile build instructions
In this section, we will introduce the Dockerfile instructions, their syntax, and a few fitting examples.
The FROM instruction
The FROM instruction is the most important one and it is the first valid instruction of a Dockerfile. By default, the docker build system looks in the Docker host for the images. However, if the image is not found in the Docker host, then the docker build system will pull the image from the publicly available Docker Hub Registry. The docker build system will return an error if it can’t find the specified image in the Docker host and the Docker Hub Registry.
The FROM instruction has the following syntax:
FROM <image>[:<tag>]
The Docker release described here allows multiple FROM instructions in a single Dockerfile in order to create multiple images. The Docker build system will pull all the images specified in the FROM instructions. That release does not provide any mechanism for naming the individual images that are generated with the help of multiple FROM instructions, so we strongly discourage using multiple FROM instructions in a single Dockerfile, as damaging conflicts could arise. (Later Docker releases introduced multi-stage builds, which give multiple FROM instructions a different, well-defined role.)
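The FROM syntax above can be sketched as follows. The choice of ubuntu:14.04 is only an example; when the tag is left out, the build system assumes the latest tag.

```dockerfile
# Base image given with an explicit tag
FROM ubuntu:14.04

# Without a tag, the latest tag is assumed, for example:
#   FROM busybox
```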
The MAINTAINER instruction
The MAINTAINER instruction is an informational instruction of a Dockerfile. It lets the author record their details in an image. Docker does not place any restrictions on where the MAINTAINER instruction appears in the Dockerfile. However, it is strongly recommended that you place it after the FROM instruction.
The following is the syntax of the MAINTAINER instruction, where <author's detail> can be any text. However, it is strongly recommended that you use the image author's name and e-mail address, as shown in this code syntax:
MAINTAINER <author's detail>
For example:
MAINTAINER Dr. Peter <peterindia@gmail.com>
The COPY instruction
The COPY instruction enables you to copy the files from the Docker host to the filesystem of the new image. The following is the syntax of the COPY instruction:
COPY <src> ... <dst>
The terms in the preceding syntax mean the following:
• <src>: This is the source file or directory in the build context (in this chapter's examples, the directory from which the docker build subcommand was invoked).
• ...: This indicates that multiple source files can either be specified directly or be specified by wildcards.
• <dst>: This is the destination path for the new image into which the source file or directory will get copied. If multiple files have been specified, then the destination path must be a directory and it must end with a slash /.
Using an absolute path for the destination directory or a file is recommended.
In the following example, we will copy the html directory from the source build context to /var/www/html, which is in the image filesystem, by using the COPY instruction, as shown here:
COPY html /var/www/html
Here is another example of the multiple files (httpd.conf and magic) that will be copied from the source build context to /etc/httpd/conf/, which is in the image filesystem:
COPY httpd.conf magic /etc/httpd/conf/
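The wildcard form mentioned above could look like the following sketch. It assumes that the build context contains one or more files ending in .conf; the destination is the same directory as in the previous example.

```dockerfile
FROM ubuntu:14.04

# Copy every file in the build context whose name ends in .conf.
# Several files may match, so the destination is a directory ending in /
COPY *.conf /etc/httpd/conf/
```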
The ADD instruction
The ADD instruction is similar to the COPY instruction. However, in addition to the functionality supported by the COPY instruction, the ADD instruction can handle TAR files and remote URLs. We can describe the ADD instruction as COPY on steroids.
The following is the syntax of the ADD instruction:
ADD <src> ... <dst>
The arguments of the ADD instruction are very similar to those of the COPY instruction, as shown here:
• <src>: This is either the source directory or the file that is in the build context or in the directory from where the docker build subcommand will be invoked. However, the noteworthy difference is that the source can either be a TAR file stored in the build context or be a remote URL.
• ...: This indicates that the multiple source files can either be specified directly or be specified by using wildcards.
• <dst>: This is the destination path for the new image into which the source file or directory will be copied.
Thus the TAR option of the ADD instruction can be used for copying multiple files to the target image.
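A minimal sketch of the TAR behavior, assuming a hypothetical archive named site.tar in the build context. The URL in the comment is a placeholder rather than a real resource, so that line is left commented out.

```dockerfile
FROM ubuntu:14.04

# A local TAR archive from the build context is unpacked into the destination
# (site.tar is a hypothetical file name)
ADD site.tar /var/www/html/

# A remote URL is downloaded into the destination without being unpacked:
#   ADD http://example.com/some-file.txt /var/www/html/
```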
The ENV instruction
The ENV instruction sets an environment variable in the new image. An environment variable is a key-value pair, which can be accessed by any script or application. Linux applications rely heavily on environment variables for their startup configuration.
The following line forms the syntax of the ENV instruction:
ENV <key> <value>
Here, the code terms indicate the following:
• <key>: This is the environment variable
• <value>: This is the value that is to be set for the environment variable
The following lines give two examples for the ENV instruction, where in the first line DEBUG_LVL has been set to 3 and, in the second line, APACHE_LOG_DIR has been set to /var/log/apache:
ENV DEBUG_LVL 3
ENV APACHE_LOG_DIR /var/log/apache
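To illustrate how a variable set by ENV becomes visible to later build steps, here is a small sketch. The mkdir step is an assumption added for illustration and is not part of the original example.

```dockerfile
FROM ubuntu:14.04

ENV APACHE_LOG_DIR /var/log/apache

# The shell started by RUN sees the variable that ENV set above
RUN mkdir -p "$APACHE_LOG_DIR"
```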
The USER instruction
The USER instruction sets the startup user ID or user name in the new image. By default, containers are launched with root as the user ID or UID. Essentially, the USER instruction changes the default user from root to the one specified in this instruction.
The syntax of the USER instruction is as follows:
USER <UID>|<UName>
The USER instruction accepts either <UID> or <UName> as its argument:
• <UID>: This is a numerical user ID
• <UName>: This is a valid user Name
The following is an example for setting the default user ID at the time of startup to 73. Here 73 is the numerical ID of the user:
USER 73
Although it is recommended that the user ID match an entry in the /etc/passwd file, the user ID can be any numerical value. The user name, however, must match a valid user name in the /etc/passwd file; otherwise the docker run subcommand will fail and display the following error message:
finalize namespace setup user get supplementary groups Unable to find user
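One way to avoid this error is to create the account during the build, before switching to it. The sketch below assumes the busybox image and a hypothetical user name, appuser.

```dockerfile
FROM busybox:latest

# Create the account first so that the name exists in /etc/passwd
# (appuser is a hypothetical user name; -D skips setting a password)
RUN adduser -D appuser

# Containers started from this image now run as appuser instead of root
USER appuser
CMD id
```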
The WORKDIR instruction
The WORKDIR instruction changes the current working directory from / to the path specified by this instruction.
The following line gives the appropriate syntax for the WORKDIR instruction:
WORKDIR <dirpath>
Here, <dirpath> is the path to set as the working directory. The path can be either absolute or relative. A relative path is resolved against the previous path set by the WORKDIR instruction. If the specified directory is not found in the target image filesystem, then the directory will be created.
The following line is a clear example of the WORKDIR instruction in a Dockerfile:
WORKDIR /var/log
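The relative-path rule can be sketched as follows. The RUN pwd step is added only to make the resulting directory visible in the build output.

```dockerfile
FROM busybox:latest

WORKDIR /var
# A relative path is resolved against the previous WORKDIR, giving /var/log
WORKDIR log

# Runs in /var/log, so pwd prints that path during the build
RUN pwd
```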
The VOLUME instruction
The VOLUME instruction creates a directory in the image filesystem, which can later be used for mounting volumes from the Docker host or the other containers.
The VOLUME instruction has two types of syntax, as shown here:
• The first type is either exec or JSON array (all values must be within double quotes (")):
VOLUME ["<mountpoint>"]
• The second type is shell, as shown here:
VOLUME <mountpoint>
In the preceding line, <mountpoint> is the mount point that has to be created in the new image.
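Both forms might be used as in the following sketch. The two mount point paths are assumptions chosen for illustration.

```dockerfile
FROM ubuntu:14.04

# JSON array form: the value must be enclosed in double quotes
VOLUME ["/var/log/apache2"]

# Shell form: the mount point is written as a plain argument
VOLUME /var/www/html
```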
The EXPOSE instruction
The EXPOSE instruction opens up a container network port for communicating between the container and the external world.
The syntax of the EXPOSE instruction is as follows:
EXPOSE <port>[/<proto>] [<port>[/<proto>]...]
Here, the code terms mean the following:
• <port>: This is the network port that has to be exposed to the outside world.
• <proto>: This is an optional field provided for a specific transport protocol, such as TCP and UDP. If no transport protocol has been specified, then TCP is assumed to be the transport protocol.
The EXPOSE instruction allows you to specify multiple ports in a single line.
The following is an example of the EXPOSE instruction inside a Dockerfile exposing the port number 7373 as a UDP port and the port number 8080 as a TCP port. As mentioned earlier, if the transport protocol has not been specified, then the TCP transport is assumed to be the transport protocol:
EXPOSE 7373/udp 8080
The RUN instruction
The RUN instruction executes a command during the build. The general recommendation is to execute multiple commands by using one RUN instruction. This reduces the number of layers in the resulting Docker image, because the Docker system creates a layer for each instruction in the Dockerfile.
The RUN instruction has two types of syntax:
• The first is the shell type, as shown here:
RUN <command>
If this type of syntax is used, then the command is always executed by using /bin/sh -c.
• The second syntax type is either exec or the JSON array, as shown here:
RUN ["<exec>", "<arg-1>", ..., "<arg-n>"]
Within this, the code terms mean the following:
° <exec>: This is the executable to run during the build time.
° <arg-1>, ..., <arg-n>: These are the variable number (zero or more) of arguments for the executable.
Unlike the first type of syntax, this type does not invoke /bin/sh -c. If you prefer the exec (JSON array) type but still need a shell, use your preferred shell as the executable and supply the whole command as a single argument.
For example, RUN ["bash", "-c", "rm -rf /tmp/abc"]
The following example is a Dockerfile that has the instructions for crafting an Apache2 application image on top of the Ubuntu 14.04 base image:
FROM ubuntu:14.04
# Install apache2 package
RUN apt-get update && \
    apt-get install -y apache2 && \
    apt-get clean
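The difference between the two RUN syntax types can be sketched as follows. The echo commands are illustrative only; the point is whether a shell is present to expand $HOME.

```dockerfile
FROM busybox:latest

# Shell form: executed through /bin/sh -c, so the shell expands $HOME
RUN echo $HOME

# Exec form: no shell is involved, so the literal text $HOME is printed
RUN ["echo", "$HOME"]

# Exec form with an explicit shell: the whole command is one argument
RUN ["sh", "-c", "echo $HOME"]
```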
The CMD instruction
The CMD instruction can run any command (or application), which is similar to the RUN instruction. However, the major difference between those two is the time of execution. The command supplied through the RUN instruction is executed during the build time, whereas the command specified through the CMD instruction is executed when the container is launched from the newly created image. However, it can be overridden by the docker run subcommand arguments. When the application terminates, the container will also terminate along with the application and vice versa.
The CMD instruction has three types of syntax, as shown here:
• The first syntax type is the shell type, as shown here:
CMD <command>
• The second type of syntax is exec or the JSON array, as shown here:
CMD ["<exec>", "<arg-1>", ..., "<arg-n>"]
• The third type of syntax is also exec or the JSON array, similar to the previous type. However, this type is used for setting the default parameters for the ENTRYPOINT instruction, as shown here:
CMD ["<arg-1>", ..., "<arg-n>"]
Within this, the code terms mean the following:
° <arg-1>, ..., <arg-n>: These are the variable number (zero or more) of arguments for the ENTRYPOINT instruction.
In the case of multiple CMD instructions, only the last CMD instruction would be effective.
Let's look at an example of overriding CMD:
$ sudo docker build -t cmd-demo .
$ sudo docker run cmd-demo echo Override CMD demo
Override CMD demo
The output of the echo command in the Dockerfile does not appear, because the docker run arguments replaced it.
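The Dockerfile behind the cmd-demo image is not shown above. A sketch of what it might contain, with both echo messages being assumptions, also illustrates that only the last CMD is effective:

```dockerfile
FROM busybox:latest

# Ignored: a later CMD instruction replaces this one
CMD echo First CMD

# Effective default; arguments given to docker run replace it
CMD echo Hello from the Dockerfile
```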
The ENTRYPOINT instruction
The ENTRYPOINT instruction helps in crafting an image for running an application (entry point) during the complete life cycle of the container that is spun out of the image. ENTRYPOINT is similar to CMD, but there are some differences. Let's compare the two with an example:
CMD:
Dockerfile:
FROM busybox:latest
CMD ["echo", "Hello"]
Then we build the image with the docker build subcommand. Let's see how CMD behaves:
$ sudo docker build -t busybox:test1 .
$ sudo docker run busybox:test1
Hello
$ sudo docker run busybox:test1 echo Hi
Hi
The docker run arguments replace the CMD from the Dockerfile, so Hi is printed instead of Hello.
ENTRYPOINT:
Dockerfile:
FROM busybox:latest
ENTRYPOINT ["echo", "Hello"]
Then we build the image with the docker build subcommand. Let's see how ENTRYPOINT behaves:
$ sudo docker build -t busybox:test2 .
$ sudo docker run busybox:test2
Hello
$ sudo docker run busybox:test2 echo Hi
Hello echo Hi
The difference is now visible. The docker run arguments replace CMD, but they do not replace ENTRYPOINT. Instead, they are appended to the ENTRYPOINT command, so its own string is printed before the docker run arguments.
Syntactically, the ENTRYPOINT instruction is very similar to the RUN and CMD instructions, and it has two types of syntax, as shown here:
The first type of syntax is the shell type, as shown here:
ENTRYPOINT <command>
The second type of syntax is exec or the JSON array, as shown here:
ENTRYPOINT ["<exec>", "<arg-1>", ..., "<arg-n>"]
Syntactically, you can have more than one ENTRYPOINT instruction in a Dockerfile. However, the build system will ignore all the ENTRYPOINT instructions except the last one.
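The third CMD syntax type, which supplies default parameters to ENTRYPOINT, can be sketched as follows. This is an illustrative combination, not an example from the original text.

```dockerfile
FROM busybox:latest

# Fixed part: always executed when the container starts
ENTRYPOINT ["echo"]

# Default argument for the ENTRYPOINT; docker run arguments replace it
CMD ["Hello"]
```

With an image built from this file, docker run without arguments prints Hello, whereas docker run with the argument Hi prints Hi, because only the CMD part is replaced.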
The ONBUILD instruction
The ONBUILD instruction registers a build instruction to an image and this is triggered when another image is built by using this image as its base image. Therefore, the ONBUILD instruction can be used to defer the execution of the build instruction from the base image to the target image.
The syntax of the ONBUILD instruction is as follows:
ONBUILD <INSTRUCTION>
Note that the FROM and MAINTAINER instructions are not allowed as ONBUILD triggers. Let's look at an example of ONBUILD.
Dockerfile1:
FROM busybox:latest
ONBUILD CMD echo Hello!
Then, we use docker build to create the image:
$ sudo docker build -t busybox:test1 .
$ sudo docker run busybox:test1
Nothing is printed, because the ONBUILD trigger is only registered when this image is built; it is not executed.
Dockerfile2:
FROM busybox:test1
ONBUILD CMD echo Hello!
Building from busybox:test1 fires the registered trigger, so CMD echo Hello! takes effect in the new image:
$ sudo docker build -t busybox:test2 .
$ sudo docker run busybox:test2
Hello!
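A common use of deferred execution is to let a base image copy files that only exist in the child image's build context. The sketch below assumes a base image tagged webbase (a hypothetical name) and an html directory in the child's build context.

```dockerfile
# Base image Dockerfile, built as webbase (a hypothetical name)
FROM ubuntu:14.04
RUN apt-get update && \
    apt-get install -y apache2 && \
    apt-get clean

# Registered now, executed only when a child image is built
ONBUILD COPY html /var/www/html
```

```dockerfile
# Child image Dockerfile: the ONBUILD COPY trigger runs right after FROM,
# so the html directory must exist in the child's build context
FROM webbase
```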
The .dockerignore file
The .dockerignore file is a newline-separated text file in which you list the files and directories that are to be excluded from the build process.
The following snippet is a sample .dockerignore file through which the build system has been instructed to exclude the .git directory and all the files that have the .tmp extension:
.git
*.tmp
A brief overview of the Docker image management
Docker images are built in layers, that is, images can be built on top of other images. The original image is called the parent image and the one that is generated is called the child image. Each time you commit to a Docker image, you create a new layer on it, so each change that is made to the original image is stored as a separate layer.
The docker history subcommand is an excellent and handy tool for visualizing the image layers.
$ sudo docker history <image name>[:<tag name>]
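To see how instructions map to history entries, consider the following sketch. The html directory, the port number, and the apache2ctl start command are assumptions added for illustration.

```dockerfile
# Each instruction after FROM adds its own entry to docker history,
# on top of the entries inherited from the base image
FROM ubuntu:14.04

# One RUN instruction with three chained commands yields a single entry
RUN apt-get update && \
    apt-get install -y apache2 && \
    apt-get clean
COPY html /var/www/html
EXPOSE 80
CMD ["apache2ctl", "-D", "FOREGROUND"]
```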
Summary
Building Docker images is a critical aspect of Docker technology for streamlining the arduous task of containerization. A Dockerfile is the most prominent way of producing Docker images in a repeatable manner. We have covered the build instructions, their syntax, and their usage so that the image-building process becomes simpler for you.
Please credit the source when reposting. Thank you!