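We all know that Docker has kernel requirements, and that newer kernels are better supported. Why is that?
Because Docker relies on two important kernel features: namespaces and cgroups.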

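A Docker container is essentially just another process on the system. Before Docker existed, chroot could
achieve something similar; anyone who has installed Arch Linux will remember this (installing other Linux
distributions works much the same way). But when two systems run on one host at the same time without proper
isolation, they interfere with each other. Virtual machines, after all, are strongly isolated from one another.
Namespaces are the isolation mechanism implemented by the kernel.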
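To make the idea of namespaces concrete, here is a minimal Python sketch that lists the namespaces a process belongs to by reading the symbolic links under /proc/<pid>/ns. It assumes a Linux host with procfs mounted; the function names (parse_ns_link, list_namespaces) are made up for this example. Two processes that show the same inode number for a given type share that namespace, while a containerized process would normally show different numbers from the host.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "pid:[4026531836]".
NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a link target such as 'pid:[4026531836]' into (type, inode)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def list_namespaces(pid="self", proc_root="/proc"):
    """Return {entry name: namespace inode} for one process.

    Returns an empty dict when the ns directory is missing or unreadable
    (for example on a non-Linux system).
    """
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result
    for name in names:
        try:
            target = os.readlink(os.path.join(ns_dir, name))
            result[name] = parse_ns_link(target)[1]
        except (OSError, ValueError):
            # Skip entries we cannot read or do not understand.
            continue
    return result


if __name__ == "__main__":
    for name, inode in list_namespaces().items():
        print(f"{name:20s} {inode}")
```

Running the script once on the host and once inside a container, then comparing the numbers, is a quick way to see which namespaces the container does not share with the host.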

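cgroups (Control Groups) were originally called Process Containers and were proposed by Google engineers
(Paul Menage and Rohit Seth) in 2006. Because "container" has several meanings and was easily misunderstood,
the feature was renamed Control Groups in 2007 and merged into the Linux kernel. As the name suggests, the
idea is to put processes into a group and control them as a unit.
The official definition is in the kernel documentation: https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt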

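The namespaces described above provide isolation. But without limits on CPU, memory, and so on, the multiple systems on a host would compete for the same limited resources, which is why cgroups came into being.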
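As a small illustration of how a process is tied to its control groups, the sketch below parses the contents of /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:cgroup-path. The function name parse_cgroup_text and the sample paths are invented for this example; only the line format comes from the kernel.

```python
def parse_cgroup_text(text):
    """Parse the contents of /proc/<pid>/cgroup.

    Each non-empty line is 'hierarchy-ID:controller-list:cgroup-path'.
    On cgroup v2 the controller list is empty, e.g. '0::/user.slice'.
    Returns a list of (hierarchy_id, [controllers], path) tuples.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice: the path itself may contain ':'.
        hierarchy_id, controllers, path = line.split(":", 2)
        names = [c for c in controllers.split(",") if c]
        entries.append((int(hierarchy_id), names, path))
    return entries


if __name__ == "__main__":
    try:
        with open("/proc/self/cgroup") as handle:
            for entry in parse_cgroup_text(handle.read()):
                print(entry)
    except OSError:
        print("no /proc/self/cgroup on this system")
```

For a process inside a Docker container, the path column on the host typically contains the container ID, which is one way to find the group whose CPU and memory limits apply to it.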

The following articles cover this in more detail:
http://www.infoq.com/cn/articles/docker-kernel-knowledge-namespace-resource-isolation
https://coolshell.cn/articles/17061.html

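These two kernel features are all Docker needs in order to run. To make packaging and deployment convenient,
however, Docker also uses a layered (union) filesystem called aufs, which supports copy-on-write. When
docker run creates a container from an image, a writable layer, the container layer, is added on top of
the image. Every change made to the running container is actually a change to this read-write container layer.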
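The layering idea can be sketched in a few lines of Python. This is a toy model, not how aufs or any other storage driver is implemented: layers are plain dictionaries mapping paths to contents, and the class name LayeredFS and the WHITEOUT marker are made up for the example. It shows that reads fall through from the container layer to the image layers, while writes and deletions only ever touch the container layer.

```python
# Marker stored in the container layer to hide a file from lower layers.
WHITEOUT = object()


class LayeredFS:
    """Toy copy-on-write view: read-only image layers plus one writable layer."""

    def __init__(self, image_layers):
        # image_layers: list of dicts, lowest layer first. Never modified here.
        self.image_layers = list(image_layers)
        self.container_layer = {}

    def _lookup(self, path):
        # Search from the top (container layer) down to the base layer.
        for layer in [self.container_layer] + self.image_layers[::-1]:
            if path in layer:
                return layer[path]
        return WHITEOUT

    def exists(self, path):
        return self._lookup(path) is not WHITEOUT

    def read(self, path):
        content = self._lookup(path)
        if content is WHITEOUT:
            raise FileNotFoundError(path)
        return content

    def write(self, path, content):
        # The change lands in the container layer; image layers stay intact.
        self.container_layer[path] = content

    def delete(self, path):
        if not self.exists(path):
            raise FileNotFoundError(path)
        # Record a whiteout instead of touching the image layers.
        self.container_layer[path] = WHITEOUT


if __name__ == "__main__":
    base = {"/etc/os-release": "ubuntu"}
    app = {"/app/run.sh": "echo v1"}
    first, second = LayeredFS([base, app]), LayeredFS([base, app])
    first.write("/app/run.sh", "echo v2")
    print(first.read("/app/run.sh"), "|", second.read("/app/run.sh"))
```

Because the two containers in the example share the same image layers and differ only in their own container layer, starting a second container costs almost nothing in storage, which is the practical benefit of copy-on-write.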

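The earliest versions of Docker supported only Ubuntu, because aufs was only supported on Ubuntu. Later, Docker added support for other storage drivers,
including devicemapper, overlay2, zfs, and vfs.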

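Docker Architecture
Before Docker 1.11, Docker shipped as a single binary, with the daemon and the CLI together. To avoid one vendor
dominating the ecosystem, the Open Container Initiative (OCI) was founded; the Docker binary was split up and
container specifications were defined. It was probably around this time that CoreOS's rkt, an alternative
container runtime implementation, also appeared.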

[Figure: 1.jpg]

[root@k8s-master ~]# docker-containerd -h
NAME:
   containerd -
                    __        _                     __
  _________  ____  / /_____ _(_)___  ___  _________/ /
 / ___/ __ \/ __ \/ __/ __ `/ / __ \/ _ \/ ___/ __  /
/ /__/ /_/ / / / / /_/ /_/ / / / / /  __/ /  / /_/ /
\___/\____/_/ /_/\__/\__,_/_/_/ /_/\___/_/   \__,_/

high performance container runtime

USAGE:
   docker-containerd [global options] command [command options] [arguments...]

VERSION:
   v1.0.2

COMMANDS:
     config   information on the containerd config
     publish  binary to publish events to containerd
     help, h  Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --config value, -c value     path to the configuration file (default: "/etc/containerd/config.toml")
   --log-level value, -l value  set the logging level [debug, info, warn, error, fatal, panic]
   --address value, -a value    address for containerd's GRPC server
   --root value                 containerd root directory
   --state value                containerd state directory
   --help, -h                   show help
   --version, -v                print the version
[root@k8s-master ~]# docker-containerd-
docker-containerd-ctr docker-containerd-shim
[root@k8s-master ~]# docker-containerd-ctr -h
NAME:
   ctr -
        __
  _____/ /______
 / ___/ __/ ___/
/ /__/ /_/ /
\___/\__/_/

containerd CLI

USAGE:
   docker-containerd-ctr [global options] command [command options] [arguments...]

VERSION:
   v1.0.2

COMMANDS:
     plugins, plugin           provides information about containerd plugins
     version                   print the client and server versions
     containers, c, container  manage containers
     content                   manage content
     events, event             display containerd events
     images, image             manage images
     namespaces, namespace     manage namespaces
     pprof                     provide golang pprof outputs for containerd
     run                       run a container
     snapshots, snapshot       manage snapshots
     tasks, t, task            manage tasks
     shim                      interact with a shim directly
     help, h                   Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --debug                      enable debug output in logs
   --address value, -a value    address for containerd's GRPC server (default: "/run/containerd/containerd.sock")
   --timeout value              total timeout for ctr commands (default: 0s)
   --connect-timeout value      timeout for connecting to containerd (default: 0s)
   --namespace value, -n value  namespace to use with commands (default: "default") [$CONTAINERD_NAMESPACE]
   --help, -h                   show help
   --version, -v                print the version
[root@k8s-master ~]# docker-containerd
docker-containerd docker-containerd-ctr docker-containerd-shim
[root@k8s-master ~]# docker-containerd-shim -h
Usage of docker-containerd-shim:
  -address string
        grpc address back to main containerd
  -containerd-binary containerd publish
        path to containerd binary (used for containerd publish) (default "containerd")
  -criu string
        path to criu binary
  -debug
        enable debug output in logs
  -namespace string
        namespace that owns the shim
  -runtime-root string
        root directory for the runtime (default "/run/containerd/runc")
  -socket string
        abstract socket path to serve
  -systemd-cgroup
        set runtime to use systemd-cgroup
  -workdir string
        path used to storge large temporary data

docker run -d vsxen/k8s sleep 1d

pstree -l -a -A 20708
dockerd --debug
|-docker-containe --config /var/run/docker/containerd/containerd.toml
| |-docker-containe -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/3ba24bd7565ac01d5dc1b35ac1f67b3d150b77bf7358d017a282efaa38459aa9 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /root/local/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc -debug
| | |-sleep 1d
| | `-8*[{docker-containe}]
| `-10*[{docker-containe}]
`-10*[{dockerd}]

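Once the Docker daemon has started, the dockerd and docker-containerd processes stay running. When a container is started, the docker-containerd process (the containerd component described here) creates a docker-containerd-shim process. The shim's arguments identify the container to start; in the pstree output above, the container ID 3ba24bd7565ac01d5dc1b35ac1f67b3d150b77bf7358d017a282efaa38459aa9 is the last component of the -workdir path. Finally, the child of docker-containerd-shim is the process actually running inside the container (here, sleep 1d).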
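The shim command line from the pstree output can be taken apart with a short Python sketch. The parser below is a simplification written for this example (the function name parse_shim_args is hypothetical), and it assumes that a flag followed by a token not starting with "-" takes that token as its value. Treating the last component of -workdir as the container ID is an inference from the output shown here, not a documented interface. pstree truncates the process name to docker-containe; the sample uses the full name docker-containerd-shim.

```python
import os
import shlex

SHIM_COMMAND = (
    "docker-containerd-shim -namespace moby "
    "-workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/"
    "3ba24bd7565ac01d5dc1b35ac1f67b3d150b77bf7358d017a282efaa38459aa9 "
    "-address /var/run/docker/containerd/docker-containerd.sock "
    "-containerd-binary /root/local/bin/docker-containerd "
    "-runtime-root /var/run/docker/runtime-runc -debug"
)


def parse_shim_args(command_line):
    """Turn a shim command line into a {flag: value} dict.

    Flags without a following value (such as -debug) map to True.
    """
    tokens = shlex.split(command_line)
    flags = {}
    i = 1  # tokens[0] is the program name
    while i < len(tokens):
        name = tokens[i].lstrip("-")
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        if has_value:
            flags[name] = tokens[i + 1]
            i += 2
        else:
            flags[name] = True
            i += 1
    return flags


if __name__ == "__main__":
    flags = parse_shim_args(SHIM_COMMAND)
    # Assumption: the container ID is the last path component of -workdir.
    print("namespace:   ", flags["namespace"])
    print("container id:", os.path.basename(flags["workdir"]))
    print("runtime root:", flags["runtime-root"])
```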

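Another docker-containerd-shim argument is a container-specific directory. For a container with ID b9a04a582b66206492d29444b5b7bc6ec9cf1eb83eff580fe43a039ad556e223, it is /var/run/docker/libcontainerd/b9a04a582b66206492d29444b5b7bc6ec9cf1eb83eff580fe43a039ad556e223, and it contains: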

.
├── config.json
├── init-stderr
├── init-stdin
└── init-stdout

References
https://draveness.me/docker
https://segmentfault.com/a/1190000011294361
http://www.infoq.com/cn/news/2017/02/Docker-Containerd-RunC
https://blog.csdn.net/u013812710/article/details/79001463