Docker swarm mode
- initializing a cluster of Docker Engines in swarm mode
- adding nodes to the swarm
- deploying application services to the swarm
- managing the swarm once you have everything running
What we need:
- three networked host machines
  - manager1
  - worker1
  - worker2
- Docker Engine 1.12 or later installed
- the IP address of the manager machine, which you can find with one of:
  - `ifconfig`
  - `docker-machine ip MACHINE-NAME`
  - `docker-machine ls`
- open ports between the hosts
  - TCP port 2377
  - TCP and UDP port 7946
  - TCP and UDP port 4789
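As a rough illustration of the port requirement, the sketch below only prints firewall commands for the ports swarm mode needs (2377, 7946, 4789); it executes nothing. The function names `swarm_ports` and `print_ufw_rules` are hypothetical, and the use of `ufw` is an assumption about the hosts; substitute whatever firewall tool you actually run.

```shell
# Protocol/port pairs used by swarm mode, one pair per line.
swarm_ports() {
  printf '%s\n' "tcp 2377" "tcp 7946" "udp 7946" "tcp 4789" "udp 4789"
}

# Print (do not run) one ufw command per required port.
# Assumption: the hosts use ufw; adapt the echo line for other firewalls.
print_ufw_rules() {
  swarm_ports | while read -r proto port; do
    echo "ufw allow ${port}/${proto}"
  done
}
```

Running `print_ufw_rules` lists the commands so you can review them before applying them with root privileges on each host.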
Create a swarm
- ssh into the machine where you want to run your manager node.
- Run the following command to create a swarm: `docker swarm init --advertise-addr MANAGER-IP`
- Run `docker info` to view the current state of the swarm.
- Run `docker node ls` to view information about the nodes.
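The steps for creating a swarm can be collected into two small shell functions, to be run on the manager host. This is a minimal sketch: the names `init_swarm` and `show_swarm` are hypothetical, and the IP address in the usage note is only an example value.

```shell
# Initialise swarm mode on the current host.
# $1: the manager's IP address (MANAGER-IP in the text).
init_swarm() {
  docker swarm init --advertise-addr "$1"
}

# Show the swarm state and the node list.
show_swarm() {
  docker info
  docker node ls
}
```

For instance, `init_swarm 192.168.99.100` followed by `show_swarm` should report the host as the only node in the new swarm.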
Add nodes to the swarm
- ssh into the machine that you want to add to the swarm.
- Run `docker swarm join --token TOKEN MANAGER-IP:2377` to create a worker node joined to the existing swarm. TOKEN is the worker join token printed by `docker swarm init`.
- Repeat the two steps above to add another node.
- ssh into the manager node and run `docker node ls` to see the worker nodes.
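Joining a worker takes a token from the manager and one command on the worker. The sketch below wraps both; `worker_token` and `join_swarm` are hypothetical names, and port 2377 is the management port from the prerequisites.

```shell
# Run on the manager: print only the join token for worker nodes.
worker_token() {
  docker swarm join-token -q worker
}

# Run on the worker: join the swarm.
# $1: worker join token, $2: the manager's IP address.
join_swarm() {
  docker swarm join --token "$1" "$2:2377"
}
```

A typical flow is to copy the output of `worker_token` from the manager, then call `join_swarm TOKEN MANAGER-IP` on worker1 and again on worker2.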
Deploy a service to the swarm
- ssh into your manager node machine.
- Run the following command: `docker service create --replicas 1 --name helloworld alpine ping docker.com`
  - The `docker service create` command creates the service.
  - The `--name` flag names the service, here helloworld.
  - The `--replicas` flag specifies the desired state, here 1 running instance.
  - The arguments `alpine ping docker.com` define the service as an Alpine Linux container that executes the command `ping docker.com`.
- Run `docker service ls` to see the list of running services.
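The deployment step can be sketched as a single function to run on the manager. The function name `deploy_helloworld` is hypothetical; the service name, replica count, and image are the ones described in this section.

```shell
# Create the helloworld service with one replica, then list services.
deploy_helloworld() {
  docker service create --replicas 1 --name helloworld alpine ping docker.com
  docker service ls
}
```

After `deploy_helloworld`, the list printed by `docker service ls` should include helloworld.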
Inspect a service on the swarm
- ssh to your manager node.
- Run `docker service inspect --pretty SERVICE-ID` to display the details about a service in an easily readable format.
- Run `docker service ps SERVICE-ID` to see which nodes are running the service.
- Run `docker ps` on the node where the task is running to see details about the container for the task.
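The two manager-side inspection commands are often run together, so a small wrapper can help. This is only a sketch; `inspect_service` is a hypothetical name.

```shell
# Show a readable summary of a service and where its tasks run.
# $1: service ID or name (SERVICE-ID in the text).
inspect_service() {
  docker service inspect --pretty "$1"
  docker service ps "$1"
}
```

For example, `inspect_service helloworld` on the manager, followed by `docker ps` on whichever node the task list names.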
Scale the service in the swarm
- ssh into your manager node.
- Run `docker service scale SERVICE-ID=NUMBER-OF-TASKS` to change the desired state of the service running in the swarm, for example `docker service scale helloworld=5`.
- Run `docker service ps SERVICE-ID` to see the updated task list.
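Scaling and then checking the task list can be sketched as one function. The name `scale_service` is hypothetical, and the check that the replica count is a non-negative integer is an addition for safety, not something Docker requires you to do yourself.

```shell
# Scale a service and show its updated tasks.
# $1: service ID or name, $2: desired number of tasks.
scale_service() {
  case "$2" in
    ''|*[!0-9]*)
      echo "replica count must be a non-negative integer" >&2
      return 1
      ;;
  esac
  docker service scale "$1=$2"
  docker service ps "$1"
}
```

For example, `scale_service helloworld 5` asks the swarm for five tasks and then lists them.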
Delete the service running on the swarm
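- ssh into your manager node.
- Run `docker service rm SERVICE-ID`, for example `docker service rm helloworld`.
- Run `docker service inspect SERVICE-ID` to verify that the swarm manager removed the service. The CLI reports an error saying the service is not found.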
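Removal and verification can be combined so that a leftover service is flagged. In this sketch the name `remove_service` is hypothetical, and it relies on `docker service inspect` exiting with a non-zero status when the service no longer exists.

```shell
# Remove a service and confirm that it is gone.
# $1: service ID or name.
remove_service() {
  docker service rm "$1" || return 1
  # inspect succeeds only if the service still exists
  if docker service inspect "$1" >/dev/null 2>&1; then
    echo "service $1 still exists" >&2
    return 1
  fi
  echo "service $1 removed"
}
```

For example, `remove_service helloworld` prints a confirmation only after the follow-up inspect fails.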
Apply rolling updates to a service
The following steps upgrade a service based on Redis 3.0.6 to the Redis 3.0.7 container image using rolling updates.
- ssh into your manager node.
- Deploy Redis 3.0.6 to the swarm and configure the swarm with a 10 second update delay: `docker service create --replicas 3 --name redis --update-delay 10s redis:3.0.6`
  You configure the rolling update policy at service deployment time.
  The `--update-delay` flag configures the time delay between updates to a service task or sets of tasks. You can describe the time T as a combination of the number of seconds Ts, minutes Tm, or hours Th. So 10m30s indicates a 10 minute 30 second delay.
  By default the scheduler updates 1 task at a time. You can pass the `--update-parallelism` flag to configure the maximum number of service tasks that the scheduler updates simultaneously.
  By default, when an update to an individual task returns a state of RUNNING, the scheduler schedules another task to update until all tasks are updated. If, at any time during an update, a task returns FAILED, the scheduler pauses the update. You can control the behavior using the `--update-failure-action` flag for `docker service create` or `docker service update`.
- Inspect the redis service: `docker service inspect --pretty redis`
- Update the container image for redis: `docker service update --image redis:3.0.7 redis`. The swarm manager applies the update to nodes according to the UpdateConfig policy.
  The scheduler applies rolling updates as follows by default:
  - Stop the first task.
  - Schedule update for the stopped task.
  - Start the container for the updated task.
  - If the update to a task returns RUNNING, wait for the specified delay period then stop the next task.
  - If, at any time during the update, a task returns FAILED, pause the update.
- Run `docker service inspect --pretty redis` to see the new image in the desired state. The output of service inspect also shows whether your update paused due to failure.
  To restart a paused update, run `docker service update SERVICE-ID`, for example `docker service update redis`.
  To avoid repeating certain update failures, you may need to reconfigure the service by passing flags to `docker service update`.
- Run `docker service ps SERVICE-ID` to watch the rolling update. Before Swarm updates all of the tasks, you can see that some are running redis:3.0.6 while others are running redis:3.0.7. Once the rolling update is done, every task runs redis:3.0.7.
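The rolling update walkthrough can be sketched as two functions, one for the initial deployment and one for the upgrade. The names `create_redis` and `roll_redis` are hypothetical; the replica count of 3 and the 10 second delay are the values used in these notes.

```shell
# Deploy Redis 3.0.6 with three replicas and a 10 second update delay.
create_redis() {
  docker service create --replicas 3 --name redis --update-delay 10s redis:3.0.6
}

# Switch the service to Redis 3.0.7, then show task progress.
roll_redis() {
  docker service update --image redis:3.0.7 redis
  docker service ps redis
}
```

Calling `create_redis` once and `roll_redis` later upgrades one task at a time by default; rerunning `docker service ps redis` shows the mix of 3.0.6 and 3.0.7 tasks while the update is in progress.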
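Drain a node on the swarm (make a node in the swarm unavailable)
Setting a node's availability to DRAIN prevents it from receiving new tasks from the swarm manager. The manager also stops the tasks running on that node and launches replicas of the stopped tasks on a node whose availability is ACTIVE.
- ssh into your manager node.
- Run `docker node ls` to confirm that all nodes have `active` availability.
- Create the `redis` service: `docker service create --replicas 3 --name redis --update-delay 10s redis:3.0.6`
- Run `docker service ps redis` to confirm that the manager assigned the tasks to different nodes.
- Run `docker node update --availability drain NODE-ID` to drain a node that has a task assigned to it.
- Run `docker node inspect --pretty NODE-ID` to check the node's Availability.
- Run `docker service ps redis` to see whether the manager reassigned the replicas of the redis service. To keep the desired state (as I understand it, the `--replicas 3` specified when the service was created), the manager stops the task on the drained node and creates a replacement task on another node whose availability is active.
- Set the drained node back to active: `docker node update --availability active NODE-ID`
- Check the updated state of the node: `docker node inspect --pretty NODE-ID`
Once the node is set back to active, it can receive new tasks again:
- after a service is scaled
- during a rolling update
- when another node is set to drain and its tasks need replicas elsewhere
- when a task fails on another active node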
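Draining a node and reactivating it use the same command with a different availability value, so a single wrapper covers both. In this sketch `set_availability` is a hypothetical name; the accepted values are the ones `docker node update --availability` supports (active, pause, drain).

```shell
# Change a node's availability and show the result.
# $1: node ID or hostname, $2: active, pause, or drain.
set_availability() {
  case "$2" in
    active|pause|drain) ;;
    *)
      echo "availability must be active, pause, or drain" >&2
      return 1
      ;;
  esac
  docker node update --availability "$2" "$1"
  docker node inspect --pretty "$1"
}
```

For example, `set_availability worker1 drain` moves the tasks off worker1, and `set_availability worker1 active` lets it receive tasks again.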