Hello World

docker swarm mode

  • initializing a cluster of Docker Engines in swarm mode
  • adding nodes to swarm
  • deploying application services to the swarm
  • managing the swarm once you have everything running

what we need:

  • three networked host machines

    • manager1
    • worker1
    • worker2
  • docker engine 1.12 or later installed

  • the IP address of the manager machine

    • ifconfig
    • docker-machine ip MACHINE-NAME
    • docker-machine ls
  • open ports between the hosts

    • TCP port 2377 for cluster management communications
    • TCP and UDP port 7946 for communication among nodes
    • UDP port 4789 for overlay network traffic
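If the hosts run a firewall, these ports must be opened explicitly. A minimal sketch, assuming a Linux host with ufw available; the commands are only echoed (a dry run), so remove the echo to apply them for real:

```shell
# Swarm traffic uses: 2377/tcp (cluster management), 7946/tcp and
# 7946/udp (node-to-node communication), 4789/udp (overlay networking).
SWARM_PORTS="2377/tcp 7946/tcp 7946/udp 4789/udp"
for port in $SWARM_PORTS; do
  echo "ufw allow $port"   # dry run: prints one ufw command per port
done
```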

Create a swarm

  1. ssh into the machine where you want to run your manager node.
  2. Run the following command to create a swarm:

    docker swarm init --advertise-addr MANAGER-IP

    eg:

    $ docker swarm init --advertise-addr 192.168.99.100
    Swarm initialized: current node (dxn1zf6l61qsb1josjja83ngz) is now a manager.
     
    To add a worker to this swarm, run the following command:
     
    docker swarm join \
    --token SWMTKN-1-49nj1cmql0jkz5s954yi3oex3nedyz0fb0xx14ie39trti4wxv-8vxv8rssmk743ojnwacrr2e7c \
    192.168.99.100:2377
     
    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
  3. Run docker info to view the current state of the swarm; the output may look like:

    $ docker info
    Containers: 2
    Running: 0
    Paused: 0
    Stopped: 2
    ...snip...
    Swarm: active
    NodeID: dxn1zf6l61qsb1josjja83ngz
    Is Manager: true
    Managers: 1
    Nodes: 1
    ...snip...
  4. Run the docker node ls command to view information about nodes; the output may look like:

    $ docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    dxn1zf6l61qsb1josjja83ngz *  manager1  Ready   Active        Leader

Add nodes to swarm

  1. ssh into the machine you want to add to the swarm.
  2. Run the following command to create a worker node joined to the existing swarm:

    docker swarm join \
    --token SWMTKN-1-49nj1cmql0jkz5s954yi3oex3nedyz0fb0xx14ie39trti4wxv-8vxv8rssmk743ojnwacrr2e7c \
    192.168.99.100:2377
  3. Repeat steps 1 and 2 to add another node.

  4. ssh into the manager node and run the docker node ls command to see the worker nodes.
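The token does not have to be copied by hand: on the manager, docker swarm join-token -q worker prints only the worker token, so the join command can be assembled in a script. A sketch with a hypothetical helper function and a placeholder token (nothing here talks to a real swarm):

```shell
# Build the command a new worker must run, from the manager IP and the
# token obtained on the manager via: docker swarm join-token -q worker
join_cmd() {
  manager_ip=$1
  token=$2
  echo "docker swarm join --token $token $manager_ip:2377"
}

# Dry run with a placeholder token (SWMTKN-EXAMPLE is not a real token)
join_cmd 192.168.99.100 SWMTKN-EXAMPLE
```

On a real manager you would capture the token with TOKEN=$(docker swarm join-token -q worker) and run the printed command on each worker.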

Deploy a service to the swarm

  1. ssh into your manager node machine.
  2. Run the following command:

    $ docker service create --replicas 1 --name helloworld  alpine ping docker.com
    • The docker service create command creates the service.
    • The --name flag names the service, here helloworld.
    • The --replicas flag specifies the desired state of 1 running instance.
    • The arguments alpine ping docker.com define the service as an Alpine Linux container that executes the command ping docker.com.
  3. Run docker service ls to see the list of running services:

    $ docker service ls

Inspect a service on the swarm

  1. ssh to your manager node.
  2. Run docker service inspect --pretty SERVICE-ID to display the details about a service in an easily readable format.
  3. Run docker service ps SERVICE-ID to see which nodes are running the service:
  4. Run docker ps on the node where the task is running to see details about the container for the task.

Scale the service in the swarm

  1. ssh into your manager node.
  2. Run the following command to change the desired state of the service running in the swarm:

    $ docker service scale SERVICE-ID=NUMBER-OF-TASKS

    eg:

    $ docker service scale helloworld=5
    helloworld scaled to 5
  3. Run docker service ps SERVICE-ID to see the updated task list.
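To check just the replica count after scaling, docker service ls accepts --filter and a Go-template --format. A dry-run sketch (the docker command is echoed rather than executed, so it runs without a swarm):

```shell
# Print the command that reports only the replica column (e.g. 5/5)
# for one service; remove the echo to run it on a live manager node.
SERVICE=helloworld
echo "docker service ls --filter name=$SERVICE --format '{{.Replicas}}'"
```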

Delete the service running on the swarm

  1. ssh into your manager node.
  2. Run docker service rm SERVICE-ID, eg:

    $ docker service rm helloworld
  3. Run docker service inspect SERVICE-ID to verify that the swarm manager removed the service.

    $ docker service inspect helloworld
    []
    Error: no such service: helloworld

Apply rolling updates to a service

The following steps upgrade a service based on the Redis 3.0.6 container image to the Redis 3.0.7 image using rolling updates.

  1. ssh into your manager node.
  2. Deploy Redis 3.0.6 to the swarm and configure a 10-second update delay:

    $ docker service create \
    --replicas 3 \
    --name redis \
    --update-delay 10s \
    redis:3.0.6

    You configure the rolling update policy at service deployment time.
    The --update-delay flag configures the time delay between updates to a service task or sets of tasks. You can describe the time T as a combination of the number of seconds Ts, minutes Tm, or hours Th. So 10m30s indicates a 10 minute 30 second delay.
    By default the scheduler updates 1 task at a time. You can pass the --update-parallelism flag to configure the maximum number of service tasks that the scheduler updates simultaneously.
    By default, when an update to an individual task returns a state of RUNNING, the scheduler schedules another task to update until all tasks are updated. If, at any time during an update a task returns FAILED, the scheduler pauses the update. You can control the behavior using the --update-failure-action flag for docker service create or docker service update.
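The three update flags can be combined at deployment time. A sketch that only composes and prints the command (a dry run), using --update-parallelism 2 and the default failure action pause as illustrative choices:

```shell
# Compose a create command with an explicit rolling-update policy:
# 2 tasks per batch, 10s between batches, pause the update on failure.
cmd="docker service create \
  --replicas 3 \
  --name redis \
  --update-delay 10s \
  --update-parallelism 2 \
  --update-failure-action pause \
  redis:3.0.6"
echo "$cmd"   # dry run: print the command instead of executing it
```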

  3. Inspect the redis service:

    $ docker service inspect --pretty redis
    ID:             0u6a4s31ybk7yw2wyvtikmu50
    Name:           redis
    Mode:           Replicated
    Replicas:      3
    Placement:
    Strategy:        Spread
    UpdateConfig:
    Parallelism:   1
    Delay:         10s
    ContainerSpec:
    Image:         redis:3.0.6
    Resources:
  4. Update the container image for redis. The swarm manager applies the update to nodes according to the UpdateConfig policy:

    $ docker service update --image redis:3.0.7 redis

    The scheduler applies rolling updates as follows by default:

    • Stop the first task.
    • Schedule update for the stopped task.
    • Start the container for the updated task.
    • If the update to a task returns RUNNING, wait for the specified delay period then stop the next task.
    • If, at any time during the update, a task returns FAILED, pause the update.
  5. Run docker service inspect --pretty redis to see the new image in the desired state:

    $ docker service inspect --pretty redis
    ID:             0u6a4s31ybk7yw2wyvtikmu50
    Name:           redis
    Mode:           Replicated
    Replicas:      3
    Placement:
    Strategy:        Spread
    UpdateConfig:
    Parallelism:   1
    Delay:         10s
    ContainerSpec:
    Image:         redis:3.0.7
    Resources:

    The output of service inspect shows if your update paused due to failure:

    $ docker service inspect --pretty redis
    ID:             0u6a4s31ybk7yw2wyvtikmu50
    Name:           redis
    ...snip...
    Update status:
    State:      paused
    Started:    11 seconds ago
    Message:    update paused due to failure or early termination of task 9p7ith557h8ndf0ui9s0q951b
    ...snip...

    To restart a paused update run docker service update SERVICE-ID. For example:

    docker service update redis

    To avoid repeating certain update failures, you may need to reconfigure the service by passing flags to docker service update.

  6. Run docker service ps SERVICE-ID to watch the rolling update:

    $ docker service ps redis
    ID                         NAME         IMAGE        NODE       DESIRED STATE  CURRENT STATE            ERROR
    dos1zffgeofhagnve8w864fco  redis.1      redis:3.0.7  worker1    Running        Running 37 seconds
    88rdo6pa52ki8oqx6dogf04fh   \_ redis.1  redis:3.0.6  worker2    Shutdown       Shutdown 56 seconds ago
    9l3i4j85517skba5o7tn5m8g0  redis.2      redis:3.0.7  worker2    Running        Running About a minute
    66k185wilg8ele7ntu8f6nj6i   \_ redis.2  redis:3.0.6  worker1    Shutdown       Shutdown 2 minutes ago
    egiuiqpzrdbxks3wxgn8qib1g  redis.3      redis:3.0.7  worker1    Running        Running 48 seconds
    ctzktfddb2tepkr45qcmqln04   \_ redis.3  redis:3.0.6  manager1   Shutdown       Shutdown 2 minutes ago

    Before Swarm updates all of the tasks, you can see that some are running redis:3.0.6 while others are running redis:3.0.7. The output above shows the state once the rolling updates are done.

Drain a node on the swarm

Setting a node's availability to DRAIN prevents it from receiving new tasks from the swarm manager. The manager also stops the tasks running on that node and launches replica tasks on a node with ACTIVE availability.

  1. ssh into your manager node.
  2. Verify that all your nodes are actively available.

    $ docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    1bcef6utixb0l0ca7gxuivsj0    worker2   Ready   Active
    38ciaotwjuritcdtn9npbnkuz    worker1   Ready   Active
    e216jshn25ckzbvmwlnh5jr3g *  manager1  Ready   Active        Leader
  3. Create the redis service:

    $ docker service create --replicas 3 --name redis --update-delay 10s redis:3.0.6
  4. Run docker service ps redis to confirm that the manager assigned the tasks to different nodes.

  5. Run docker node update --availability drain NODE-ID to drain a node that has a task assigned to it:

    docker node update --availability drain worker1
  6. Run docker node inspect --pretty NODE-ID to check the node's availability.

  7. Run docker service ps redis to see how the manager reassigned the tasks of the redis service:

    $ docker service ps redis
    ID                         NAME          IMAGE        NODE      DESIRED STATE  CURRENT STATE           ERROR
    7q92v0nr1hcgts2amcjyqg3pq  redis.1       redis:3.0.6  manager1  Running        Running 4 minutes
    b4hovzed7id8irg1to42egue8  redis.2       redis:3.0.6  worker2   Running        Running About a minute
    7h2l8h3q3wqy5f66hlv9ddmi6   \_ redis.2   redis:3.0.6  worker1   Shutdown       Shutdown 2 minutes ago
    9bg7cezvedmkgg6c8yzvbhwsd  redis.3       redis:3.0.6  worker2   Running        Running 4 minutes

    To maintain the desired state (as I understand it, the --replicas 3 state specified when the service was created), the manager shuts down the task on the node whose availability was set to drain and then starts a replacement task on another node whose availability is active.

  8. Set the node whose availability was set to drain back to active:

    $ docker node update --availability active worker1
  9. Check the updated state of the node:

    $ docker node inspect --pretty worker1
    ID:            38ciaotwjuritcdtn9npbnkuz
    Hostname:        worker1
    Status:
    State:            Ready
    Availability:        Active
    ...snip...

    When the node is set back to active availability, it can receive new tasks again:

    • during a service update to scale up
    • during a rolling update
    • when another node is set to drain availability (to take over replicas of its tasks)
    • when a task fails on another active node
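The drain/active pair above is the usual pattern for planned node maintenance and can be scripted per node. A dry-run sketch over hypothetical node names (the docker commands are only printed, not executed):

```shell
# Print the maintenance sequence for one node: drain it, do the work,
# then set it back to active so it can receive tasks again.
maintain() {
  node=$1
  echo "docker node update --availability drain $node"
  echo "# ... perform maintenance on $node ..."
  echo "docker node update --availability active $node"
}

for node in worker1 worker2; do
  maintain "$node"
done
```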