Kubernetes Cluster Deployment

Environment

4 VMs: 1 master, 2 nodes, and 1 utility machine that provides the gateway and handles communication with the cluster

master: 192.168.19.101/24, hostname: cka-1

node1: 192.168.19.102/24, hostname: cka-2

node2: 192.168.19.103/24, hostname: cka-3

utility: 10.0.0.133/24 (NAT gateway, provides internet access); 192.168.19.254/24 (internal cluster communication)

The utility machine runs CentOS 7; the k8s cluster machines run Ubuntu 20.04.2 LTS

utility acts as a jump host for remote access into the k8s cluster and as its gateway; its two NICs let the cluster nodes talk to each other and also reach the internet

Preparation

Disable the swap partition

Log in to the master (cka-1)
[student@utility ~]$ ssh root@cka-1

root@cka-1:~# vim /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda2 during curtin installation
/dev/disk/by-uuid/aabac2dc-9464-4ed0-ae98-37700ee61cf8 / ext4 defaults 0 0
#/swap.img none swap sw 0 0 👈 comment out the swap line

Check the swap device information
root@cka-1:~# swapon -s
Filename Type Size Used Priority
/swap.img file 4001788 0 -2

Disable the swap partition
root@cka-1:~# swapoff /swap.img

Check again; the swap device is now gone
root@cka-1:~# swapon -s

> Same for node1 and node2
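The manual fstab edit above can be scripted so it is repeatable on every node. A minimal sketch (the `comment_out_swap` helper name is mine) that comments out any active swap entry in an fstab-style file; run it against /etc/fstab on each node, then `swapoff -a`:

```shell
#!/usr/bin/env bash
# Sketch: comment out every active swap entry in an fstab-style file.
comment_out_swap() {
  local fstab="$1"
  # Match non-comment lines that contain a "swap" type field.
  sed -ri 's|^([^#].*[[:space:]]swap[[:space:]])|#\1|' "$fstab"
}

# Demo on a scratch copy so the real /etc/fstab is untouched.
tmp=$(mktemp)
printf '%s\n' \
  'UUID=abcd / ext4 defaults 0 0' \
  '/swap.img none swap sw 0 0' > "$tmp"
comment_out_swap "$tmp"
grep -q '^#/swap.img' "$tmp" && echo "swap entry commented out"
rm -f "$tmp"
```

The sed pattern only touches lines that are not already comments, so running the script twice is harmless.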

IP forwarding

Configure the kernel parameters
root@cka-1:~# vim /etc/sysctl.d/docker.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

Load the kernel module
root@cka-1:~# modprobe br_netfilter

Apply the settings from the file
root@cka-1:~# sysctl -p /etc/sysctl.d/docker.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

> Same for node1 and node2
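Note that `modprobe br_netfilter` only lasts until the next reboot. A small sketch (the `persist_module` helper and its directory parameter are mine) that persists the module the way `systemd-modules-load` expects, so the sysctl bridge settings keep working after a restart:

```shell
#!/usr/bin/env bash
# Sketch: persist a kernel module via a modules-load.d drop-in so it is
# loaded again on every boot (systemd-modules-load reads this directory).
persist_module() {
  local module="$1" dir="${2:-/etc/modules-load.d}"
  printf '%s\n' "$module" > "${dir}/${module}.conf"
}

# Demo against a temp directory; on a real node call it as root without
# the second argument to write /etc/modules-load.d/br_netfilter.conf.
demo=$(mktemp -d)
persist_module br_netfilter "$demo"
cat "${demo}/br_netfilter.conf"
rm -rf "$demo"
```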

Install Docker on all nodes

[student@utility ~]$ ssh root@cka-1

root@cka-1:~# apt update
> Same for node1 and node2

On Ubuntu the docker package is named docker.io
root@cka-1:~# apt list | grep docker
...
docker.io/focal-updates 20.10.12-0ubuntu2~20.04.1 amd64

Install docker.io on all three cluster machines
root@cka-1:~# apt install -y docker.io
> Same for node1 and node2

Check the docker version
root@cka-1:~# docker --version
Docker version 20.10.12, build 20.10.12-0ubuntu2~20.04.1

Start the docker service and enable it at boot ('&&' chains the two commands; a single '&' would background the first one)
root@cka-1:~# systemctl start docker && systemctl enable docker
> Same for node1 and node2

Pin the docker version so automatic upgrades can't introduce an incompatible one
root@cka-1:~# apt-mark hold docker.io
docker.io set on hold.
> Same for node1 and node2

Extension: how to install a specific docker version

On Ubuntu, list the versions docker.io provides

root@cka-1:# apt-cache madison docker.io
docker.io | 20.10.12-0ubuntu2~20.04.1 | http://mirrors.aliyun.com/ubuntu focal-updates/universe amd64 Packages
docker.io | 20.10.7-0ubuntu5~20.04.2 | http://mirrors.aliyun.com/ubuntu focal-security/universe amd64 Packages

For example, to install version 20.10.7, just append the version to the package name:

root@cka-1:# apt install -y docker.io=20.10.7-0ubuntu5~20.04.2

If no version is specified, the latest available version is installed
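When scripting a pinned install, the version string can be pulled out of the `apt-cache madison` output shown above. A sketch of just the parsing step (the `madison_versions` function name is mine):

```shell
#!/usr/bin/env bash
# Sketch: extract the version column from `apt-cache madison` output,
# e.g. to feed into `apt install docker.io=<version>`.
madison_versions() {
  # madison prints "pkg | version | source"; take field 2, strip spaces.
  awk -F'|' '{gsub(/ /, "", $2); print $2}'
}

# Demo with captured output instead of a live apt cache; live use:
#   apt-cache madison docker.io | madison_versions
sample='docker.io | 20.10.12-0ubuntu2~20.04.1 | http://mirrors.aliyun.com/ubuntu focal-updates/universe amd64 Packages
docker.io | 20.10.7-0ubuntu5~20.04.2 | http://mirrors.aliyun.com/ubuntu focal-security/universe amd64 Packages'
printf '%s\n' "$sample" | madison_versions
```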

Configure the Docker registry mirror: Alibaba Cloud accelerator

# Public registry:
root@cka-1:~# vim /etc/docker/daemon.json
{
"registry-mirrors": ["https://i1pfdcu7.mirror.aliyuncs.com"]
}

Copy it to node1 and node2 as well
root@cka-1:~# scp /etc/docker/daemon.json root@cka-2:/etc/docker/
root@cka-1:~# scp /etc/docker/daemon.json root@cka-3:/etc/docker/

Restart the docker service
root@cka-1:~# systemctl daemon-reload
root@cka-1:~# systemctl restart docker
> Same for node1 and node2

Check that the registry mirror took effect
root@cka-1:~# docker info
...
Registry Mirrors:
https://i1pfdcu7.mirror.aliyuncs.com/ 👈
Live Restore Enabled: false

Pull an image (downloaded through the Alibaba Cloud CDN)
root@cka-1:~# docker pull nginx

root@cka-1:~# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nginx latest 605c77e624dd 10 months ago 141MB
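Instead of editing daemon.json by hand on the master and scp-ing it around, the file can be generated. A sketch (the `write_daemon_json` helper and its path parameter are mine) of the generation step:

```shell
#!/usr/bin/env bash
# Sketch: write a daemon.json containing the given registry mirror.
write_daemon_json() {
  local mirror="$1" path="${2:-/etc/docker/daemon.json}"
  cat > "$path" <<EOF
{
  "registry-mirrors": ["${mirror}"]
}
EOF
}

# Demo against a temp file; on a real node run as root without the
# second argument, then `systemctl restart docker`.
tmp=$(mktemp)
write_daemon_json https://i1pfdcu7.mirror.aliyuncs.com "$tmp"
cat "$tmp"
rm -f "$tmp"
```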

Install and Deploy the Kubernetes Cluster

We deploy with kubeadm (an automated deployment that runs the core components as pods); binary deployment is more complex, and our focus is on how to use a k8s cluster

The source files live on a personal server at the addresses below; since it is unclear when that server will be shut down, download them locally in advance;

k8s-master.tar https://cka-2022-10-22-1252503622.cos.ap-guangzhou.myqcloud.com/k8s-master.tar

k8s-node.tar https://cka-2022-10-22-1252503622.cos.ap-guangzhou.myqcloud.com/k8s-node.tar

calico.tar https://cka-2022-10-22-1252503622.cos.ap-guangzhou.myqcloud.com/calico.tar

Upload 'k8s-master.tar' and 'calico.tar' to the master
F:\CKA\CKA-2022实验环境\files>scp k8s-master.tar calico.tar student@192.168.19.101:/home/student

Upload 'k8s-node.tar' and 'calico.tar' to node1 and node2
F:\CKA\CKA-2022实验环境\files>scp k8s-node.tar calico.tar student@192.168.19.102:/home/student
F:\CKA\CKA-2022实验环境\files>scp k8s-node.tar calico.tar student@192.168.19.103:/home/student

Install the k8s cluster

- The exam only provides a regular user account, so we practice without root as well
Log in to master, node1 and node2 as the student user
[root@utility ~]# ssh student@cka-1

Load the local images on each of the three machines
student@cka-1:~$ sudo docker load -i k8s-master.tar
student@cka-1:~$ sudo docker load -i calico.tar

student@cka-2:~$ sudo docker load -i k8s-node.tar
student@cka-2:~$ sudo docker load -i calico.tar

student@cka-3:~$ sudo docker load -i k8s-node.tar
student@cka-3:~$ sudo docker load -i calico.tar

Set docker's cgroup driver to systemd in the service unit (without this, the connection is refused when initializing the master)
student@cka-1:~$ sudo sed -i "s#^ExecStart=/usr/bin/dockerd.*#ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd#g" /usr/lib/systemd/system/docker.service

student@cka-1:~$ sudo systemctl daemon-reload
student@cka-1:~$ sudo systemctl restart docker
> Same for node1 and node2

student@cka-1:~$ sudo cat /usr/lib/systemd/system/docker.service
...
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd 👈
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always

Add the Alibaba Cloud Kubernetes apt key on all nodes
student@cka-1:~$ sudo curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
OK
'curl -s suppresses error and progress output; if the command above fails, no error message is shown and it appears to succeed'
> Same for node1 and node2

Add the Alibaba Cloud Kubernetes apt repository on all nodes
student@cka-1:~$ sudo tee /etc/apt/sources.list.d/kubernetes.list <<EOF
> deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
> EOF

Check the generated repository file
student@cka-1:~$ sudo cat /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
'Meaning: Kubernetes packages are downloaded from the Alibaba Cloud mirror'
> Same for node1 and node2

Refresh the local package cache (sync the Alibaba Kubernetes repo)
student@cka-1:~$ sudo apt update
...
79 packages can be upgraded. Run 'apt list --upgradable' to see them.
> Same for node1 and node2

List the kubeadm versions available for deployment
student@cka-1:~$ sudo apt-cache madison kubeadm
kubeadm | 1.25.4-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
...
kubeadm | 1.23.0-00 | https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 Packages
...

- Install kubeadm, kubelet and kubectl on the cluster # deploy version 1.23.0-00; the three package versions must match
student@cka-1:~$ sudo apt install -y kubeadm=1.23.0-00 kubelet=1.23.0-00 kubectl=1.23.0-00

###If the error below appears and deleting the /var/lib/dpkg/lock-frontend file doesn't help, only a reboot fixes it###
Waiting for cache lock: Could not get lock /var/lib/dpkg/lock-frontend. It is held by process 38...

Pin the kubeadm, kubelet and kubectl versions
student@cka-1:~$ sudo apt-mark hold kubeadm kubelet kubectl
kubeadm set on hold.
kubelet set on hold.
kubectl set on hold.
> Same for node1 and node2

Enable kubelet at boot on all nodes [but do not start the kubelet service]
student@cka-1:~$ sudo systemctl enable kubelet
> Same for node1 and node2

Deploy the k8s cluster

Single-master deployment

- Initialize the master: k8s version '1.23.0', control-plane (master) node IP '192.168.19.101', image repository 'registry.aliyuncs.com/google_containers', service CIDR '10.1.0.0/16', pod CIDR '10.244.0.0/16'
student@cka-1:~$ sudo kubeadm init --kubernetes-version=1.23.0 --apiserver-advertise-address=192.168.19.101 --image-repository registry.aliyuncs.com/google_containers --service-cidr=10.1.0.0/16 --pod-network-cidr=10.244.0.0/16
...
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

Make the 'student' user a k8s administrator: kubeconfig credentials
student@cka-1:~$ sudo mkdir ~/.kube
student@cka-1:~$ sudo cp /etc/kubernetes/admin.conf ~/.kube/config
student@cka-1:~$ sudo chown student:student ~/.kube/config

List the nodes
student@cka-1:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
cka-1.example.com NotReady control-plane,master 40m v1.23.0
###If master initialization fails with the error below, the cgroup driver was never set to systemd###
...
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

Fix 👇
root@cka-1:~# kubeadm reset
root@cka-1:~# reboot

Set the cgroup driver to systemd
root@cka-1:~# sed -i "s#^ExecStart=/usr/bin/dockerd.*#ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd#g" /usr/lib/systemd/system/docker.service

Then run the initialization again
student@cka-1:~$ sudo kubeadm init
...

Deploy the client

- Configure utility as a kubernetes client
Log in to utility as student and configure the kubernetes yum repo
[student@utility ~]$ sudo -i
[root@utility ~]# cat > /etc/yum.repos.d/kubernetes.repo <<EOF
> [kubernetes]
> name=Kubernetes
> baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
> enabled=1
> gpgcheck=0
> EOF

Find the kubectl client version; it must match the cluster version (1.23.0)
[root@utility ~]# yum list --showduplicates kubeadm --disableexcludes=kubernetes | grep 1.23.0
kubeadm.x86_64 1.23.0-0 kubernetes
[root@utility ~]# yum -y install kubectl-1.23.0-0

At this point utility still can't manage the k8s cluster, because kubectl defaults to connecting to localhost:8080
[root@utility ~]# kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?

Let the student user manage the k8s cluster
[student@utility ~]$ mkdir ~/.kube
[student@utility ~]$ scp student@cka-1:~/.kube/config .kube/
[student@utility ~]$ chmod 0600 .kube/config

Now the nodes can be managed
[student@utility ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
cka-1.example.com NotReady control-plane,master 71m v1.23.0

Conclusion: as long as a user's home directory contains a ~/.kube/config file, that machine can manage the k8s cluster;

The config file is a copy of /etc/kubernetes/admin.conf
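Besides copying the file to ~/.kube/config, kubectl also honors the KUBECONFIG environment variable, which is handy when juggling configs for several clusters. A tiny wrapper sketch (the `with_kubeconfig` function name is mine):

```shell
#!/usr/bin/env bash
# Sketch: run a command with an explicit kubeconfig, leaving the
# caller's environment untouched.
with_kubeconfig() {
  local cfg="$1"; shift
  KUBECONFIG="$cfg" "$@"
}

# Demo: show the variable the wrapped command sees; on a real client:
#   with_kubeconfig ~/.kube/config kubectl get nodes
with_kubeconfig /home/student/.kube/config sh -c 'echo "KUBECONFIG=$KUBECONFIG"'
```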

Add the worker nodes

- Get a join token on the master node
student@cka-1:~$ sudo -i
root@cka-1:~# kubeadm token create --print-join-command
kubeadm join 192.168.19.101:6443 --token v1tyj2.krhxy0ekjxavlhes --discovery-token-ca-cert-hash sha256:19d6ec1a75bbf962e808fbd64adb4e141d1ec4c95a6b05303f70c8dafbcfc079

- Paste and run the command printed above on node1 and node2
Add cka-2 as a k8s worker node
student@cka-2:~$ sudo kubeadm join 192.168.19.101:6443 --token v1tyj2.krhxy0ekjxavlhes --discovery-token-ca-cert-hash sha256:19d6ec1a75bbf962e808fbd64adb4e141d1ec4c95a6b05303f70c8dafbcfc079

Add cka-3 as a k8s worker node
student@cka-3:~$ sudo kubeadm join 192.168.19.101:6443 --token v1tyj2.krhxy0ekjxavlhes --discovery-token-ca-cert-hash sha256:19d6ec1a75bbf962e808fbd64adb4e141d1ec4c95a6b05303f70c8dafbcfc079

Back on the master, check the node status
student@cka-1:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
cka-1.example.com NotReady control-plane,master 85m v1.23.0
cka-2.example.com NotReady <none> 6m20s v1.23.0
cka-3.example.com NotReady <none> 3m14s v1.23.0
# The NotReady status means no network plugin is configured yet, so the network is down
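The sha256 value in the join command is just a hash of the cluster CA's public key, so if you only saved the token, the hash can be recomputed from /etc/kubernetes/pki/ca.crt. A sketch following the openssl pipeline from the kubeadm documentation (the `ca_cert_hash` wrapper name is mine):

```shell
#!/usr/bin/env bash
# Sketch: recompute the --discovery-token-ca-cert-hash value from a CA
# certificate (the standard openssl pipeline from the kubeadm docs).
ca_cert_hash() {
  local ca="$1"
  openssl x509 -pubkey -in "$ca" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | awk '{print "sha256:" $NF}'
}

# Demo with a throwaway self-signed cert; on the master pass
# /etc/kubernetes/pki/ca.crt instead.
dir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj '/CN=demo-ca' \
  -keyout "$dir/ca.key" -out "$dir/ca.crt" 2>/dev/null
ca_cert_hash "$dir/ca.crt"
rm -rf "$dir"
```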

Upload the local file to the master node

F:\CKA\CKA-2022实验环境\files>scp calico.yaml student@192.168.19.101:~
Deploy the network plugin on the master from the 'calico.yaml' file
student@cka-1:~$ kubectl create -f calico.yaml

Check the node status again
student@cka-1:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
cka-1.example.com Ready control-plane,master 120m v1.23.0 # k8s management node
cka-2.example.com Ready <none> 41m v1.23.0 # k8s worker node, runs production pods
cka-3.example.com Ready <none> 38m v1.23.0 # k8s worker node, runs production pods
# All nodes are now Ready

Cluster pods in the 'kube-system' namespace
student@cka-1:~$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-68bbcc9985-gmlvq 1/1 Running 0 3m28s
calico-node-hwtf4 1/1 Running 0 3m28s
calico-node-pnp2t 1/1 Running 0 3m28s
calico-node-sqj6j 1/1 Running 0 3m28s
coredns-6d8c4cb4d-8slt7 1/1 Running 0 122m
coredns-6d8c4cb4d-gk4s2 1/1 Running 0 122m
etcd-cka-1.example.com 1/1 Running 0 122m
kube-apiserver-cka-1.example.com 1/1 Running 0 122m
kube-controller-manager-cka-1.example.com 1/1 Running 0 122m
kube-proxy-dspvc 1/1 Running 0 43m
kube-proxy-t5dfl 1/1 Running 0 40m
kube-proxy-trrjp 1/1 Running 0 122m
kube-scheduler-cka-1.example.com 1/1 Running 0 122m
student@cka-1:~$
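A quick way to script a "is kube-system healthy" check on top of the listing above (the `not_running` helper name is mine; it parses the same table format kubectl prints):

```shell
#!/usr/bin/env bash
# Sketch: count pods in a `kubectl get pods` listing that are not Running.
not_running() {
  # Skip the header line; column 3 is STATUS.
  awk 'NR > 1 && $3 != "Running" { n++ } END { print n + 0 }'
}

# Demo with captured output; live use:
#   kubectl get pods -n kube-system | not_running
sample='NAME READY STATUS RESTARTS AGE
calico-node-hwtf4 1/1 Running 0 3m28s
coredns-6d8c4cb4d-8slt7 1/1 Pending 0 122m'
printf '%s\n' "$sample" | not_running   # → 1
```

A result of 0 means every listed pod is Running.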

Configure kubectl command completion

Tip: this is a CKA exam topic

student@cka-1:~$ sudo apt install bash-completion

Enable for the current shell session only
student@cka-1:~$ source <(kubectl completion bash)

###Persist across logins###
student@cka-1:~$ vim ~/.bashrc
student@cka-1:~$ tail ~/.bashrc
# sources /etc/bash.bashrc).
if ! shopt -oq posix; then
if [ -f /usr/share/bash-completion/bash_completion ]; then
. /usr/share/bash-completion/bash_completion
elif [ -f /etc/bash_completion ]; then
. /etc/bash_completion
fi
fi

source <(kubectl completion bash) 👈 append this command as the last line

###During the exam, do not write the command into the .bashrc file###
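A related time-saver on top of completion: alias kubectl to `k` and attach the same completion function to the alias. A sketch (it appends to ~/.bashrc, so only use it on your own practice machines):

```shell
# Sketch: shorten kubectl to `k` while keeping tab completion working.
# __start_kubectl is the completion entry point that
# `kubectl completion bash` defines.
cat >> ~/.bashrc <<'EOF'
alias k=kubectl
complete -o default -F __start_kubectl k
EOF
```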