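Overview of steps
- Generate the etcd certificates. cert-file and key-file secure communication between the etcd server and its clients, while peer-cert-file and peer-key-file secure communication between etcd members inside the cluster. As long as a certificate is issued by a trusted CA and the etcd server is configured to trust that CA (via the --trusted-ca-file flag), any client holding a valid client certificate issued by that CA can establish a secure connection to the etcd server.
- Restore the etcd database
- Start the etcd cluster
- Restore the other components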
Generate the etcd certificates
- Write the certificate request configuration file csr.conf
[ req ]
default_bits = 2048
prompt = no
default_md = sha256
distinguished_name = dn
req_extensions = req_ext
[ dn ]
CN = etcd-cluster
[ req_ext ]
subjectAltName = @alt_names
[ alt_names ]
DNS.1 = etcd-cluster
DNS.2 = k8s-master-01
DNS.3 = k8s-master-02
DNS.4 = k8s-master-03
DNS.5 = localhost
IP.1 = 192.168.99.7
IP.2 = 192.168.99.8
IP.3 = 192.168.99.9
IP.4 = 127.0.0.1
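- Generate a new CSR together with a new key pair (public and private key) according to the supplied configuration file, leaving the private key unencrypted and writing the CSR to the given output file.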
openssl req -new -nodes -text -out etcd-cluster.csr -config csr.conf -keyout etcd-cluster.key
- Inspect the details of the certificate signing request (CSR)
openssl req -in etcd-cluster.csr -noout -text
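- Sign the CSR with the CA certificate and private key to produce a new certificate, etcd-cluster.crt, valid for 2000 days and carrying the extensions specified in csr.conf. ca.crt and ca.key must belong to the CA that etcd is configured to trust through --trusted-ca-file and --peer-trusted-ca-file.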
openssl x509 -req -in etcd-cluster.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out etcd-cluster.crt -days 2000 -extensions req_ext -extfile csr.conf
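Before copying the new certificate to the nodes, it can be worth confirming that it chains to the CA and that the subject alternative names survived signing. The following is a minimal sketch, assuming the file names used above; check_cert is a hypothetical helper name, not something openssl provides.

```shell
#!/bin/sh
# check_cert is a hypothetical helper; the file names follow the steps above.
# Usage: check_cert <ca.crt> <certificate.crt>
check_cert() {
    ca=$1
    crt=$2
    # Prints "<file>: OK" when the certificate chains to the given CA.
    openssl verify -CAfile "$ca" "$crt" || return 1
    # Show the SAN entries; every node name and IP from csr.conf should be listed.
    openssl x509 -in "$crt" -noout -text | grep -A1 'Subject Alternative Name'
}

# Only run when the files from the previous step are present.
if [ -f ca.crt ] && [ -f etcd-cluster.crt ]; then
    check_cert ca.crt etcd-cluster.crt
fi
```

If a node name or IP is missing from the SAN list, peers or clients connecting through that address will likely fail TLS verification, so it is cheaper to catch that here than after the cluster is started.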
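Restore the etcd database
Empty the /var/lib/etcd/ directory on all three nodes, then restore the database on each node from the same snapshot, master02.db. <new_cluster_token> (the value passed to --initial-cluster-token) identifies the new cluster. Any string works, as long as it is identical on every member of this cluster and differs from the token of any other cluster. The token is only used while the cluster is being initialized or restored from a snapshot and has no effect on the running cluster afterwards.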
# On k8s-master-01 (192.168.99.7)
etcdctl snapshot restore /root/master02.db \
--name k8s-master-01 \
--initial-cluster k8s-master-01=https://192.168.99.7:2380,k8s-master-02=https://192.168.99.8:2380,k8s-master-03=https://192.168.99.9:2380 \
--initial-cluster-token new-cluster-20240329 \
--initial-advertise-peer-urls https://192.168.99.7:2380 \
--data-dir /var/lib/etcd/
# On k8s-master-02 (192.168.99.8)
etcdctl snapshot restore /root/master02.db \
--name k8s-master-02 \
--initial-cluster k8s-master-01=https://192.168.99.7:2380,k8s-master-02=https://192.168.99.8:2380,k8s-master-03=https://192.168.99.9:2380 \
--initial-cluster-token new-cluster-20240329 \
--initial-advertise-peer-urls https://192.168.99.8:2380 \
--data-dir /var/lib/etcd/
# On k8s-master-03 (192.168.99.9)
etcdctl snapshot restore /root/master02.db \
--name k8s-master-03 \
--initial-cluster k8s-master-01=https://192.168.99.7:2380,k8s-master-02=https://192.168.99.8:2380,k8s-master-03=https://192.168.99.9:2380 \
--initial-cluster-token new-cluster-20240329 \
--initial-advertise-peer-urls https://192.168.99.9:2380 \
--data-dir /var/lib/etcd/
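The three restore commands differ only in --name and --initial-advertise-peer-urls. The sketch below prints the per-node command from a single node list, which may make it easier to keep --initial-cluster and the token identical on every node. NODES, SNAPSHOT, TOKEN, DATA_DIR, initial_cluster and restore_cmd are illustrative names chosen here, and the script only prints commands without running etcdctl.

```shell
#!/bin/sh
# Values taken from the commands above; the variable names are illustrative.
NODES="k8s-master-01=192.168.99.7 k8s-master-02=192.168.99.8 k8s-master-03=192.168.99.9"
SNAPSHOT=/root/master02.db
TOKEN=new-cluster-20240329
DATA_DIR=/var/lib/etcd/

# Build the --initial-cluster value: name=https://ip:2380, comma separated.
initial_cluster() {
    out=""
    for node in $NODES; do
        name=${node%%=*}
        ip=${node#*=}
        out="${out:+$out,}${name}=https://${ip}:2380"
    done
    printf '%s\n' "$out"
}

# Print (do not run) the restore command for one node.
# Usage: restore_cmd <node-name> <node-ip>
restore_cmd() {
    name=$1
    ip=$2
    printf 'etcdctl snapshot restore %s --name %s --initial-cluster %s --initial-cluster-token %s --initial-advertise-peer-urls https://%s:2380 --data-dir %s\n' \
        "$SNAPSHOT" "$name" "$(initial_cluster)" "$TOKEN" "$ip" "$DATA_DIR"
}

# Print one command per node; copy each line to the matching node and run it there.
for node in $NODES; do
    restore_cmd "${node%%=*}" "${node#*=}"
done
```

Printing rather than executing keeps the sketch safe to run anywhere; each printed line still has to be run on its own node against an emptied data directory.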
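Start the etcd cluster
If you restored the etcd data on each of the three nodes with etcdctl snapshot restore, passing each node's own name and the full member list through --name and --initial-cluster, the membership of the whole cluster is already recorded in the restored data. In that case you should not need etcdctl member add to add members by hand.
Edit the etcd.yaml file on each node, then move it into /etc/kubernetes/manifests.
Update the following flags: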
--name
--initial-cluster
--cert-file
--key-file
--peer-cert-file
--peer-key-file
--peer-trusted-ca-file
--trusted-ca-file
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://192.168.99.7:2379
    - --cert-file=/etc/kubernetes/pki/etcd/etcd-cluster.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://192.168.99.7:2380
    - --initial-cluster=k8s-master-01=https://192.168.99.7:2380,k8s-master-02=https://192.168.99.8:2380,k8s-master-03=https://192.168.99.9:2380
    - --key-file=/etc/kubernetes/pki/etcd/etcd-cluster.key
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.99.7:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://192.168.99.7:2380
    - --name=k8s-master-01
    - --peer-cert-file=/etc/kubernetes/pki/etcd/etcd-cluster.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/etcd-cluster.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --quota-backend-bytes=6294967296
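The manifest above is the one for k8s-master-01. On the other two nodes the certificate and CA flags stay the same, while the name and address flags follow the node. As a rough sketch, the helper below (node_flags is a hypothetical name) prints the node-specific lines by following the same pattern as the k8s-master-01 manifest; it assumes every node uses the same ports.

```shell
#!/bin/sh
# node_flags is a hypothetical helper that prints the flags which differ per node,
# following the pattern of the k8s-master-01 manifest.
# Usage: node_flags <node-name> <node-ip>
node_flags() {
    name=$1
    ip=$2
    printf '%s\n' "- --advertise-client-urls=https://${ip}:2379"
    printf '%s\n' "- --initial-advertise-peer-urls=https://${ip}:2380"
    printf '%s\n' "- --listen-client-urls=https://127.0.0.1:2379,https://${ip}:2379"
    printf '%s\n' "- --listen-peer-urls=https://${ip}:2380"
    printf '%s\n' "- --name=${name}"
}

# Print the node-specific lines for the two remaining nodes.
node_flags k8s-master-02 192.168.99.8
node_flags k8s-master-03 192.168.99.9
```

The output is meant to be compared against each node's etcd.yaml by eye rather than pasted in blindly, since the indentation has to match the surrounding manifest.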
Check the etcd cluster status
etcdctl --endpoints=192.168.99.7:2379,192.168.99.8:2379,192.168.99.9:2379 --cacert="/etc/kubernetes/pki/etcd/ca.crt" --cert="/etc/kubernetes/pki/etcd/etcd-cluster.crt" --key="/etc/kubernetes/pki/etcd/etcd-cluster.key" endpoint status -w table
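etcdctl also reads its global flags from ETCDCTL_-prefixed environment variables, which avoids repeating the TLS flags for every check. The sketch below uses the same endpoints and paths as the command above and assumes etcdctl is using the v3 API; check_cluster is a hypothetical wrapper name, and it is only defined here, not executed.

```shell
#!/bin/sh
# Same endpoints and certificate paths as the status command above.
export ETCDCTL_ENDPOINTS=192.168.99.7:2379,192.168.99.8:2379,192.168.99.9:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/etcd-cluster.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/etcd-cluster.key

# check_cluster is a hypothetical wrapper; it stops at the first failing check.
check_cluster() {
    etcdctl endpoint health &&
    etcdctl member list -w table &&
    etcdctl endpoint status -w table
}

# On a control-plane node, run:
#   check_cluster
```

A healthy restore would be expected to show three members in member list and exactly one leader in the endpoint status table.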
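Restore the other components
Renew the certificates of the related components:
kubeadm certs renew all
Move the following files out of /etc/kubernetes/manifests, update their certificate settings if necessary, and move them back to their original location when done: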
kube-apiserver.yaml
kube-controller-manager.yaml
kube-scheduler.yaml
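The kubelet stops a static pod when its manifest leaves the manifests directory and recreates it when the file returns, so moving the three files out and back effectively restarts the components with the renewed certificates. Here is a minimal sketch of that move, assuming a stash directory outside /etc/kubernetes/manifests; STASH_DIR and the function names are hypothetical.

```shell
#!/bin/sh
MANIFEST_DIR=${MANIFEST_DIR:-/etc/kubernetes/manifests}
# Hypothetical stash location; it must be outside the manifests directory.
STASH_DIR=${STASH_DIR:-/root/manifests-stash}
COMPONENTS="kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml"

# Move the component manifests that exist in <src> to <dst>.
move_manifests() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in $COMPONENTS; do
        if [ -f "$src/$f" ]; then
            mv "$src/$f" "$dst/$f" || return 1
        fi
    done
}

# Take the control-plane manifests out; the kubelet then stops the pods.
stash_manifests() { move_manifests "$MANIFEST_DIR" "$STASH_DIR"; }
# Put them back after editing; the kubelet then recreates the pods.
restore_manifests() { move_manifests "$STASH_DIR" "$MANIFEST_DIR"; }

# Typical order on each control-plane node:
#   stash_manifests
#   (edit the files under $STASH_DIR if certificate paths changed)
#   restore_manifests
```

In a kubeadm layout the kube-apiserver manifest refers to etcd through its --etcd-cafile, --etcd-certfile and --etcd-keyfile flags, so those are the settings to check if the API server cannot reach the restored cluster.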