The problem
Check the kubelet log:
journalctl -xefu kubelet
# The log reports the error: cni plugin not initialized
Nov 07 16:12:56 VM-0-5-centos kubelet[2278204]: E1107 16:12:56.747955 2278204 kubelet.go:2855] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
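To confirm a node is hitting this particular error rather than some other kubelet failure, the log can be filtered for the message. The helper below is a minimal sketch; the function name has_cni_error is made up for illustration and is not part of any Kubernetes tooling.

```shell
# has_cni_error (hypothetical helper): reads log lines on stdin and succeeds
# (exit status 0) if any line contains the "cni plugin not initialized" message.
has_cni_error() {
    grep -q 'cni plugin not initialized'
}

# Example use on a node (not executed here):
#   journalctl -u kubelet --no-pager -n 200 | has_cni_error && echo "CNI not ready"
```

The function only reads stdin, so it can also be pointed at a saved log file.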
Root cause analysis
kubectl apply -f kube-flannel.yml
After flannel is applied, the file /etc/cni/net.d/10-flannel.conflist is sometimes not initialized correctly.
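Step 1: Clean up the files left behind by the flannel network (worker nodes only)
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/
rm -f /etc/cni/net.d/*
Note: after running the commands above, restart kubelet.
ifconfig cni0 down
ip link delete cni0
ifconfig vethb22xxxxx down # copy only the part of the interface name before the @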
ip link delete vethb22xxxxx
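The veth names differ on every node, and `ip link` prints them with an `@ifN` suffix that must not be passed to `ip link delete`. The sketch below extracts just the usable names; veth_names is a hypothetical helper name, and it assumes the one-line output format of `ip -o link show`.

```shell
# veth_names (hypothetical helper): reads `ip -o link show` output on stdin and
# prints each veth interface name with the "@ifN" peer suffix removed.
veth_names() {
    awk -F': ' '$2 ~ /^veth/ { sub(/@.*/, "", $2); print $2 }'
}

# Example use on a node (not executed here):
#   ip -o link show | veth_names
```

It only lists names; deleting them is left as a separate, deliberate step.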
Step 2: Repair /etc/cni/net.d/10-flannel.conflist, which is sometimes not initialized correctly (worker nodes only)
cat <<EOL > /etc/cni/net.d/10-flannel.conflist
{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}
EOL
Check the conflist and the cni0 interface:
cat /etc/cni/net.d/10-flannel.conflist
ifconfig cni0
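Instead of reading the file by eye, a quick scripted check can catch an empty or truncated conflist. The sketch below is a plain text check, not a JSON validator, and check_conflist is a hypothetical helper name.

```shell
# check_conflist FILE (hypothetical helper): succeeds only if FILE is non-empty
# and mentions both the flannel and portmap plugin types.
# This is a text match; it does not verify that the file is valid JSON.
check_conflist() {
    local file=$1
    [ -s "$file" ] || return 1
    grep -Eq '"type"[[:space:]]*:[[:space:]]*"flannel"' "$file" || return 1
    grep -Eq '"type"[[:space:]]*:[[:space:]]*"portmap"' "$file" || return 1
}

# Example use on a node (not executed here):
#   check_conflist /etc/cni/net.d/10-flannel.conflist || echo "conflist looks broken"
```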
Step 3: Set the containerd image endpoint (worker nodes only)
vim /etc/crictl.yaml
Edit /etc/crictl.yaml. The main change is the image-endpoint setting, which was added in newer versions:
runtime-endpoint: "unix:///run/containerd/containerd.sock"
image-endpoint: "unix:///run/containerd/containerd.sock" # same value as runtime-endpoint above
timeout: 10
debug: false
pull-image-on-create: false
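disable-pull-on-run: false
Step 4: Configure containerd to use the systemd cgroup driver (all nodes)
Since v1.24.0, Kubernetes no longer includes dockershim and instead uses containerd as the container runtime endpoint. containerd therefore has to be installed (here it is installed alongside docker); it was installed automatically when docker was installed earlier. In this setup docker acts only as a client, and the container engine is still containerd.
cat /etc/containerd/config.toml | grep -n "SystemdCgroup"
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
If the file does not exist, generate it first: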
containerd config default > /etc/containerd/config.toml
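The SystemdCgroup change can be wrapped so that it reports failure when the file is missing or the setting is not present, instead of silently doing nothing. This is a minimal sketch; enable_systemd_cgroup is a hypothetical helper name, and it assumes GNU sed (for `-i`) as found on typical Linux nodes.

```shell
# enable_systemd_cgroup FILE (hypothetical helper): switches
# "SystemdCgroup = false" to "SystemdCgroup = true" in FILE, then succeeds only
# if the file actually contains "SystemdCgroup = true".
enable_systemd_cgroup() {
    local file=$1
    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
    sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' "$file"
    grep -q 'SystemdCgroup = true' "$file"
}

# Example use on a node (not executed here):
#   enable_systemd_cgroup /etc/containerd/config.toml
```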
Find the line of the file that sets the default sandbox image:
cat /etc/containerd/config.toml | grep -n "sandbox_image"
Open the file in vim, go to sandbox_image, and change the image address to registry.aliyuncs.com/google_containers/pause:3.6
vim /etc/containerd/config.toml
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
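When the same change has to be made on several nodes, a non-interactive edit may be more convenient than vim. The sketch below is one possible way to do it with sed; set_sandbox_image is a hypothetical helper name, and it assumes GNU sed and an image reference without "|" or "&" characters.

```shell
# set_sandbox_image FILE IMAGE (hypothetical helper): rewrites the value of
# every sandbox_image line in FILE to IMAGE, keeping the original indentation.
# Fails if FILE has no sandbox_image line.
# Assumes IMAGE contains no "|" or "&" (both are special in this sed command).
set_sandbox_image() {
    local file=$1 image=$2
    grep -q '^[[:space:]]*sandbox_image[[:space:]]*=' "$file" || return 1
    sed -i "s|^\([[:space:]]*sandbox_image[[:space:]]*=\).*|\1 \"$image\"|" "$file"
}

# Example use on a node (not executed here):
#   set_sandbox_image /etc/containerd/config.toml registry.aliyuncs.com/google_containers/pause:3.6
```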
Step 5: Restart the containerd and kubelet services (worker nodes only)
systemctl daemon-reload
systemctl restart containerd
systemctl restart kubelet
crictl images
Step 6: Join the node to the cluster
kubeadm join 123.12.0.23:6443 --token nacoen.xxxxxxxxxxx \
    --discovery-token-ca-cert-hash sha256:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# If the node has joined before, the join fails with:
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
Simply delete both files:
rm -f /etc/kubernetes/kubelet.conf
rm -f /etc/kubernetes/pki/ca.crt
Copy the .kube directory from the master node to this node, then join again.
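Before re-running the join, it can help to list which of the two files named in the preflight errors are still present. The sketch below does only that; leftover_join_files is a hypothetical helper name, and its optional ROOT argument exists purely so the function can be tried against a scratch directory.

```shell
# leftover_join_files [ROOT] (hypothetical helper): prints each of the two files
# from the preflight errors that still exists. ROOT is an optional path prefix
# and defaults to empty, i.e. the real filesystem.
leftover_join_files() {
    local root=${1:-}
    local f
    for f in /etc/kubernetes/kubelet.conf /etc/kubernetes/pki/ca.crt; do
        [ -e "$root$f" ] && echo "$root$f"
    done
    return 0
}

# Example use on a node (not executed here):
#   leftover_join_files
```

No output means neither file is left and the join should get past this preflight check.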
The join times out
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
swapoff -a # turn off swap
kubeadm reset
systemctl daemon-reload
systemctl restart kubelet
Then run the join command again.
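Since the fix starts with swapoff -a, it is worth confirming that swap really is off before retrying the join. A minimal sketch, assuming the usual Linux /proc/swaps layout (one header row followed by one row per active swap device); swap_active is a hypothetical helper name.

```shell
# swap_active [FILE] (hypothetical helper): succeeds if FILE (default
# /proc/swaps) lists at least one swap device, i.e. has a non-empty line
# after the header row.
swap_active() {
    local file=${1:-/proc/swaps}
    [ "$(tail -n +2 "$file" | grep -c .)" -gt 0 ]
}

# Example use on a node (not executed here):
#   swap_active && echo "swap is still on; run swapoff -a"
```

Note that swapoff -a does not persist across reboots; any swap entry in /etc/fstab would bring swap back.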