Case notes: data loss when scaling ZooKeeper on K8s to 3 nodes

2025/4/3 11:45:15  Source: https://blog.csdn.net/fured/article/details/146802789

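1 Background

The K8s cluster was scaled from one node to three, so the ZooKeeper running on it also had to be scaled from one node to three for high availability.

Chart: bitnami zookeeper 13.7.4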

2 Problem

Before scaling, the single ZooKeeper node held data. After scaling ZooKeeper to three nodes with the procedure below, the old data was gone.

$ helm install zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage"
$ kubectl exec zk-zookeeper-0 -it -- zkCli.sh
/opt/bitnami/java/bin/java
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null zxid: -1
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] create /fured-test mydata
Created /fured-test
[zk: localhost:2181(CONNECTED) 2] get /fured-test
mydata
[zk: localhost:2181(CONNECTED) 3] get /zookeeper/config
[zk: localhost:2181(CONNECTED) 4] quit
$ helm upgrade zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage" --set replicaCount=3
$ kubectl exec zk-zookeeper-0 -it -- zkCli.sh
/opt/bitnami/java/bin/java
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null zxid: -1
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] get /fured-test   # the data is gone
Node does not exist: /fured-test
[zk: localhost:2181(CONNECTED) 2] get /fured-test
Node does not exist: /fured-test
[zk: localhost:2181(CONNECTED) 3] get /zookeeper/config
server.1=zk-zookeeper-0.zk-zookeeper-headless.zk-new-3.svc.cluster.local:2888:3888:participant;0.0.0.0:2181
server.2=zk-zookeeper-1.zk-zookeeper-headless.zk-new-3.svc.cluster.local:2888:3888:participant;0.0.0.0:2181
server.3=zk-zookeeper-2.zk-zookeeper-headless.zk-new-3.svc.cluster.local:2888:3888:participant;0.0.0.0:2181
version=0
[zk: localhost:2181(CONNECTED) 4] quit
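
The check performed by hand above can be scripted so that it is easy to repeat after every scaling step. The following is a minimal sketch, not part of the original procedure. It assumes that zkCli.sh accepts a command on its command line and prints the same "Node does not exist" message as in the session above. The function names znode_missing and check_znode are hypothetical, and no namespace is passed, matching the commands in this section.

```shell
# znode_missing: read zkCli.sh output on stdin and succeed
# if it reports that the requested znode is absent.
znode_missing() {
  grep -q 'Node does not exist'
}

# check_znode POD PATH: hypothetical wrapper that queries one pod non-interactively.
# Assumption: zkCli.sh runs the trailing "get PATH" command and then exits.
check_znode() {
  local pod="$1" path="$2" out
  out="$(kubectl exec "$pod" -- zkCli.sh -server localhost:2181 get "$path" 2>&1)"
  if printf '%s\n' "$out" | znode_missing; then
    echo "$pod: $path is missing"
    return 1
  fi
  # Not reported missing: show the raw output so the data can be inspected by eye.
  printf '%s\n' "$out"
}
```

For example, `check_znode zk-zookeeper-0 /fured-test` could be run before and after the upgrade. The wrapper deliberately prints the raw output instead of claiming the znode is present, because a connection failure would also produce output without the "missing" message.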

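3 Cause

When K8s scales the workload, it first starts the new nodes and then restarts the existing nodes whose configuration changed. Scaling zk directly from one node to three starts two new nodes at once. Those two nodes already form a majority of the three-node ensemble, so they elect a leader between themselves right away, and the old node simply joins as a follower when it restarts. Data is only synchronized from the leader to the followers, so the old node never becomes leader and its data is never copied to the newly started zk nodes. As the session in section 2 shows, /fured-test is then missing even when queried on zk-zookeeper-0.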

$ helm install -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage"
$ helm upgrade -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage" --set replicaCount=3
$ kubectl -n  zk-test exec zk-zookeeper-2 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.
command terminated with exit code 1
$ kubectl -n  zk-test exec zk-zookeeper-1 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
$ kubectl -n  zk-test exec zk-zookeeper-2 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader   # the third node became leader right away
$ kubectl -n  zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.
command terminated with exit code 1
$ kubectl -n  zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower  # after restarting, the old node comes up as a follower
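
The behaviour in this section comes down to majority arithmetic: an ensemble can elect a leader as soon as more than half of its configured servers are up. The sketch below works through that arithmetic for the two scaling paths. It assumes every server is a voting participant, as in the server.N lines shown in section 2. The function names quorum_size and can_elect are hypothetical.

```shell
# quorum_size N: number of servers that must agree in an ensemble of N voting members.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# can_elect PRESENT TOTAL: succeeds when PRESENT running servers
# form a majority of an ensemble configured with TOTAL servers.
can_elect() {
  [ "$1" -ge "$(quorum_size "$2")" ]
}

# Direct 1 -> 3: two new servers are up, three are configured.
can_elect 2 3 && echo "1 -> 3: the two new nodes can elect a leader without the old node"

# Staged 1 -> 2: one new server is up, two are configured.
can_elect 1 2 || echo "1 -> 2: the new node has to wait for the old node"
```

With three configured servers the quorum is two, so the two empty nodes do not need the old node. With two configured servers the quorum is also two, so the new node cannot proceed until the old node has restarted.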

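4 Solution

Scale up in stages: grow zk to two nodes first, then to three. Scaling to two nodes starts one new node, which cannot settle on a leader by itself. Once the old node has restarted, its ZXID is higher (it holds data, so its transaction ID is larger), so the old node is elected leader and the old data is synchronized to the new node. When the third node is then started, the second and the first node are restarted in turn. Once the second node is back up, it is the leader (and its data is synchronized to the third node). Once the first node was back up, the third node ended up as leader.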
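
The role of the ZXID in the election can be illustrated with a small comparison function. This is a simplified sketch, not ZooKeeper's actual implementation: it assumes all votes belong to the same election epoch (real ZooKeeper compares the epoch first) and treats the ZXID as a plain decimal number. The function name pick_leader and the ZXID value 5 are hypothetical. Server ids follow the server.N lines in section 2, where server 1 is zk-zookeeper-0.

```shell
# pick_leader VOTE...: each VOTE is "sid:zxid" (server id and last transaction id).
# Simplified ordering: the higher zxid wins, ties are broken by the higher sid.
pick_leader() {
  local vote sid zxid best_sid best_zxid
  best_sid=""
  best_zxid=""
  for vote in "$@"; do
    sid="${vote%%:*}"     # text before the colon
    zxid="${vote##*:}"    # text after the colon
    if [ -z "$best_sid" ] ||
       [ "$zxid" -gt "$best_zxid" ] ||
       { [ "$zxid" -eq "$best_zxid" ] && [ "$sid" -gt "$best_sid" ]; }; then
      best_sid="$sid"
      best_zxid="$zxid"
    fi
  done
  printf '%s\n' "$best_sid"
}

# Staged 1 -> 2: the old server 1 holds data, the new server 2 is empty.
pick_leader 1:5 2:0    # prints 1

# Direct 1 -> 3: only the two empty servers vote, so the higher id wins.
pick_leader 2:0 3:0    # prints 3
```

The second call matches the observation in section 3, where zk-zookeeper-2 (server 3) became leader before the old node had restarted.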

$ helm install -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage"
$ kubectl -n  zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: standalone  # single-node mode
$ kubectl -n zk-test exec zk-zookeeper-0 -it -- zkCli.sh
/opt/bitnami/java/bin/java
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled
[zk: localhost:2181(CONNECTING) 0] create
WATCHER::
WatchedEvent state:SyncConnected type:None path:null zxid: -1
create [-s] [-e] [-c] [-t ttl] path [data] [acl]
[zk: localhost:2181(CONNECTED) 1] create /fured-test mydata
Created /fured-test
[zk: localhost:2181(CONNECTED) 2] get /fured-test
mydata
$ helm upgrade -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage" --set replicaCount=2
$ kubectl -n  zk-test exec zk-zookeeper-1 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower  # the newly started second node is a follower
$ kubectl -n  zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader  # the old node became leader
$ helm upgrade -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage" --set replicaCount=3
$ kubectl -n  zk-test exec zk-zookeeper-1 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader  # the second node became leader
$ kubectl -n  zk-test exec zk-zookeeper-2 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower  # the third node is a follower
$ kubectl -n  zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.
command terminated with exit code 1  # the second node is already leader while the first node is still restarting
$ kubectl -n zk-test get pod
NAME             READY   STATUS    RESTARTS   AGE
zk-zookeeper-0   1/1     Running   0          77s
zk-zookeeper-1   1/1     Running   0          101s
zk-zookeeper-2   1/1     Running   0          2m12s
# The ages of the three pods show the order in which they started
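
The staged procedure above can be wrapped in a small script so that the intermediate step is not skipped by accident. The following is a sketch under stated assumptions rather than a tested tool: the release, namespace, chart file and storage class are the ones used in this article, while the helper names run, scale_step and staged_scale_up, the DRY_RUN switch and the 5 minute timeout are hypothetical. By default it only prints the commands.

```shell
# Values taken from the commands in this article.
NAMESPACE="zk-test"
RELEASE="zk"
CHART="zookeeper-13.7.4.tgz"
STORAGE_CLASS="local-storage"

# run CMD...: print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# scale_step N: upgrade the release to N replicas, then wait for the rolling update.
scale_step() {
  local replicas="$1"
  run helm upgrade -n "$NAMESPACE" "$RELEASE" "$CHART" \
    --set "persistence.storageClass=$STORAGE_CLASS" \
    --set "replicaCount=$replicas" || return 1
  run kubectl -n "$NAMESPACE" rollout status "statefulset/$RELEASE-zookeeper" --timeout=5m
}

# staged_scale_up N...: apply each replica count in order, stopping at the first failure.
staged_scale_up() {
  local n
  for n in "$@"; do
    scale_step "$n" || return 1
  done
}
```

Calling `staged_scale_up 2 3` prints the four commands, and `DRY_RUN=0 staged_scale_up 2 3` would execute them. A finished rollout only means the pods are ready, so it is still worth checking `zkServer.sh status` and the data on each pod between the two steps, as in the session above.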
