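1 Background
Kubernetes scale-out: the cluster is growing from one node to three, so the ZooKeeper running on it also has to grow from one node to three for high availability.
Chart: bitnami zookeeper 13.7.4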
2 Problem
Before scaling, the single ZooKeeper node held data. After scaling ZooKeeper to three nodes with the procedure below, the old data was gone.
$ helm install zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage"
$ kubectl exec zk-zookeeper-0 -it -- zkCli.sh
/opt/bitnami/java/bin/java
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null zxid: -1
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] create /fured-test mydata
Created /fured-test
[zk: localhost:2181(CONNECTED) 2] get /fured-test
mydata
[zk: localhost:2181(CONNECTED) 3] get /zookeeper/config
[zk: localhost:2181(CONNECTED) 4] quit
$ helm upgrade zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage" --set replicaCount=3
$ kubectl exec zk-zookeeper-0 -it -- zkCli.sh
/opt/bitnami/java/bin/java
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null zxid: -1
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] get /fured-test # the data is gone
Node does not exist: /fured-test
[zk: localhost:2181(CONNECTED) 2] get /fured-test
Node does not exist: /fured-test
[zk: localhost:2181(CONNECTED) 3] get /zookeeper/config
server.1=zk-zookeeper-0.zk-zookeeper-headless.zk-new-3.svc.cluster.local:2888:3888:participant;0.0.0.0:2181
server.2=zk-zookeeper-1.zk-zookeeper-headless.zk-new-3.svc.cluster.local:2888:3888:participant;0.0.0.0:2181
server.3=zk-zookeeper-2.zk-zookeeper-headless.zk-new-3.svc.cluster.local:2888:3888:participant;0.0.0.0:2181
version=0
[zk: localhost:2181(CONNECTED) 4] quit
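The manual check above can be repeated for each pod with a small helper. This is only a sketch: `znode_exists` is a hypothetical function name, it assumes that `zkCli.sh` runs a command given on its command line and then exits, and it relies on the `Node does not exist` message seen in the session above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of kubectl, ZooKeeper or the chart.
# Exit status: 0 = znode readable, 1 = znode missing, 2 = could not ask the pod.
znode_exists() {
  local pod="$1" path="$2" out rc
  # Assumption: zkCli.sh executes the command passed as arguments and exits.
  out=$(kubectl exec "$pod" -- zkCli.sh get "$path" 2>&1)
  rc=$?
  case "$out" in
    *"Node does not exist"*) return 1 ;;
  esac
  [ "$rc" -eq 0 ] || return 2
  return 0
}

# Usage:
#   for pod in zk-zookeeper-0 zk-zookeeper-1 zk-zookeeper-2; do
#     znode_exists "$pod" /fured-test || echo "$pod: /fured-test missing or pod unreachable"
#   done
```

Running it before and after the upgrade makes the loss visible without opening an interactive session on every pod.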
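3 Cause
During a scale-up, Kubernetes starts the new nodes first and then restarts the existing nodes whose configuration changed. Scaling ZooKeeper straight from one node to three starts two new nodes at once. Two nodes are already a majority of a three-member ensemble, so they elect a leader between themselves before the old node has restarted. When the old single node comes back up, it joins directly as a follower, and data is only synchronized from the leader to followers. The old node therefore never becomes leader, and its data is never copied to the newly started ZooKeeper nodes.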
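The quorum arithmetic behind this can be spelled out in a few lines. This is a minimal sketch; `quorum_size` and `can_elect` are hypothetical helper names, and the only rule assumed is the usual majority quorum, floor(N/2) + 1.

```shell
#!/usr/bin/env bash
# Majority quorum for an ensemble of N voting members: floor(N/2) + 1.
quorum_size() {
  local members="$1"
  echo $(( members / 2 + 1 ))
}

# Succeeds when `up` running members are enough to elect a leader in an
# ensemble configured with `members` servers.
can_elect() {
  local members="$1" up="$2"
  [ "$up" -ge "$(quorum_size "$members")" ]
}

# 1 -> 3: the two new nodes already form a majority of 3.
can_elect 3 2 && echo "1->3: two new nodes can elect a leader without the old node"
# 1 -> 2: a single new node is not a majority of 2.
can_elect 2 1 || echo "1->2: one new node has to wait for the old node"
```

The second case is what makes the step-by-step approach safe: with a two-member configuration, no election can finish until the node that holds the data takes part.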
$ helm install -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage"
$ helm upgrade -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage" --set replicaCount=3
$ kubectl -n zk-test exec zk-zookeeper-2 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.
command terminated with exit code 1
$ kubectl -n zk-test exec zk-zookeeper-1 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
$ kubectl -n zk-test exec zk-zookeeper-2 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader # the third node went straight to leader
$ kubectl -n zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.
command terminated with exit code 1
$ kubectl -n zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower # after restarting, the old node comes up directly as a follower
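Polling each pod by hand, as above, can be wrapped in a helper that prints one line per pod. A sketch under stated assumptions: `parse_mode` and `ensemble_modes` are hypothetical names, and the parsing relies only on the `Mode:` line that `zkServer.sh status` prints in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading `zkServer.sh status` output.

# Reads status output on stdin and prints the mode, or "down" when no
# "Mode:" line is present (for example while the pod is restarting).
parse_mode() {
  local mode
  mode=$(sed -n 's/^Mode: *//p' | head -n 1)
  echo "${mode:-down}"
}

# Prints "<pod> <mode>" for each pod name given after the namespace.
ensemble_modes() {
  local ns="$1" pod
  shift
  for pod in "$@"; do
    echo "$pod $(kubectl -n "$ns" exec "$pod" -- zkServer.sh status 2>&1 | parse_mode)"
  done
}

# Usage: ensemble_modes zk-test zk-zookeeper-0 zk-zookeeper-1 zk-zookeeper-2
```

Calling it repeatedly during the rollout shows the same sequence as the transcript: a pod reports `down` while restarting and then settles on `leader` or `follower`.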
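4 Solution
Scale gradually: grow ZooKeeper to two nodes first, then to three. Scaling to two nodes starts only one new node, which cannot settle on a leader by itself. Once the old node has restarted, its ZXID is larger (it holds data, so its transaction ID is higher), so the old node is elected leader and the old data is synchronized to the new node. When the third node is started, the second and then the first node are restarted in turn. Once the second node is back up, it is the leader and syncs the data to the third. After the first node is back up, the third node ends up as leader (the status output below was captured before that point, while the first node was still restarting).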
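Why the node holding the data wins can be illustrated with a simplified model of ZooKeeper's vote ordering. In a real election the vote with the higher epoch wins, then the higher ZXID, then the higher server id. The sketch below leaves out the epoch, `pick_leader` is a hypothetical name, and the ZXID values are made-up numbers rather than anything read from a cluster.

```shell
#!/usr/bin/env bash
# Simplified model of the vote ordering: highest ZXID wins and ties go to
# the highest server id. The epoch comparison is left out on purpose.
# Each argument is "<server id>:<zxid>"; the ZXID values are illustrative.
pick_leader() {
  printf '%s\n' "$@" | sort -t: -k2,2nr -k1,1nr | head -n 1 | cut -d: -f1
}

# 1 -> 3 in one step: servers 2 and 3 are both empty and vote without server 1.
echo "direct 1->3: server $(pick_leader 2:0 3:0)"
# 1 -> 2: server 1 holds data, server 2 is empty.
echo "step 1->2: server $(pick_leader 1:5 2:0)"
# 2 -> 3: server 2 (already synced) and the empty server 3 vote while server 1 restarts.
echo "step 2->3: server $(pick_leader 2:5 3:0)"
```

Under these assumptions the model picks servers 3, 1 and 2, matching the leaders seen in the transcripts (`zk-zookeeper-2`, `zk-zookeeper-0` and `zk-zookeeper-1`, which are `server.3`, `server.1` and `server.2`).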
$ helm install -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage"
$ kubectl -n zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: standalone # standalone (single-node) mode
$ kubectl -n zk-test exec zk-zookeeper-0 -it -- zkCli.sh
/opt/bitnami/java/bin/java
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null zxid: -1
[zk: localhost:2181(CONNECTING) 0] create
create [-s] [-e] [-c] [-t ttl] path [data] [acl]
[zk: localhost:2181(CONNECTED) 1] create /fured-test mydata
Created /fured-test
[zk: localhost:2181(CONNECTED) 2] get /fured-test
mydata
$ helm upgrade -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage" --set replicaCount=2
$ kubectl -n zk-test exec zk-zookeeper-1 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower # the newly started second node is a follower
$ kubectl -n zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader # the old node became leader
$ helm upgrade -n zk-test zk zookeeper-13.7.4.tgz --set persistence.storageClass="local-storage" --set replicaCount=3
$ kubectl -n zk-test exec zk-zookeeper-1 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader # the second node became leader
$ kubectl -n zk-test exec zk-zookeeper-2 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower # the third node is a follower
$ kubectl -n zk-test exec zk-zookeeper-0 -it -- zkServer.sh status
/opt/bitnami/java/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Error contacting service. It is probably not running.
command terminated with exit code 1 # when the second node became leader, the first node was still restarting
$ kubectl -n zk-test get pod
NAME             READY   STATUS    RESTARTS   AGE
zk-zookeeper-0   1/1     Running   0          77s
zk-zookeeper-1   1/1     Running   0          101s
zk-zookeeper-2   1/1     Running   0          2m12s
# The ages of the three pods show the startup order
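The two upgrades can be chained so that each step only starts after the previous rollout has finished. This is a sketch, not a tested procedure: `scale_to` and `scale_gradually` are hypothetical helpers, the StatefulSet name `zk-zookeeper` is assumed from the pod names above, and the 10 minute timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Names mirror the commands used in this post.
NAMESPACE="zk-test"
RELEASE="zk"
CHART="zookeeper-13.7.4.tgz"
STATEFULSET="zk-zookeeper"   # assumed from the pod names zk-zookeeper-N

# Upgrade the release to the requested replica count, then block until the
# StatefulSet rollout has finished.
scale_to() {
  local replicas="$1"
  helm upgrade -n "$NAMESPACE" "$RELEASE" "$CHART" \
    --set persistence.storageClass="local-storage" \
    --set replicaCount="$replicas" || return 1
  kubectl -n "$NAMESPACE" rollout status "statefulset/$STATEFULSET" --timeout=10m
}

# Grow the ensemble one member at a time, stopping at the first failure.
scale_gradually() {
  local current="$1" target="$2" n
  n=$(( current + 1 ))
  while [ "$n" -le "$target" ]; do
    scale_to "$n" || return 1
    n=$(( n + 1 ))
  done
}

# Usage: scale_gradually 1 3
```

A finished rollout only says the pods are ready, so it is still worth confirming with `zkServer.sh status` and a `get` on a known znode that the data-bearing node led each intermediate step.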