
Kubernetes Leader Election (HA)


In Kubernetes, kube-controller-manager, kube-scheduler, and controller-runtime (the library underlying Operators) all support leader election in high-availability setups. This article looks at how leader election in controller-runtime (whose underlying implementation is client-go) works and how it is used in Kubernetes controllers.

Background

When running kube-controller-manager, a number of flags are provided to cm for leader election; see the flag reference [1] in the official documentation for details.
--leader-elect                               Default: true
--leader-elect-renew-deadline duration       Default: 10s
--leader-elect-resource-lock string          Default: "leases"
--leader-elect-resource-name string          Default: "kube-controller-manager"
--leader-elect-resource-namespace string     Default: "kube-system"
--leader-elect-retry-period duration         Default: 2s
...
I had assumed these components ran their elections through etcd, but while studying controller-runtime I noticed it takes no etcd-related configuration at all, which made me curious about the election mechanism. Searching for information on Kubernetes elections turned up the official write-up below; what follows is a plain-language summary of it.
simple leader election with kubernetes [2]
Reading that article, we learn that the Kubernetes API offers an election mechanism: any container running inside the cluster can implement election. The Kubernetes API enables elections through two properties (see the sketch after this list):
  • ResourceVersion: every API object has a unique ResourceVersion
  • Annotations: every API object can be annotated with these keys
Note: this kind of election increases load on the APIServer, which in turn affects etcd.
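To make this concrete, here is a minimal sketch, assuming a hypothetical Endpoints object named my-component and a pre-built client-go clientset, of the compare-and-swap these two properties enable: write your identity into the annotation, and let ResourceVersion turn a concurrent writer into a 409 Conflict instead of a silent overwrite.
package election

import (
    "context"
    "fmt"

    apierrors "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// tryClaim reads the object, writes our leader record into its annotation,
// and updates it. The Update carries the ResourceVersion we read; if another
// candidate modified the object in between, the APIServer rejects the write.
func tryClaim(ctx context.Context, client kubernetes.Interface, record string) error {
    ep, err := client.CoreV1().Endpoints("kube-system").Get(ctx, "my-component", metav1.GetOptions{})
    if err != nil {
        return err
    }
    if ep.Annotations == nil {
        ep.Annotations = map[string]string{}
    }
    ep.Annotations["control-plane.alpha.kubernetes.io/leader"] = record
    if _, err = client.CoreV1().Endpoints("kube-system").Update(ctx, ep, metav1.UpdateOptions{}); err != nil {
        if apierrors.IsConflict(err) {
            return fmt.Errorf("lost this round, re-observe and retry: %w", err)
        }
        return err
    }
    return nil
}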
With that information in hand, let's look at who the leader of cm is in a Kubernetes cluster (the cluster used here has only one node, so that node is the leader).

In Kubernetes, every service with leader election enabled generates an EndPoint, and that EndPoint carries the annotation mentioned above (Annotations) identifying who the leader is.
$ kubectl get ep -n kube-system
NAME                      ENDPOINTS   AGE
kube-controller-manager   <none>      3d4h
kube-dns                              3d4h
kube-scheduler            <none>      3d4h
Taking kube-controller-manager as an example, let's see what this EndPoint contains:
$ kubectl describe ep kube-controller-manager -n kube-system
Name:         kube-controller-manager
Namespace:    kube-system
Labels:       <none>
Annotations:  control-plane.alpha.kubernetes.io/leader:
                {"holderIdentity":"master-machine_06730140-a503-487d-850b-1fe1619f1fe1","leaseDurationSeconds":15,"acquireTime":"2022-06-27T15:30:46Z","re...
Subsets:
Events:
  Type    Reason          Age    From                     Message
  ----    ------          ----   ----                     -------
  Normal  LeaderElection  2d22h  kube-controller-manager  master-machine_76aabcb5-49ff-45ff-bd18-4afa61fbc5af became leader
  Normal  LeaderElection  9m     kube-controller-manager  master-machine_06730140-a503-487d-850b-1fe1619f1fe1 became leader
As you can see, the control-plane.alpha.kubernetes.io/leader annotation marks which node is the leader.

election in controller-runtime

The leader-election part of controller-runtime sits under pkg/leaderelection [3]: about 100 lines of code in total. Let's see what it does.
As you can see, it only exposes a few options for creating the resource lock:
type Options struct {
    // determines whether the manager runs an election when it starts
    LeaderElection bool
    // which kind of resource lock to use; defaults to a lease
    LeaderElectionResourceLock string
    // the namespace in which the election takes place
    LeaderElectionNamespace string
    // determines the name of the resource holding the leader lock
    LeaderElectionID string
}
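For reference, here is a minimal sketch of turning these options on when building a controller-runtime Manager. The field names come from controller-runtime's manager.Options and may differ between versions; the lock name my-operator-lock and the namespace are made-up values:
package main

import (
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/log/zap"
)

func main() {
    ctrl.SetLogger(zap.New())
    // Enabling leader election when building a Manager; these fields map
    // onto the Options struct shown above.
    mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
        LeaderElection:             true,
        LeaderElectionResourceLock: "leases",
        LeaderElectionNamespace:    "default",
        LeaderElectionID:           "my-operator-lock", // name of the lock resource (hypothetical)
    })
    if err != nil {
        panic(err)
    }
    // Runnables added to the manager only start on the elected leader
    // (unless they opt out of leader election).
    if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
        panic(err)
    }
}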
From NewResourceLock we can see that this goes through client-go/tools/leaderelection [4], and that leaderelection package ships an example [5] we can use to learn how to use it.
The example shows that the entry point into an election is the RunOrDie() function:
// a Lease lock is used here; the upstream comment notes that edits to Leases
// are less common and fewer objects in the cluster watch them
lock := &resourcelock.LeaseLock{
    LeaseMeta: metav1.ObjectMeta{
        Name:      leaseLockName,
        Namespace: leaseLockNamespace,
    },
    Client: client.CoordinationV1(),
    LockConfig: resourcelock.ResourceLockConfig{
        Identity: id,
    },
}

// start the election loop
leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
    Lock: lock,
    // you must ensure that code guarded by the lease terminates before
    // cancel() is called, otherwise a loop keeps running
    ReleaseOnCancel: true,
    LeaseDuration:   60 * time.Second,
    RenewDeadline:   15 * time.Second,
    RetryPeriod:     5 * time.Second,
    Callbacks: leaderelection.LeaderCallbacks{
        OnStartedLeading: func(ctx context.Context) {
            // your code goes here;
            // usually put your code
            run(ctx)
        },
        OnStoppedLeading: func() {
            // clean up your lease here
            klog.Infof("leader lost: %s", id)
            os.Exit(0)
        },
        OnNewLeader: func(identity string) {
            // we're notified when new leader elected
            if identity == id {
                // I just got the lock
                return
            }
            klog.Infof("new leader elected: %s", identity)
        },
    },
})
At this point we understand the concept of a lock and how to start one; next, let's see which locks client-go provides.

The file tools/leaderelection/resourcelock/interface.go [6] defines a lock abstraction; Interface provides a generic interface for locking the resource used in leader election:
type Interface interface {
    // Get returns the election record
    Get(ctx context.Context) (*LeaderElectionRecord, []byte, error)

    // Create attempts to create a LeaderElectionRecord
    Create(ctx context.Context, ler LeaderElectionRecord) error

    // Update will update an existing LeaderElectionRecord
    Update(ctx context.Context, ler LeaderElectionRecord) error

    // RecordEvent is used to record events
    RecordEvent(string)

    // Identity returns the lock's identity
    Identity() string

    // Describe is used to convert details on current resource lock into a string
    Describe() string
}
The concrete resource locks are what implement this abstract interface. As you can see, client-go provides four of them (a construction sketch follows this list):
  • leaselock
  • configmaplock
  • multilock
  • endpointlock
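Rather than building a concrete lock struct by hand, client-go also exposes a factory that picks the implementation by type name. A minimal sketch (the lock name example-lock is made up; check the New signature against your client-go version):
package election

import (
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/leaderelection/resourcelock"
)

// newLock builds one of the lock implementations by its type name.
func newLock(client *kubernetes.Clientset, id string) (resourcelock.Interface, error) {
    return resourcelock.New(
        resourcelock.LeasesResourceLock, // "leases"; "endpoints", "configmaps" and multilocks work the same way
        "default",                       // namespace the lock object lives in
        "example-lock",                  // name of the lock object (hypothetical)
        client.CoreV1(),                 // used by the endpoints/configmaps locks
        client.CoordinationV1(),         // used by the lease lock
        resourcelock.ResourceLockConfig{Identity: id},
    )
}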

leaselock

Lease is a resource in the Kubernetes control plane, implemented on top of etcd, that mainly provides a control mechanism for distributed leases. For a description of this API, see: Lease [7].
In a Kubernetes cluster, we can inspect the corresponding leases with the following commands:
$ kubectl get leases -A
NAMESPACE         NAME                      HOLDER                                                AGE
kube-node-lease   master-machine            master-machine                                        3d19h
kube-system       kube-controller-manager   master-machine_06730140-a503-487d-850b-1fe1619f1fe1   3d19h
kube-system       kube-scheduler            master-machine_1724e2d9-c19c-48d7-ae47-ee4217b27073   3d19h

$ kubectl describe leases kube-controller-manager -n kube-system
Name:         kube-controller-manager
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>
API Version:  coordination.k8s.io/v1
Kind:         Lease
Metadata:
  Creation Timestamp:  2022-06-24T11:01:51Z
  Managed Fields:
    API Version:  coordination.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:acquireTime:
        f:holderIdentity:
        f:leaseDurationSeconds:
        f:leaseTransitions:
        f:renewTime:
    Manager:         kube-controller-manager
    Operation:       Update
    Time:            2022-06-24T11:01:51Z
  Resource Version:  56012
  Self Link:         /apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager
  UID:               851a32d2-25dc-49b6-a3f7-7a76f152f071
Spec:
  Acquire Time:            2022-06-27T15:30:46.000000Z
  Holder Identity:         master-machine_06730140-a503-487d-850b-1fe1619f1fe1
  Lease Duration Seconds:  15
  Lease Transitions:       2
  Renew Time:              2022-06-28T06:09:26.837773Z
Events:                    <none>
Now let's look at the leaselock implementation; LeaseLock implements the resource-lock abstraction:
type LeaseLock struct {
    // LeaseMeta is like the metadata of other resource types: name, namespace,
    // and the other attributes of the lease
    LeaseMeta metav1.ObjectMeta
    // Client provides access to the Leases API
    Client coordinationv1client.LeasesGetter
    // LockConfig holds the Identity seen via describe above, plus a recorder
    // used to record changes to the resource lock
    LockConfig ResourceLockConfig
    // lease is the Lease resource from the API; see the API reference above
    lease *coordinationv1.Lease
}
So which methods does leaselock implement?

Get

Get [8] returns the election record from the lease's spec:
func (ll *LeaseLock) Get(ctx context.Context) (*LeaderElectionRecord, []byte, error) {
    var err error
    ll.lease, err = ll.Client.Leases(ll.LeaseMeta.Namespace).Get(ctx, ll.LeaseMeta.Name, metav1.GetOptions{})
    if err != nil {
        return nil, nil, err
    }
    record := LeaseSpecToLeaderElectionRecord(&ll.lease.Spec)
    recordByte, err := json.Marshal(*record)
    if err != nil {
        return nil, nil, err
    }
    return record, recordByte, nil
}

// as you can see, it returns the values filled into the resource's spec
func LeaseSpecToLeaderElectionRecord(spec *coordinationv1.LeaseSpec) *LeaderElectionRecord {
    var r LeaderElectionRecord
    if spec.HolderIdentity != nil {
        r.HolderIdentity = *spec.HolderIdentity
    }
    if spec.LeaseDurationSeconds != nil {
        r.LeaseDurationSeconds = int(*spec.LeaseDurationSeconds)
    }
    if spec.LeaseTransitions != nil {
        r.LeaderTransitions = int(*spec.LeaseTransitions)
    }
    if spec.AcquireTime != nil {
        r.AcquireTime = metav1.Time{spec.AcquireTime.Time}
    }
    if spec.RenewTime != nil {
        r.RenewTime = metav1.Time{spec.RenewTime.Time}
    }
    return &r
}
Create

Create [9] tries to create a lease in the Kubernetes cluster. As you can see, Client is the REST client the API provides for the corresponding resource, and the result is that this Lease is created in the Kubernetes cluster:
func (ll *LeaseLock) Create(ctx context.Context, ler LeaderElectionRecord) error {
    var err error
    ll.lease, err = ll.Client.Leases(ll.LeaseMeta.Namespace).Create(ctx, &coordinationv1.Lease{
        ObjectMeta: metav1.ObjectMeta{
            Name:      ll.LeaseMeta.Name,
            Namespace: ll.LeaseMeta.Namespace,
        },
        Spec: LeaderElectionRecordToLeaseSpec(&ler),
    }, metav1.CreateOptions{})
    return err
}
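Create (and Update below) converts the record in the opposite direction with LeaderElectionRecordToLeaseSpec; roughly, per client-go's implementation, that helper looks like this:
func LeaderElectionRecordToLeaseSpec(ler *LeaderElectionRecord) coordinationv1.LeaseSpec {
    leaseDurationSeconds := int32(ler.LeaseDurationSeconds)
    leaseTransitions := int32(ler.LeaderTransitions)
    return coordinationv1.LeaseSpec{
        HolderIdentity:       &ler.HolderIdentity,
        LeaseDurationSeconds: &leaseDurationSeconds,
        AcquireTime:          &metav1.MicroTime{Time: ler.AcquireTime.Time},
        RenewTime:            &metav1.MicroTime{Time: ler.RenewTime.Time},
        LeaseTransitions:     &leaseTransitions,
    }
}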
Update

Update [10] updates the Lease's spec:
func (ll *LeaseLock) Update(ctx context.Context, ler LeaderElectionRecord) error {
    if ll.lease == nil {
        return errors.New("lease not initialized, call get or create first")
    }
    ll.lease.Spec = LeaderElectionRecordToLeaseSpec(&ler)

    lease, err := ll.Client.Leases(ll.LeaseMeta.Namespace).Update(ctx, ll.lease, metav1.UpdateOptions{})
    if err != nil {
        return err
    }

    ll.lease = lease
    return nil
}
RecordEvent

RecordEvent [11] records the events produced during the election. Recall the became leader event we saw when inspecting the ep in the cluster earlier; this is where that event is generated and attached to the object's metadata:
func (ll *LeaseLock) RecordEvent(s string) {
    if ll.LockConfig.EventRecorder == nil {
        return
    }
    events := fmt.Sprintf("%v %v", ll.LockConfig.Identity, s)
    subject := &coordinationv1.Lease{ObjectMeta: ll.lease.ObjectMeta}
    // Populate the type meta, so we don't have to get it from the schema
    subject.Kind = "Lease"
    subject.APIVersion = coordinationv1.SchemeGroupVersion.String()
    ll.LockConfig.EventRecorder.Eventf(subject, corev1.EventTypeNormal, "LeaderElection", events)
}
At this point we have a rough understanding of what a resource lock actually is; the other kinds of resource locks are implemented the same way, so they won't be covered further here. Next, let's look at the election process.

election workflow

The election code starts at leaderelection.go [12]; continuing from the example above, we'll analyze the whole election process. We saw earlier that the entry point into the election is the RunOrDie() [13] function, so let's pick up from there. RunOrDie is only a few lines: it starts an election client with the provided configuration, then blocks until the ctx is done or it stops holding the leader lease.
func RunOrDie(ctx context.Context, lec LeaderElectionConfig) {
    le, err := NewLeaderElector(lec)
    if err != nil {
        panic(err)
    }
    if lec.WatchDog != nil {
        lec.WatchDog.SetLeaderElection(le)
    }
    le.Run(ctx)
}
Next, what does NewLeaderElector [14] do? As you can see, LeaderElector is a struct and this function merely validates the configuration and creates it; the struct provides everything we need during the election (LeaderElector is the election client that RunOrDie creates).
func NewLeaderElector(lec LeaderElectionConfig) (*LeaderElector, error) {
    if lec.LeaseDuration <= lec.RenewDeadline {
        return nil, fmt.Errorf("leaseDuration must be greater than renewDeadline")
    }
    if lec.RenewDeadline <= time.Duration(JitterFactor*float64(lec.RetryPeriod)) {
        return nil, fmt.Errorf("renewDeadline must be greater than retryPeriod*JitterFactor")
    }
    if lec.LeaseDuration < 1 {
        return nil, fmt.Errorf("leaseDuration must be greater than zero")
    }
    if lec.RenewDeadline < 1 {
        return nil, fmt.Errorf("renewDeadline must be greater than zero")
    }
    if lec.RetryPeriod < 1 {
        return nil, fmt.Errorf("retryPeriod must be greater than zero")
    }
    if lec.Callbacks.OnStartedLeading == nil {
        return nil, fmt.Errorf("OnStartedLeading callback must not be nil")
    }
    if lec.Callbacks.OnStoppedLeading == nil {
        return nil, fmt.Errorf("OnStoppedLeading callback must not be nil")
    }

    if lec.Lock == nil {
        return nil, fmt.Errorf("Lock must not be nil.")
    }
    le := LeaderElector{
        config:  lec,
        clock:   clock.RealClock{},
        metrics: globalMetricsFactory.newLeaderMetrics(),
    }
    le.metrics.leaderOff(le.config.Name)
    return &le, nil
}
LeaderElector [15] is the election client that gets built:
type LeaderElector struct {
    config LeaderElectionConfig // the configuration, including timing parameters and health checks
    // record-related fields
    observedRecord    rl.LeaderElectionRecord
    observedRawRecord []byte
    observedTime      time.Time
    // used to implement OnNewLeader(), may lag slightly from the
    // value observedRecord.HolderIdentity if the transition has
    // not yet been reported.
    reportedLeader string
    // clock is wrapper around time to allow for less flaky testing
    clock clock.Clock
    // guards access to observedRecord
    observedRecordLock sync.Mutex
    metrics leaderMetricsAdapter
}
As you can see, the election logic Run executes is the three callbacks passed in when the client was initialized:
func (le *LeaderElector) Run(ctx context.Context) {
    defer runtime.HandleCrash()
    defer func() { // on exit, run the OnStoppedLeading callback
        le.config.Callbacks.OnStoppedLeading()
    }()

    if !le.acquire(ctx) {
        return
    }
    ctx, cancel := context.WithCancel(ctx)
    defer cancel()
    go le.config.Callbacks.OnStartedLeading(ctx) // once elected, run OnStartedLeading
    le.renew(ctx)
}
Run calls acquire, which loops on tryAcquireOrRenew until ctx delivers a stop signal:
func (le *LeaderElector) acquire(ctx context.Context) bool {
    ctx, cancel := context.WithCancel(ctx)
    defer cancel()
    succeeded := false
    desc := le.config.Lock.Describe()
    klog.Infof("attempting to acquire leader lease %v...", desc)
    // JitterUntil runs func() on a timer:
    // RetryPeriod is the interval between runs,
    // JitterFactor is a retry factor, like the factor in a delay queue (duration + maxFactor * duration),
    // sliding controls whether the func's own runtime counts toward the period,
    // and ctx.Done() propagates cancellation
    wait.JitterUntil(func() {
        succeeded = le.tryAcquireOrRenew(ctx)
        le.maybeReportTransition()
        if !succeeded {
            klog.V(4).Infof("failed to acquire lease %v", desc)
            return
        }
        le.config.Lock.RecordEvent("became leader")
        le.metrics.leaderOn(le.config.Name)
        klog.Infof("successfully acquired lease %v", desc)
        cancel()
    }, le.config.RetryPeriod, JitterFactor, true, ctx.Done())
    return succeeded
}
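If JitterUntil's parameters are unclear, this tiny standalone program (an illustration only; JitterFactor is the exported constant from the leaderelection package, 1.2) shows the semantics acquire relies on: f runs every period plus up to jitterFactor*period of random jitter, until the stop channel closes:
package main

import (
    "fmt"
    "time"

    "k8s.io/apimachinery/pkg/util/wait"
    "k8s.io/client-go/tools/leaderelection"
)

func main() {
    stop := make(chan struct{})
    go func() {
        time.Sleep(7 * time.Second) // stop the loop after a few ticks
        close(stop)
    }()
    // f runs immediately, then roughly every 2s * [1.0, 1+JitterFactor);
    // sliding=true means the period is measured after f returns.
    wait.JitterUntil(func() {
        fmt.Println("tick:", time.Now().Format(time.RFC3339))
    }, 2*time.Second, leaderelection.JitterFactor, true, stop)
}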
The actual election work happens in tryAcquireOrRenew, so let's look at it next. tryAcquireOrRenew tries to acquire a leader lease; if the lease is already held by us it renews it. It returns true if the lease is obtained, and false otherwise:
func (le *LeaderElector) tryAcquireOrRenew(ctx context.Context) bool {
    now := metav1.Now() // current time
    leaderElectionRecord := rl.LeaderElectionRecord{ // build an election record
        HolderIdentity:       le.config.Lock.Identity(),                  // the candidate's identity; for ep locks it is related to the hostname
        LeaseDurationSeconds: int(le.config.LeaseDuration / time.Second), // 15s by default
        RenewTime:            now,                                        // renewal time
        AcquireTime:          now,                                        // acquisition time
    }

    // 1. fetch or create a record via the API; if we can fetch it, a lease already exists, otherwise create a new one
    oldLeaderElectionRecord, oldLeaderElectionRawRecord, err := le.config.Lock.Get(ctx)
    if err != nil {
        if !errors.IsNotFound(err) {
            klog.Errorf("error retrieving resource lock %v: %v", le.config.Lock.Describe(), err)
            return false
        }
        // creating the lease means creating the corresponding resource; the lock is one of the
        // four types leaderelection provides, whichever was passed into RunOrDie
        if err = le.config.Lock.Create(ctx, leaderElectionRecord); err != nil {
            klog.Errorf("error initially creating leader election record: %v", err)
            return false
        }
        // at this point the lease has been obtained or created, so record its attributes (LeaderElectionRecord)
        le.setObservedRecord(&leaderElectionRecord)

        return true
    }

    // 2. take the record and check identity and time
    if !bytes.Equal(le.observedRawRecord, oldLeaderElectionRawRecord) {
        le.setObservedRecord(oldLeaderElectionRecord)

        le.observedRawRecord = oldLeaderElectionRawRecord
    }
    if len(oldLeaderElectionRecord.HolderIdentity) > 0 &&
        le.observedTime.Add(le.config.LeaseDuration).After(now.Time) &&
        !le.IsLeader() { // we are not the leader and, comparing HolderIdentity plus the times, the holder's lease has not expired; not yet time to campaign, so bail out
        klog.V(4).Infof("lock is held by %v and has not yet expired", oldLeaderElectionRecord.HolderIdentity)
        return false
    }

    // 3. we're going to try to update; leaderElectionRecord is set to its defaults here, so correct it before updating
    if le.IsLeader() { // we are already the leader, so keep the original times
        leaderElectionRecord.AcquireTime = oldLeaderElectionRecord.AcquireTime
        leaderElectionRecord.LeaderTransitions = oldLeaderElectionRecord.LeaderTransitions
    } else { // LeaderTransitions counts how many times the leader has changed;
        // the holder is changing here, so it is a transition:
        // increment the old value by 1
        leaderElectionRecord.LeaderTransitions = oldLeaderElectionRecord.LeaderTransitions + 1
    }
    // finally, update the lock resource in the APIServer, i.e. update the resource's attributes
    if err = le.config.Lock.Update(ctx, leaderElectionRecord); err != nil {
        klog.Errorf("Failed to update lock: %v", err)
        return false
    }
    // setObservedRecord replaces the record stored in this lock with the new record;
    // the operation is safe: a mutex guarantees only one goroutine enters the critical section
    le.setObservedRecord(&leaderElectionRecord)
    return true
}
That covers the full flow of running an election through Kubernetes; to recap, the steps of the leader election above were:

  • The instance that creates the lock resource first becomes the service's leader; the lock can be taken on a lease, an endpoint, or other resources.
  • The instance that is already leader keeps renewing its lease; the default lease duration is 15 seconds (leaseDuration), and the leader updates the renewal time (renewTime) as the lease comes due.
  • The other followers keep checking for the lock resource. If a leader already exists, they check renewTime: if the lease has gone unrenewed past the lease duration (leaseDuration), the leader is presumed to have a problem and a new election is needed, until some follower is promoted to leader.
  • To prevent the resource being fought over, the Kubernetes API uses ResourceVersion to avoid duplicate modification (if the version number does not match the one in the request, the object has already been modified, and the APIServer returns an error).

Building an HA application on the leader mechanism

Next, an example: a highly available application built on the election mechanism Kubernetes provides.

Implementation

If we only rely on the lock inside Kubernetes, the implementation takes just a handful of lines:
package main

import (
    "context"
    "flag"
    "fmt"
    "os"
    "os/signal"
    "syscall"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    clientset "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/clientcmd"
    "k8s.io/client-go/tools/leaderelection"
    "k8s.io/client-go/tools/leaderelection/resourcelock"
    "k8s.io/klog/v2"
)

func buildConfig(kubeconfig string) (*rest.Config, error) {
    if kubeconfig != "" {
        cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
        if err != nil {
            return nil, err
        }
        return cfg, nil
    }

    cfg, err := rest.InClusterConfig()
    if err != nil {
        return nil, err
    }
    return cfg, nil
}

func main() {
    klog.InitFlags(nil)

    var kubeconfig string
    var leaseLockName string
    var leaseLockNamespace string
    var id string
    // client initialization
    flag.StringVar(&kubeconfig, "kubeconfig", "", "absolute path to the kubeconfig file")
    flag.StringVar(&id, "id", "", "the holder identity name")
    flag.StringVar(&leaseLockName, "lease-lock-name", "", "the lease lock resource name")
    flag.StringVar(&leaseLockNamespace, "lease-lock-namespace", "", "the lease lock resource namespace")
    flag.Parse()

    if leaseLockName == "" {
        klog.Fatal("unable to get lease lock resource name (missing lease-lock-name flag).")
    }
    if leaseLockNamespace == "" {
        klog.Fatal("unable to get lease lock resource namespace (missing lease-lock-namespace flag).")
    }
    config, err := buildConfig(kubeconfig)
    if err != nil {
        klog.Fatal(err)
    }
    client := clientset.NewForConfigOrDie(config)

    run := func(ctx context.Context) {
        // the business logic; since this is just an experiment, we only print
        klog.Info("Controller loop...")

        for {
            fmt.Println("I am leader, I was working.")
            time.Sleep(time.Second * 5)
        }
    }

    // use a Go context so we can tell the leaderelection code when we
    // want to step down
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    // listen for system interrupts
    ch := make(chan os.Signal, 1)
    signal.Notify(ch, os.Interrupt, syscall.SIGTERM)
    go func() {
        <-ch
        klog.Info("Received termination, signaling shutdown")
        cancel()
    }()

    // create a resource lock
    lock := &resourcelock.LeaseLock{
        LeaseMeta: metav1.ObjectMeta{
            Name:      leaseLockName,
            Namespace: leaseLockNamespace,
        },
        Client: client.CoordinationV1(),
        LockConfig: resourcelock.ResourceLockConfig{
            Identity: id,
        },
    }

    // start an election loop
    leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
        Lock:            lock,
        ReleaseOnCancel: true,
        LeaseDuration:   60 * time.Second,
        RenewDeadline:   15 * time.Second,
        RetryPeriod:     5 * time.Second,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(ctx context.Context) {
                // the business logic to run once elected leader
                run(ctx)
            },
            OnStoppedLeading: func() {
                // we can do cleanup here
                klog.Infof("leader lost: %s", id)
                os.Exit(0)
            },
            OnNewLeader: func(identity string) { // invoked when a new leader is observed
                if identity == id {
                    return
                }
                klog.Infof("new leader elected: %s", identity)
            },
        },
    })
}
Note: this kind of lease lock can only run in in-cluster mode; for a program deployed as a plain binary, you can choose an endpoint-type resource lock instead.
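If you do want the endpoint variant, the swap is small. A sketch (field names per older client-go releases; newer releases have deprecated the endpoints and configmaps lock types in favor of leases, so verify against your version):
// drop-in replacement for the LeaseLock in the example above,
// reusing the flags and client from that program
lock := &resourcelock.EndpointsLock{
    EndpointsMeta: metav1.ObjectMeta{
        Name:      leaseLockName,
        Namespace: leaseLockNamespace,
    },
    Client: client.CoreV1(), // Endpoints live in the core API group
    LockConfig: resourcelock.ResourceLockConfig{
        Identity: id,
    },
}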

Building the image

An image is already built and pushed to dockerhub (cylonchau/leaderelection:v0.0.2); if you only want to study how this works, skip this step.
FROM golang:alpine AS builder
MAINTAINER cylon
WORKDIR /election
COPY . /election
ENV GOPROXY https://goproxy.cn,direct
RUN GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o elector main.go

FROM alpine AS runner
WORKDIR /go/elector
COPY --from=builder /election/elector .
VOLUME ["/election"]
ENTRYPOINT ["./elector"]

Preparing the resource manifests

By default, pods running in Kubernetes have no permission to request in-cluster resources: the default service account has no access to the coordination API. So we create a separate serviceaccount and bind the corresponding RBAC permissions to it; with this sa configured in the manifest, all the pods get permission on the coordination lock.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-leaderelection
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: leaderelection
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: leaderelection
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: leaderelection
subjects:
  - kind: ServiceAccount
    name: sa-leaderelection
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: leaderelection
  name: leaderelection
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: leaderelection
  template:
    metadata:
      labels:
        app: leaderelection
    spec:
      containers:
        - image: cylonchau/leaderelection:v0.0.2
          imagePullPolicy: IfNotPresent
          command: ["./elector"]
          args:
            - "-id=$(POD_NAME)"
            - "-lease-lock-name=test"
            - "-lease-lock-namespace=default"
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
          name: elector
      serviceAccountName: sa-leaderelection


Running in the cluster

After the manifests are applied and the pods start, you can see that a lease is created:

$ kubectl get lease
NAME   HOLDER                            AGE
test   leaderelection-5644c5f84f-frs5n   1s

$ kubectl describe lease
Name:         test
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  coordination.k8s.io/v1
Kind:         Lease
Metadata:
  Creation Timestamp:  2022-06-28T16:39:45Z
  Managed Fields:
    API Version:  coordination.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:acquireTime:
        f:holderIdentity:
        f:leaseDurationSeconds:
        f:leaseTransitions:
        f:renewTime:
    Manager:         elector
    Operation:       Update
    Time:            2022-06-28T16:39:45Z
  Resource Version:  131693
  Self Link:         /apis/coordination.k8s.io/v1/namespaces/default/leases/test
  UID:               bef2b164-a117-44bd-bad3-3e651c94c97b
Spec:
  Acquire Time:            2022-06-28T16:39:45.931873Z
  Holder Identity:         leaderelection-5644c5f84f-frs5n
  Lease Duration Seconds:  60
  Lease Transitions:       0
  Renew Time:              2022-06-28T16:39:55.963537Z
Events:                    <none>
Looking up the pod behind the holder's identity (the program sets the holder identity to the pod name) shows that it is indeed the working pod.
As the example above shows, this is a leader-election scheme built on a Kubernetes cluster. It is not the most complete solution, but it is a simple one: without deploying anything extra onto the cluster or doing much coding work, you can use the Kubernetes cluster to build a highly available application.
Source: https://xie.infoq.cn/article/30688412fea8408646f47b0ee