Introduction

Before upgrading to Kubernetes 1.26, you need to upgrade the VPC CNI plugin to version 1.12 or later. Earlier versions of the VPC CNI depend on the CRI v1alpha2 API, which was removed in Kubernetes v1.26, so the CNI crashes once the cluster is upgraded.

This article does not cover the steps for upgrading the EKS cluster itself.

What is the Amazon VPC CNI plugin?

The Amazon VPC CNI plugin for Kubernetes add-on is deployed on each Amazon EC2 node in an Amazon EKS cluster. The add-on creates elastic network interfaces and attaches them to the Amazon EC2 nodes. It also assigns a private IPv4 or IPv6 address from the VPC to each Pod and service.

Checking the add-on currently in use

Check the version of the add-on installed on the cluster.

$ kubectl describe daemonset aws-node -n kube-system | grep amazon-k8s-cni:
Image: 602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/amazon-k8s-cni:v1.7.5
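
The tag in this output (v1.7.5) is below the 1.12 minimum stated in the introduction. As a rough sketch, the comparison could be scripted as follows. The function names `cni_version_from_image` and `version_ge` are made up for this illustration, and `sort -V` is assumed to be available (GNU coreutils provides it).

```shell
#!/bin/sh
# Hypothetical helpers; they are not part of kubectl or the AWS CLI.

# Print the version tag of an image reference without the leading "v".
cni_version_from_image() {
    # Keep everything after the last ":" and drop an optional leading "v".
    printf '%s\n' "$1" | sed -e 's/.*://' -e 's/^v//'
}

# Succeed when version $1 is greater than or equal to version $2.
# Assumes "sort -V" (version sort) is supported by the local sort.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Image reference taken from the output above.
image="602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/amazon-k8s-cni:v1.7.5"
current=$(cni_version_from_image "$image")

if version_ge "$current" "1.12.0"; then
    echo "VPC CNI $current: meets the 1.12 minimum"
else
    echo "VPC CNI $current: upgrade before moving to Kubernetes 1.26"
fi
```

For the v1.7.5 image above, the script takes the second branch.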

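Next, confirm that the add-on is not installed on the cluster as an Amazon EKS type add-on. The ResourceNotFoundException below indicates that the add-on is self-managed.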

$ aws eks describe-addon --cluster-name **** --addon-name vpc-cni --query addon.addonVersion --output text
An error occurred (ResourceNotFoundException) when calling the DescribeAddon operation: No addon: vpc-cni found in cluster: ****
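
If this check has to be repeated across several clusters, the interpretation of the command output could be wrapped in a small helper. This is only a sketch: `addon_type_from_output` is a hypothetical name, and it assumes that a managed add-on reports a version string starting with `v`. Anything it does not recognize is reported as `unknown` instead of being guessed.

```shell
#!/bin/sh
# Hypothetical helper: classify the combined stdout/stderr of
# "aws eks describe-addon ... --query addon.addonVersion --output text".
addon_type_from_output() {
    case "$1" in
        # No EKS add-on record exists, so the plugin is self-managed.
        *ResourceNotFoundException*) echo "self-managed" ;;
        # Any other CLI error (credentials, wrong cluster name, ...) is not conclusive.
        *"An error occurred"*)       echo "unknown" ;;
        # Assumption: a managed add-on prints a version string that starts with "v".
        v[0-9]*)                     echo "eks-managed" ;;
        *)                           echo "unknown" ;;
    esac
}

# Error message taken from the output above.
sample='An error occurred (ResourceNotFoundException) when calling the DescribeAddon operation: No addon: vpc-cni found in cluster: ****'
addon_type_from_output "$sample"
```

Against a real cluster, the argument would be the captured output of the `aws eks describe-addon` command shown above, with stderr redirected to stdout (`2>&1`).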

Check the current state of the aws-node DaemonSet.

$ kubectl describe daemonset.apps/aws-node -n kube-system
Name:           aws-node
Selector:       k8s-app=aws-node
Node-Selector:  
Labels:         k8s-app=aws-node
Annotations:    deprecated.daemonset.template.generation: 3
Desired Number of Nodes Scheduled: 6
Current Number of Nodes Scheduled: 6
Number of Nodes Scheduled with Up-to-date Pods: 6
Number of Nodes Scheduled with Available Pods: 6
Number of Nodes Misscheduled: 0
Pods Status:  6 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels:           k8s-app=aws-node
Service Account:  aws-node
Init Containers:
aws-vpc-cni-init:
Image:      602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/amazon-k8s-cni-init:v1.7.5
Port:       
Host Port:  
Environment:
DISABLE_TCP_EARLY_DEMUX:  false
Mounts:
/host/opt/cni/bin from cni-bin-dir (rw)
Containers:
aws-node:
Image:      602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/amazon-k8s-cni:v1.7.5
Port:       61678/TCP
Host Port:  61678/TCP
Requests:
cpu:      10m
Liveness:   exec [/app/grpc-health-probe -addr=:50051] delay=60s timeout=1s period=10s #success=1 #failure=3
Readiness:  exec [/app/grpc-health-probe -addr=:50051] delay=1s timeout=1s period=10s #success=1 #failure=3
Environment:
ADDITIONAL_ENI_TAGS:                 {}
AWS_VPC_CNI_NODE_PORT_SUPPORT:       true
AWS_VPC_ENI_MTU:                     9001
AWS_VPC_K8S_CNI_CONFIGURE_RPFILTER:  false
AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG:  false
AWS_VPC_K8S_CNI_EXTERNALSNAT:        false
AWS_VPC_K8S_CNI_LOGLEVEL:            DEBUG
AWS_VPC_K8S_CNI_LOG_FILE:            /host/var/log/aws-routed-eni/ipamd.log
AWS_VPC_K8S_CNI_RANDOMIZESNAT:       prng
AWS_VPC_K8S_CNI_VETHPREFIX:          eni
AWS_VPC_K8S_PLUGIN_LOG_FILE:         /var/log/aws-routed-eni/plugin.log
AWS_VPC_K8S_PLUGIN_LOG_LEVEL:        DEBUG
DISABLE_INTROSPECTION:               false
DISABLE_METRICS:                     false
ENABLE_POD_ENI:                      false
MY_NODE_NAME:                         (v1:spec.nodeName)
WARM_ENI_TARGET:                     1
Mounts:
/host/etc/cni/net.d from cni-net-dir (rw)
/host/opt/cni/bin from cni-bin-dir (rw)
/host/var/log/aws-routed-eni from log-dir (rw)
/run/xtables.lock from xtables-lock (rw)
/var/run/aws-node from run-dir (rw)
/var/run/dockershim.sock from dockershim (rw)
Volumes:
cni-bin-dir:
Type:          HostPath (bare host directory volume)
Path:          /opt/cni/bin
HostPathType:
cni-net-dir:
Type:          HostPath (bare host directory volume)
Path:          /etc/cni/net.d
HostPathType:
dockershim:
Type:          HostPath (bare host directory volume)
Path:          /var/run/dockershim.sock
HostPathType:
xtables-lock:
Type:          HostPath (bare host directory volume)
Path:          /run/xtables.lock
HostPathType:
log-dir:
Type:          HostPath (bare host directory volume)
Path:          /var/log/aws-routed-eni
HostPathType:  DirectoryOrCreate
run-dir:
Type:               HostPath (bare host directory volume)
Path:               /var/run/aws-node
HostPathType:       DirectoryOrCreate
Priority Class Name:  system-node-critical
Events:                 

As a precaution, save the configuration of the currently installed add-on.

$ kubectl get daemonset aws-node -n kube-system -o yaml > aws-k8s-cni-old.yaml

Downloading the manifest for the update

$ curl -o aws-k8s-cni_v1.16.3.yaml https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/v1.16.3/config/master/aws-k8s-cni.yaml
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Dload  Upload   Total   Spent    Left  Speed
100 20218  100 20218    0     0  32319      0 --:--:-- --:--:-- --:--:-- 32348
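
The manifest in the GitHub repository references container images in us-west-2, whereas the cluster in this article pulled its images from ap-northeast-1 before the update. If the images should keep coming from the cluster's own Region, the Region string in the downloaded file can be replaced before applying it. The sketch below assumes that the same registry account serves the target Region (as the pre-update output suggests for ap-northeast-1). `set_manifest_region` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: point the image references in a downloaded manifest
# at another Region.
#   $1 = manifest file
#   $2 = target Region, for example ap-northeast-1
set_manifest_region() {
    # Keep the original as <file>.bak and replace every "us-west-2".
    sed -i.bak -e "s|us-west-2|$2|g" "$1"
}

# Small demonstration on a throwaway file instead of the real manifest.
tmp=$(mktemp)
printf 'image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.16.3\n' > "$tmp"
set_manifest_region "$tmp" "ap-northeast-1"
cat "$tmp"
rm -f "$tmp" "$tmp.bak"
```

For the file downloaded above, the call would be `set_manifest_region aws-k8s-cni_v1.16.3.yaml ap-northeast-1`.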

Updating the CNI to the latest version

$ kubectl apply -f aws-k8s-cni_v1.16.3.yaml
customresourcedefinition.apiextensions.k8s.io/eniconfigs.crd.k8s.amazonaws.com configured
customresourcedefinition.apiextensions.k8s.io/policyendpoints.networking.k8s.aws unchanged
serviceaccount/aws-node configured
configmap/amazon-vpc-cni created
clusterrole.rbac.authorization.k8s.io/aws-node configured
clusterrolebinding.rbac.authorization.k8s.io/aws-node configured
daemonset.apps/aws-node configured
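
`kubectl apply` returns as soon as the objects are updated, while the aws-node Pods are replaced node by node afterwards. One way to block until the replacement has finished is `kubectl rollout status`. In the sketch below, the wrapper name `wait_for_aws_node_rollout`, the `KUBECTL` override variable, and the 10-minute default timeout are all assumptions made for illustration.

```shell
#!/bin/sh
# Allow the kubectl binary to be overridden (for example for a dry run).
KUBECTL=${KUBECTL:-kubectl}

# Hypothetical wrapper: wait until the aws-node DaemonSet has finished rolling out.
#   $1 = optional timeout (defaults to 10m, an arbitrary value)
wait_for_aws_node_rollout() {
    "$KUBECTL" rollout status daemonset/aws-node -n kube-system --timeout="${1:-10m}"
}
```

Calling `wait_for_aws_node_rollout` right after the `kubectl apply` above returns a non-zero exit status if the rollout does not complete within the timeout.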

Checking the new add-on version

$ kubectl describe daemonset aws-node -n kube-system | grep amazon-k8s-cni:
Image:      602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.16.3

Checking the state of the aws-node DaemonSet after the update

$ kubectl describe daemonset.apps/aws-node -n kube-system
Name:           aws-node
Selector:       k8s-app=aws-node
Node-Selector:
Labels:         app.kubernetes.io/instance=aws-vpc-cni
app.kubernetes.io/name=aws-node
app.kubernetes.io/version=v1.16.3
k8s-app=aws-node
Annotations:    deprecated.daemonset.template.generation: 4
Desired Number of Nodes Scheduled: 6
Current Number of Nodes Scheduled: 6
Number of Nodes Scheduled with Up-to-date Pods: 6
Number of Nodes Scheduled with Available Pods: 6
Number of Nodes Misscheduled: 0
Pods Status:  6 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels:           app.kubernetes.io/instance=aws-vpc-cni
app.kubernetes.io/name=aws-node
k8s-app=aws-node
Service Account:  aws-node
Init Containers:
aws-vpc-cni-init:
Image:      602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni-init:v1.16.3
Port:
Host Port:
Requests:
cpu:  25m
Environment:
DISABLE_TCP_EARLY_DEMUX:  false
ENABLE_IPv6:              false
Mounts:
/host/opt/cni/bin from cni-bin-dir (rw)
Containers:
aws-node:
Image:      602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.16.3
Port:       61678/TCP
Host Port:  61678/TCP
Requests:
cpu:      25m
Liveness:   exec [/app/grpc-health-probe -addr=:50051 -connect-timeout=5s -rpc-timeout=5s] delay=60s timeout=10s period=10s #success=1 #failure=3
Readiness:  exec [/app/grpc-health-probe -addr=:50051 -connect-timeout=5s -rpc-timeout=5s] delay=1s timeout=10s period=10s #success=1 #failure=3
Environment:
ADDITIONAL_ENI_TAGS:                    {}
AWS_VPC_CNI_NODE_PORT_SUPPORT:          true
AWS_VPC_ENI_MTU:                        9001
AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG:     false
AWS_VPC_K8S_CNI_EXTERNALSNAT:           false
AWS_VPC_K8S_CNI_LOGLEVEL:               DEBUG
AWS_VPC_K8S_CNI_LOG_FILE:               /host/var/log/aws-routed-eni/ipamd.log
AWS_VPC_K8S_CNI_RANDOMIZESNAT:          prng
AWS_VPC_K8S_CNI_VETHPREFIX:             eni
AWS_VPC_K8S_PLUGIN_LOG_FILE:            /var/log/aws-routed-eni/plugin.log
AWS_VPC_K8S_PLUGIN_LOG_LEVEL:           DEBUG
DISABLE_INTROSPECTION:                  false
DISABLE_METRICS:                        false
DISABLE_NETWORK_RESOURCE_PROVISIONING:  false
ENABLE_IPv4:                            true
ENABLE_IPv6:                            false
ENABLE_POD_ENI:                         false
ENABLE_PREFIX_DELEGATION:               false
VPC_CNI_VERSION:                        v1.16.3
WARM_ENI_TARGET:                        1
WARM_PREFIX_TARGET:                     1
MY_NODE_NAME:                            (v1:spec.nodeName)
MY_POD_NAME:                             (v1:metadata.name)
Mounts:
/host/etc/cni/net.d from cni-net-dir (rw)
/host/opt/cni/bin from cni-bin-dir (rw)
/host/var/log/aws-routed-eni from log-dir (rw)
/run/xtables.lock from xtables-lock (rw)
/var/run/aws-node from run-dir (rw)
aws-eks-nodeagent:
Image:      602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-network-policy-agent:v1.0.8
Port:
Host Port:
Args:
--enable-ipv6=false
--enable-network-policy=false
--enable-cloudwatch-logs=false
--enable-policy-event-logs=false
--metrics-bind-addr=:8162
--health-probe-bind-addr=:8163
--conntrack-cache-cleanup-period=300
Requests:
cpu:  25m
Environment:
MY_NODE_NAME:   (v1:spec.nodeName)
Mounts:
/host/opt/cni/bin from cni-bin-dir (rw)
/sys/fs/bpf from bpf-pin-path (rw)
/var/log/aws-routed-eni from log-dir (rw)
/var/run/aws-node from run-dir (rw)
Volumes:
bpf-pin-path:
Type:          HostPath (bare host directory volume)
Path:          /sys/fs/bpf
HostPathType:
cni-bin-dir:
Type:          HostPath (bare host directory volume)
Path:          /opt/cni/bin
HostPathType:
cni-net-dir:
Type:          HostPath (bare host directory volume)
Path:          /etc/cni/net.d
HostPathType:
log-dir:
Type:          HostPath (bare host directory volume)
Path:          /var/log/aws-routed-eni
HostPathType:  DirectoryOrCreate
run-dir:
Type:          HostPath (bare host directory volume)
Path:          /var/run/aws-node
HostPathType:  DirectoryOrCreate
xtables-lock:
Type:               HostPath (bare host directory volume)
Path:               /run/xtables.lock
HostPathType:
Priority Class Name:  system-node-critical
Events:
Type    Reason            Age    From                  Message
----    ------            ----   ----                  -------
Normal  SuccessfulDelete  6m24s  daemonset-controller  Deleted pod: aws-node-8822m
Normal  SuccessfulCreate  6m13s  daemonset-controller  Created pod: aws-node-m2x45
Normal  SuccessfulDelete  5m11s  daemonset-controller  Deleted pod: aws-node-s7zk2
Normal  SuccessfulCreate  5m     daemonset-controller  Created pod: aws-node-fl5lq
Normal  SuccessfulDelete  4m24s  daemonset-controller  Deleted pod: aws-node-vzg4g
Normal  SuccessfulCreate  4m14s  daemonset-controller  Created pod: aws-node-9gfnp
Normal  SuccessfulDelete  3m26s  daemonset-controller  Deleted pod: aws-node-bqkmx
Normal  SuccessfulCreate  3m15s  daemonset-controller  Created pod: aws-node-k746c
Normal  SuccessfulDelete  2m19s  daemonset-controller  Deleted pod: aws-node-56k7c
Normal  SuccessfulCreate  2m8s   daemonset-controller  Created pod: aws-node-7jzr5
Normal  SuccessfulDelete  71s    daemonset-controller  Deleted pod: aws-node-b8zmw
Normal  SuccessfulCreate  60s    daemonset-controller  Created pod: aws-node-b4plz
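
Rather than reading the counters in this output by eye, the desired and available node counts can be compared automatically. The following is a minimal sketch. `daemonset_fully_available` is a hypothetical helper that reads `kubectl describe daemonset` output on stdin and relies on the two counter lines being worded as shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the number of nodes with available
# Pods equals the desired number of scheduled nodes.
daemonset_fully_available() {
    awk -F': *' '
        /^ *Desired Number of Nodes Scheduled:/             { desired = $2 }
        /^ *Number of Nodes Scheduled with Available Pods:/ { available = $2 }
        # Exit status 0 (success) only when both values were found and match.
        END { exit !(desired != "" && desired == available) }
    '
}

# Demonstration with the two counter lines from the output above.
if printf '%s\n' \
    'Desired Number of Nodes Scheduled: 6' \
    'Number of Nodes Scheduled with Available Pods: 6' | daemonset_fully_available
then
    echo "aws-node is available on every scheduled node"
fi
```

On a live cluster the input would come from `kubectl describe daemonset aws-node -n kube-system | daemonset_fully_available`.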