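Introduction
I lost a fair amount of time to an error I ran into while working with k8s, so I am recording it here as a memo to myself. (Note: this post turned out rather disorganized, so I will revise it if the issue recurs.)
The problem and a record of my struggles
On the master node
Note: everything below was done on the master node.
Running "sudo kubeadm init --pod-network-cidr=10.244.0.0/16" produced the following error: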
Please define which one do you wish to use by setting the 'criSocket' field in the kubeadm configuration file: unix:///var/run/containerd/containerd.sock, unix:///var/run/crio/crio.sock
I worked around this by specifying the CRI socket explicitly:
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///var/run/crio/crio.sock
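The error message itself points at the other option, setting the 'criSocket' field in a kubeadm configuration file. Below is a minimal sketch of that approach, assuming the kubeadm.k8s.io/v1beta3 config API (the one that matches kubeadm v1.30). The helper name write_kubeadm_config and the output file name are made up for illustration, and I have not run this variant on the cluster described here.

```shell
#!/bin/sh
# Write a minimal kubeadm config that pins the CRI socket and the Pod CIDR.
# (write_kubeadm_config is a hypothetical helper, not part of kubeadm.)
# $1: output path, $2: CRI socket URL, $3: Pod network CIDR
write_kubeadm_config() {
    cat > "$1" <<EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: $2
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: $3
EOF
}

# Example (not run here):
#   write_kubeadm_config kubeadm-config.yaml unix:///var/run/crio/crio.sock 10.244.0.0/16
#   sudo kubeadm init --config kubeadm-config.yaml
```

With a config file the Pod CIDR moves into networking.podSubnet, because kubeadm does not accept --config together with flags such as --pod-network-cidr.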
As a quick test I tried to list the nodes, but that failed.
$ kubectl get node
E0323 15:59:04.013961 58845 memcache.go:265] couldn't get current server API group list: Get "https://192.168.40.146:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
E0323 15:59:04.025201 58845 memcache.go:265] couldn't get current server API group list: Get "https://192.168.40.146:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
E0323 15:59:04.036053 58845 memcache.go:265] couldn't get current server API group list: Get "https://192.168.40.146:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
E0323 15:59:04.046707 58845 memcache.go:265] couldn't get current server API group list: Get "https://192.168.40.146:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
E0323 15:59:04.057280 58845 memcache.go:265] couldn't get current server API group list: Get "https://192.168.40.146:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
The error boils down to the following:
E0323 15:59:04.013961 58845 memcache.go:265] couldn't get current server API group list: Get "https://192.168.40.146:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
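One commonly cited cause of this x509 error (it is listed in the kubeadm troubleshooting guide) is a $HOME/.kube/config left over from an earlier kubeadm init, which still embeds the old cluster CA. I did not confirm that this was the cause here. The sketch below shows one way to check it, assuming the kubeadm default paths and a kubeconfig that stores the CA inline as certificate-authority-data. The function names are made up for illustration.

```shell
#!/bin/sh
# Print the CA certificate embedded in a kubeconfig (first cluster entry only).
# $1: path to a kubeconfig file
kubeconfig_ca() {
    awk '$1 == "certificate-authority-data:" { print $2; exit }' "$1" | base64 -d
}

# Return 0 if the kubeconfig's embedded CA is byte-identical to the given CA file.
# $1: kubeconfig, $2: CA certificate in PEM format
ca_matches() {
    kubeconfig_ca "$1" | cmp -s - "$2"
}

# Example (paths are the kubeadm defaults, assumed):
#   ca_matches "$HOME/.kube/config" /etc/kubernetes/pki/ca.crt \
#       || echo "kubeconfig CA differs from the cluster CA"
```

If the two differ, the fix suggested in the troubleshooting guide is to overwrite $HOME/.kube/config with a fresh copy of /etc/kubernetes/admin.conf.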
Adding "-v=10" produced a lot of output, and at the end of it I could confirm that the control plane was running.
$ kubectl get nodes -v=10
... [omitted] ...
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 100m v1.30.0
I wondered whether this was happening because there were no workers yet.
On the worker node
Running the join on the worker node did not succeed.
$ sudo kubeadm join 192.168.40.146:6443 --token ughrnr.kpuu6massdmexxxx --discovery-token-ca-cert-hash sha256:8f20cc7a6eb434ec78910239ca67bc65cae528658f91defeb9456d13b77xxxx --cri-socket=unix:///var/run/crio/crio.sock
[preflight] Running pre-flight checks
[WARNING SystemVerification]: missing optional cgroups: hugetlb
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 505.642686ms
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap
error execution phase kubelet-start: error uploading crisocket: Unauthorized
To see the stack trace of this error execute with --v=5 or higher
Looking at it in more detail (re-running with --v=5):
xxx@k8s-worker1:~ $ sudo kubeadm join 192.168.40.146:6443 --token ughrnr.kpuu6massdmexxxx --discovery-token-ca-cert-hash sha256:8f20cc7a6eb434ec78910239ca67bc65cae528658f91defeb9456d13b77xxxx --cri-socket=unix:///var/run/crio/crio.sock --v=5
... [omitted] ...
[preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
error execution phase preflight
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:260
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:446
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:232
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdJoin.func1
k8s.io/kubernetes/cmd/kubeadm/app/cmd/join.go:183
github.com/spf13/cobra.(*Command).execute
github.com/spf13/cobra@v1.7.0/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
github.com/spf13/cobra@v1.7.0/command.go:1068
github.com/spf13/cobra.(*Command).Execute
github.com/spf13/cobra@v1.7.0/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:52
main.main
k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
runtime/proc.go:271
runtime.goexit
runtime/asm_arm64.s:1222
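What is /etc/kubernetes/kubelet.conf?
- The kubeconfig file that the kubelet uses to connect and authenticate to the API server. The kubelet is the Kubernetes component that runs on every node and runs, monitors, and reports on Pods.
What is /etc/kubernetes/pki/ca.crt?
- The certificate (including the public key) of the certificate authority (CA) for the whole Kubernetes cluster
- Kubernetes components talk to each other over TLS, and this CA certificate is used to verify that the certificates they present are legitimate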
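This CA certificate is also where the --discovery-token-ca-cert-hash value in the join command comes from: it is the SHA-256 hash of the CA's public key. The sketch below wraps the openssl pipeline given in the kubeadm documentation in a function, so the hash can be recomputed on the master and compared with what the worker was given. The function name ca_cert_hash is made up for illustration, and the pipeline assumes an RSA CA key, which is the kubeadm default.

```shell
#!/bin/sh
# Print the SHA-256 hash (hex, no prefix) of the public key in a CA certificate.
# $1: path to the CA certificate, e.g. /etc/kubernetes/pki/ca.crt on the master
ca_cert_hash() {
    openssl x509 -pubkey -noout -in "$1" \
        | openssl rsa -pubin -outform der 2>/dev/null \
        | openssl dgst -sha256 -hex \
        | sed 's/^.* //'
}

# Example (not run here):
#   echo "sha256:$(ca_cert_hash /etc/kubernetes/pki/ca.crt)"
```

If the printed value does not match the hash passed to kubeadm join, the worker is using a join command from a different (for example, an earlier) kubeadm init.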
For the following three errors from the output above:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
- Delete the file /etc/kubernetes/kubelet.conf
- Kill the process using port 10250
- Delete the file /etc/kubernetes/pki/ca.crt
That is how I dealt with them for the time being.
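The manual cleanup above can be sketched as a small script. Two things here are my assumptions rather than what I actually ran: port 10250 is the kubelet's own default port, so the sketch stops the kubelet service instead of killing a process by PID, and the run and clean_stale_join_state helpers are made up for illustration. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Clear the three preflight blockers reported by kubeadm join.
clean_stale_join_state() {
    # Assumption: the kubelet itself holds port 10250, so stopping it frees the port.
    run systemctl stop kubelet
    run rm -f /etc/kubernetes/kubelet.conf
    run rm -f /etc/kubernetes/pki/ca.crt
}

# Example (as root):
#   DRY_RUN=1; clean_stale_join_state   # preview the commands only
```

In hindsight, kubeadm reset covers the same ground more thoroughly, since it also removes other state left behind by a failed join.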
I then ran the following and tried again.
xxx@k8s-worker1:~ $ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: overwrite '/home/appare99/.kube/config'? yes
However, this did not work either.
Resolution
While I was doing all this, for some reason the master became able to list its nodes...
xxx@k8s-master:~ $ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 3h56m v1.30.0
Why...
Still, the join had not succeeded yet.
Following the reference below finally made it work:

Kubernetes- error uploading crisocket: timed out waiting for the condition
Run the following as the root user on the worker node:
swapoff -a
kubeadm reset
systemctl daemon-reload
systemctl restart kubelet
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
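Two caveats about the steps above. kubeadm reset wipes the node's join state, so the kubeadm join command from earlier has to be run again afterwards. And swapoff -a only disables swap until the next reboot. The sketch below is one way to check whether swap is currently active by reading /proc/swaps; the function name swap_enabled is made up for illustration, and the optional path argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Return 0 if the swaps table lists at least one active swap area.
# $1: path to the swaps table (optional, defaults to /proc/swaps)
swap_enabled() {
    # /proc/swaps has one header line followed by one line per active swap area.
    awk 'NR > 1 && NF > 0 { n++ } END { exit (n > 0 ? 0 : 1) }' "${1:-/proc/swaps}"
}

# Example:
#   if swap_enabled; then echo "swap is still on, run swapoff -a"; fi
```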
Check from the master node:
xxx@k8s-master:~ $ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 4h1m v1.30.0
k8s-worker1 Ready <none> 7s v1.30.0
The cluster is up.
Closing
This got a bit messy, but I managed to solve it.
If the same error shows up again, I want to come back to this post and work through it.