Building a Raspberry Pi RKE2 Cluster with Rancher

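Using Rancher 2.8.2, which I set up on a Raspberry Pi in the previous post, I tried building an RKE2 cluster on Raspberry Pis.
The RKE1 build procedure is covered in a separate post.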

Node preparation

Install Ubuntu Server 22.04 on three Raspberry Pi 4 boards
using Raspberry Pi Imager.

root@k8s1:~# cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

For OS configuration, all I did was assign static IPs and append the following to /etc/hosts on every node.

192.168.0.51  k8s1
192.168.0.52  k8s2
192.168.0.53  k8s3
192.168.0.202 rancher.tsuchinokometal.com
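
The hosts entries above have to be added on every node, so it helps if the step can be re-run safely. The following is a minimal sketch, not the procedure used in this post: `add_host_entry` is a hypothetical helper name, and the demo writes to a scratch file created with `mktemp` instead of the real /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME  (hypothetical helper, not part of any tool)
# Appends "IP  NAME" to FILE unless a non-comment line already lists NAME.
add_host_entry() {
  file="$1"; ip="$2"; name="$3"
  # Look for NAME as a whole whitespace-delimited field after the address column.
  if awk -v n="$name" '
       $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
       END { exit !found }' "$file"; then
    return 0   # already present, nothing to do
  fi
  printf '%s  %s\n' "$ip" "$name" >> "$file"
}

# Demo against a scratch file; point it at /etc/hosts only after checking the result.
hosts_file="$(mktemp)"
while read -r ip name; do
  add_host_entry "$hosts_file" "$ip" "$name"
done <<'EOF'
192.168.0.51 k8s1
192.168.0.52 k8s2
192.168.0.53 k8s3
192.168.0.202 rancher.tsuchinokometal.com
EOF
cat "$hosts_file"
```

Running the loop a second time against the same file should leave it unchanged, which is the point of checking before appending.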

Cluster creation

Create the cluster from the Rancher GUI.
Confirm that RKE2/K3s is selected, then choose Custom.

rancher_rke2_downstream_cluster_on_raspberrypi_01.png

Enter only the cluster name and leave everything else at the defaults.

rancher_rke2_downstream_cluster_on_raspberrypi_02.png

Run the displayed command on the node that will become the control plane.

rancher_rke2_downstream_cluster_on_raspberrypi_03.png

root@k8s1:~# curl --insecure -fL https://rancher.tsuchinokometal.com/system-agent-install.sh | sudo  sh -s - --server https://rancher.tsuchinokometal.com --label 'cattle.io/os=linux' --token cwtfmsp9v9lw2fmvppjzx25pvrgh8z4hlmxg7gdbsz2pz86cqnfsg7 --ca-checksum 4f7975c79669bfb08b1b465dd3b7ead35d420be1b4533f5dd48f86f35565090d --etcd --controlplane
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 30879    0 30879    0     0   657k      0 --:--:-- --:--:-- --:--:--  670k
[INFO]  Label: cattle.io/os=linux
[INFO]  Role requested: etcd
[INFO]  Role requested: controlplane
[INFO]  Using default agent configuration directory /etc/rancher/agent
[INFO]  Using default agent var directory /var/lib/rancher/agent
[INFO]  Determined CA is necessary to connect to Rancher
[INFO]  Successfully downloaded CA certificate
[INFO]  Value from https://rancher.tsuchinokometal.com/cacerts is an x509 certificate
[INFO]  Successfully tested Rancher connection
[INFO]  Downloading rancher-system-agent binary from https://rancher.tsuchinokometal.com/assets/rancher-system-agent-arm64
[INFO]  Successfully downloaded the rancher-system-agent binary.
[INFO]  Downloading rancher-system-agent-uninstall.sh script from https://rancher.tsuchinokometal.com/assets/system-agent-uninstall.sh
[INFO]  Successfully downloaded the rancher-system-agent-uninstall.sh script.
[INFO]  Generating Cattle ID
[INFO]  Successfully downloaded Rancher connection information
[INFO]  systemd: Creating service file
[INFO]  Creating environment file /etc/systemd/system/rancher-system-agent.env
[INFO]  Enabling rancher-system-agent.service
Created symlink /etc/systemd/system/multi-user.target.wants/rancher-system-agent.service → /etc/systemd/system/rancher-system-agent.service.
[INFO]  Starting/restarting rancher-system-agent.service
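
The registration command carries a `--ca-checksum` value, and the install log shows the CA certificate being downloaded from `/cacerts`. The sketch below compares a downloaded file against an expected digest. It rests on an assumption that is not confirmed in this post: that the checksum is the SHA-256 of the PEM served at `/cacerts` (the 64 hex characters are at least consistent with that). `ca_checksum_matches` is a hypothetical helper name.

```shell
#!/bin/sh
# ca_checksum_matches FILE EXPECTED_SHA256  (hypothetical helper)
# Succeeds when the SHA-256 digest of FILE equals EXPECTED_SHA256.
ca_checksum_matches() {
  actual="$(sha256sum "$1" | awk '{print $1}')"
  [ "$actual" = "$2" ]
}

# Usage sketch (commented out because it needs network access to Rancher):
#   curl --insecure -fsL https://rancher.tsuchinokometal.com/cacerts -o /tmp/cacerts.pem
#   ca_checksum_matches /tmp/cacerts.pem \
#     4f7975c79669bfb08b1b465dd3b7ead35d420be1b4533f5dd48f86f35565090d \
#     && echo "CA checksum matches" || echo "CA checksum differs"
```

If the digests differ, it is worth re-copying the command from the Rancher GUI before suspecting anything else.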

The Provisioning Log also shows the following message, so next I ran the worker command on another node.

waiting for at least one control plane, etcd, and worker node to be registered
rancher_rke2_downstream_cluster_on_raspberrypi_04.png
root@k8s2:~# curl --insecure -fL https://rancher.tsuchinokometal.com/system-agent-install.sh | sudo  sh -s - --server https://rancher.tsuchinokometal.com --label 'cattle.io/os=linux' --token cwtfmsp9v9lw2fmvppjzx25pvrgh8z4hlmxg7gdbsz2pz86cqnfsg7 --ca-checksum 4f7975c79669bfb08b1b465dd3b7ead35d420be1b4533f5dd48f86f35565090d --worker
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 30879    0 30879    0     0   844k      0 --:--:-- --:--:-- --:--:--  861k
[INFO]  Label: cattle.io/os=linux
[INFO]  Role requested: worker
[INFO]  Using default agent configuration directory /etc/rancher/agent
[INFO]  Using default agent var directory /var/lib/rancher/agent
[INFO]  Determined CA is necessary to connect to Rancher
[INFO]  Successfully downloaded CA certificate
[INFO]  Value from https://rancher.tsuchinokometal.com/cacerts is an x509 certificate
[INFO]  Successfully tested Rancher connection
[INFO]  Downloading rancher-system-agent binary from https://rancher.tsuchinokometal.com/assets/rancher-system-agent-arm64
[INFO]  Successfully downloaded the rancher-system-agent binary.
[INFO]  Downloading rancher-system-agent-uninstall.sh script from https://rancher.tsuchinokometal.com/assets/system-agent-uninstall.sh
[INFO]  Successfully downloaded the rancher-system-agent-uninstall.sh script.
[INFO]  Generating Cattle ID
[INFO]  Successfully downloaded Rancher connection information
[INFO]  systemd: Creating service file
[INFO]  Creating environment file /etc/systemd/system/rancher-system-agent.env
[INFO]  Enabling rancher-system-agent.service
Created symlink /etc/systemd/system/multi-user.target.wants/rancher-system-agent.service → /etc/systemd/system/rancher-system-agent.service.
[INFO]  Starting/restarting rancher-system-agent.service

After waiting a while, the CLI tools should have been downloaded,
so run kubectl.
Usage is described in the documentation.

root@k8s1:~# export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
root@k8s1:~# /var/lib/rancher/rke2/bin/kubectl get nodes
NAME   STATUS     ROLES                       AGE     VERSION
k8s1   NotReady   control-plane,etcd,master   2m49s   v1.27.10+rke2r1
root@k8s1:~# /var/lib/rancher/rke2/bin/kubectl get pod -A
NAMESPACE         NAME                                                    READY   STATUS              RESTARTS      AGE
cattle-system     cattle-cluster-agent-8559cd977b-8wv5p                   0/1     Pending             0             94s
kube-system       cloud-controller-manager-k8s1                           1/1     Running             1 (52s ago)   109s
kube-system       etcd-k8s1                                               1/1     Running             0             2m26s
kube-system       helm-install-rke2-calico-crd-fqzbn                      0/1     Completed           0             95s
kube-system       helm-install-rke2-calico-tdm7d                          0/1     Completed           2             95s
kube-system       helm-install-rke2-coredns-ccsgd                         0/1     Completed           0             95s
kube-system       helm-install-rke2-ingress-nginx-n7txp                   0/1     Pending             0             95s
kube-system       helm-install-rke2-metrics-server-zgnrh                  0/1     Pending             0             95s
kube-system       helm-install-rke2-snapshot-controller-crd-r97wd         0/1     Pending             0             95s
kube-system       helm-install-rke2-snapshot-controller-xcstl             0/1     Pending             0             95s
kube-system       helm-install-rke2-snapshot-validation-webhook-7gg7m     0/1     Pending             0             95s
kube-system       kube-apiserver-k8s1                                     1/1     Running             0             2m13s
kube-system       kube-controller-manager-k8s1                            1/1     Running             2 (52s ago)   2m26s
kube-system       kube-proxy-k8s1                                         1/1     Running             0             2m7s
kube-system       kube-scheduler-k8s1                                     1/1     Running             2 (42s ago)   2m26s
kube-system       rke2-coredns-rke2-coredns-84f49dccc9-g5r75              0/1     Pending             0             7s
kube-system       rke2-coredns-rke2-coredns-autoscaler-5b5b56997b-lkvn9   0/1     Pending             0             7s
tigera-operator   tigera-operator-5b8fcdd5f6-d6j6t                        0/1     ContainerCreating   0             7s

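As in the previous post, name resolution failed and cattle-cluster-agent
could not start, so I changed the CoreDNS configuration. The ConfigMap below includes a hosts entry for the Rancher hostname.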

root@k8s1:~# /var/lib/rancher/rke2/bin/kubectl get configmap -n kube-system rke2-coredns-rke2-coredns -o yaml
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
          lameduck 5s
        }
        hosts {
          192.168.0.202 rancher.tsuchinokometal.com
          fallthrough
        }
        ready
        kubernetes cluster.local cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
          ttl 30
        }
        prometheus 0.0.0.0:9153
        forward . 8.8.8.8 8.8.4.4
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: rke2-coredns
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-02-19T10:44:13Z"
  labels:
    app.kubernetes.io/instance: rke2-coredns
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: rke2-coredns
    helm.sh/chart: rke2-coredns-1.24.008
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: CoreDNS
  name: rke2-coredns-rke2-coredns
  namespace: kube-system
  resourceVersion: "9467"
  uid: 4139ae57-c82a-4136-8a78-815c2003781b
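
The ConfigMap above contains a `hosts` block that maps the Rancher hostname. Assuming that block is the change that was needed, the sketch below shows one way to generate it from an existing Corefile. `add_hosts_block` is a hypothetical helper that reads a Corefile on stdin and inserts the block just before the `ready` line, matching the layout above. Manual edits to this ConfigMap may be reverted when RKE2 redeploys the chart, so treat this as a quick experiment rather than a permanent fix.

```shell
#!/bin/sh
# add_hosts_block IP NAME  (hypothetical helper)
# Reads a Corefile on stdin and prints it with a hosts block inserted
# before the first "ready" line, using that line's indentation.
add_hosts_block() {
  awk -v ip="$1" -v name="$2" '
    /^[ \t]*ready[ \t]*$/ && !inserted {
      match($0, /^[ \t]*/)
      indent = substr($0, 1, RLENGTH)
      print indent "hosts {"
      print indent "  " ip " " name
      print indent "  fallthrough"
      print indent "}"
      inserted = 1
    }
    { print }
  '
}

# Usage sketch (commented out because it needs a running cluster):
#   kubectl get configmap -n kube-system rke2-coredns-rke2-coredns \
#     -o jsonpath='{.data.Corefile}' \
#     | add_hosts_block 192.168.0.202 rancher.tsuchinokometal.com
```

The printed Corefile can then be pasted in with `kubectl edit configmap -n kube-system rke2-coredns-rke2-coredns`.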

The cluster looks like it has been built, but something is wrong with calico.

root@k8s1:~# /var/lib/rancher/rke2/bin/kubectl get pod -A
NAMESPACE         NAME                                                    READY   STATUS      RESTARTS      AGE
calico-system     calico-kube-controllers-66cd4f87fb-lr427                1/1     Running     0             52m
calico-system     calico-node-5cqm7                                       0/1     Running     0             12m
calico-system     calico-node-x6j7z                                       1/1     Running     0             52m
calico-system     calico-typha-5569bc5989-vxdbp                           1/1     Running     0             52m
cattle-system     cattle-cluster-agent-666bfc877c-5jvxx                   1/1     Running     0             13m
cattle-system     cattle-cluster-agent-666bfc877c-7kpsp                   1/1     Running     1 (11m ago)   12m
cattle-system     helm-operation-7qv26                                    0/2     Completed   0             11m
cattle-system     helm-operation-7xxnd                                    1/2     Error       0             10m
cattle-system     rancher-webhook-7dc6679459-vm6zf                        1/1     Running     0             7m51s
kube-system       cloud-controller-manager-k8s1                           1/1     Running     5 (46m ago)   54m
kube-system       etcd-k8s1                                               1/1     Running     0             54m
kube-system       helm-install-rke2-calico-crd-fqzbn                      0/1     Completed   0             54m
kube-system       helm-install-rke2-calico-tdm7d                          0/1     Completed   2             54m
kube-system       helm-install-rke2-coredns-ccsgd                         0/1     Completed   0             54m
kube-system       helm-install-rke2-ingress-nginx-n7txp                   0/1     Completed   0             54m
kube-system       helm-install-rke2-metrics-server-zgnrh                  0/1     Completed   0             54m
kube-system       helm-install-rke2-snapshot-controller-crd-r97wd         0/1     Completed   0             54m
kube-system       helm-install-rke2-snapshot-controller-xcstl             0/1     Completed   1             54m
kube-system       helm-install-rke2-snapshot-validation-webhook-7gg7m     0/1     Completed   0             54m
kube-system       kube-apiserver-k8s1                                     1/1     Running     0             54m
kube-system       kube-controller-manager-k8s1                            1/1     Running     5 (47m ago)   54m
kube-system       kube-proxy-k8s1                                         1/1     Running     0             54m
kube-system       kube-proxy-k8s2                                         1/1     Running     0             11m
kube-system       kube-scheduler-k8s1                                     1/1     Running     5 (47m ago)   54m
kube-system       rke2-coredns-rke2-coredns-86477864d6-glx25              1/1     Running     0             15m
kube-system       rke2-coredns-rke2-coredns-86477864d6-jtj2s              1/1     Running     0             12m
kube-system       rke2-coredns-rke2-coredns-autoscaler-5b5b56997b-lkvn9   1/1     Running     0             52m
kube-system       rke2-ingress-nginx-controller-rpkzt                     1/1     Running     0             7m13s
kube-system       rke2-metrics-server-5c9768ff67-gqsbj                    1/1     Running     0             8m19s
kube-system       rke2-snapshot-controller-7d6476d7cb-ldcng               1/1     Running     0             7m44s
kube-system       rke2-snapshot-validation-webhook-5649fbd66c-lcchb       1/1     Running     0             8m18s
tigera-operator   tigera-operator-5b8fcdd5f6-d6j6t                        1/1     Running     3 (47m ago)   52m
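
In a listing this long, the pods that need attention are easy to miss. As a rough sketch, the hypothetical filter `unhealthy_pods` below reads `kubectl get pod -A` output on stdin and prints only the pods that are neither Completed nor Running with all containers ready. It assumes the default column layout shown above (NAMESPACE, NAME, READY, STATUS).

```shell
#!/bin/sh
# unhealthy_pods  (hypothetical helper)
# Reads `kubectl get pod -A` output on stdin, skips the header line, and
# prints "namespace/name READY STATUS" for pods that look unhealthy.
unhealthy_pods() {
  awk 'NR > 1 {
    split($3, ready, "/")                      # READY column, e.g. "0/1"
    if ($4 == "Completed") next                # finished jobs are fine
    if ($4 == "Running" && ready[1] == ready[2]) next
    print $1 "/" $2 " " $3 " " $4
  }'
}

# Usage sketch (commented out because it needs a running cluster):
#   /var/lib/rancher/rke2/bin/kubectl get pod -A | unhealthy_pods
```

Against the listing above, this would single out `calico-node-5cqm7` (0/1 Running) and `helm-operation-7xxnd` (1/2 Error).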

The Provisioning Log looks suspicious as well.

[INFO ] configuring bootstrap node(s) custom-1394a944be9c: waiting for probes: calico
[INFO ] provisioning done
[INFO ] configuring worker node(s) custom-d29c0624d485: waiting for probes: calico
[INFO ] provisioning done
[INFO ] configuring bootstrap node(s) custom-1394a944be9c: waiting for probes: calico
[INFO ] provisioning done
[INFO ] configuring bootstrap node(s) custom-1394a944be9c: waiting for probes: calico
[INFO ] custom-1394a944be9c
[INFO ] provisioning done
[INFO ] configuring bootstrap node(s) custom-1394a944be9c: waiting for probes: calico
[INFO ] provisioning done
[INFO ] configuring bootstrap node(s) custom-1394a944be9c: waiting for probes: calico
[INFO ] provisioning done
[INFO ] configuring worker node(s) custom-d29c0624d485: waiting for probes: calico
[INFO ] provisioning done
[INFO ] configuring worker node(s) custom-d29c0624d485: waiting for probes: calico
[INFO ] custom-d29c0624d485
[INFO ] provisioning done

Fixing calico

Looking at the calico-node logs, it appears to be looping on the following error.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x178fe00]

goroutine 112 [running]:
github.com/vishvananda/netlink.(*Handle).newNetlinkRequest(...)
	/go/pkg/mod/github.com/vishvananda/netlink@v1.2.1-beta.2.0.20230206183746-70ca0345eede/handle_linux.go:172
github.com/vishvananda/netlink.(*Handle).LinkList(0x4000174fc0?)
	/go/pkg/mod/github.com/vishvananda/netlink@v1.2.1-beta.2.0.20230206183746-70ca0345eede/link_linux.go:2099 +0x20
github.com/projectcalico/calico/felix/dataplane/linux.(*vxlanManager).getParentInterface(0x40004fdc00, 0x40003a2e00)
	/go/src/github.com/projectcalico/calico/felix/dataplane/linux/vxlan_mgr.go:581 +0x108
github.com/projectcalico/calico/felix/dataplane/linux.(*vxlanManager).getLocalVTEPParent(0x3b9aca00?)
	/go/src/github.com/projectcalico/calico/felix/dataplane/linux/vxlan_mgr.go:359 +0x2c
github.com/projectcalico/calico/felix/dataplane/linux.(*vxlanManager).KeepVXLANDeviceInSync(0x40004fdc00, 0x5aa?, 0x1, 0x2540be400?)
	/go/src/github.com/projectcalico/calico/felix/dataplane/linux/vxlan_mgr.go:544 +0x1f4
created by github.com/projectcalico/calico/felix/dataplane/linux.NewIntDataplaneDriver in goroutine 1
	/go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:509 +0x1200

The panic is in Calico's VXLAN code path, and after some digging, installing linux-modules-extra-raspi looks like the way to go.
https://github.com/canonical/microk8s/issues/2680#issuecomment-963581204
https://qiita.com/showchan33/items/5250f518eb03858a0c25
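
Before installing the package, it can be worth checking whether the kernel already offers a vxlan module, since the stack trace above points at Calico's VXLAN manager. This is a sketch under the assumption that a missing vxlan module is what the linked issues describe. `vxlan_module_status` is a hypothetical helper, and it reports "missing" when `modinfo` itself is not installed.

```shell
#!/bin/sh
# vxlan_module_status  (hypothetical helper)
# Prints "available" if modinfo can find the vxlan module for the running
# kernel, otherwise "missing" (also when modinfo is not installed).
vxlan_module_status() {
  if command -v modinfo >/dev/null 2>&1 && modinfo vxlan >/dev/null 2>&1; then
    echo available
  else
    echo missing
  fi
}

vxlan_module_status
```

Running it again after the install and reboot gives a quick confirmation that the package changed something.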

After installing it with the following commands and rebooting, the error went away.

root@k8s1:~# apt update && apt install -y linux-modules-extra-raspi
root@k8s1:~# reboot

Looks good.

root@k8s1:~# kubectl get pod -A
NAMESPACE         NAME                                                    READY   STATUS      RESTARTS      AGE
calico-system     calico-kube-controllers-66cd4f87fb-lr427                1/1     Running     1 (35m ago)   3h16m
calico-system     calico-node-5cqm7                                       1/1     Running     1 (53m ago)   156m
calico-system     calico-node-6brhg                                       1/1     Running     1 (63m ago)   84m
calico-system     calico-node-x6j7z                                       1/1     Running     9 (35m ago)   3h16m
calico-system     calico-typha-5569bc5989-2g5mh                           1/1     Running     2 (53m ago)   84m
calico-system     calico-typha-5569bc5989-vxdbp                           1/1     Running     1 (35m ago)   3h16m
cattle-system     cattle-cluster-agent-8559cd977b-5wpzt                   1/1     Running     0             34m
cattle-system     cattle-cluster-agent-8559cd977b-wnzfx                   1/1     Running     0             30m
cattle-system     rancher-webhook-7dc6679459-vm6zf                        1/1     Running     1 (53m ago)   152m
default           nginx-77b4fdf86c-kqtql                                  1/1     Running     1 (63m ago)   77m
kube-system       cloud-controller-manager-k8s1                           1/1     Running     9 (35m ago)   3h18m
kube-system       etcd-k8s1                                               1/1     Running     2             3h19m
kube-system       helm-install-rke2-calico-crd-fqzbn                      0/1     Completed   0             3h18m
kube-system       helm-install-rke2-calico-tdm7d                          0/1     Completed   2             3h18m
kube-system       helm-install-rke2-coredns-ccsgd                         0/1     Completed   0             3h18m
kube-system       helm-install-rke2-ingress-nginx-n7txp                   0/1     Completed   0             3h18m
kube-system       helm-install-rke2-metrics-server-zgnrh                  0/1     Completed   0             3h18m
kube-system       helm-install-rke2-snapshot-controller-crd-r97wd         0/1     Completed   0             3h18m
kube-system       helm-install-rke2-snapshot-controller-xcstl             0/1     Completed   1             3h18m
kube-system       helm-install-rke2-snapshot-validation-webhook-7gg7m     0/1     Completed   0             3h18m
kube-system       kube-apiserver-k8s1                                     1/1     Running     2             3h19m
kube-system       kube-controller-manager-k8s1                            1/1     Running     9 (35m ago)   3h19m
kube-system       kube-proxy-k8s1                                         1/1     Running     1 (35m ago)   3h19m
kube-system       kube-proxy-k8s2                                         1/1     Running     1 (53m ago)   155m
kube-system       kube-proxy-k8s3                                         1/1     Running     1 (63m ago)   84m
kube-system       kube-scheduler-k8s1                                     1/1     Running     7 (35m ago)   3h19m
kube-system       rke2-coredns-rke2-coredns-86477864d6-glx25              1/1     Running     1 (35m ago)   160m
kube-system       rke2-coredns-rke2-coredns-86477864d6-jtj2s              1/1     Running     1 (53m ago)   156m
kube-system       rke2-coredns-rke2-coredns-autoscaler-5b5b56997b-lkvn9   1/1     Running     1 (35m ago)   3h17m
kube-system       rke2-ingress-nginx-controller-rpkzt                     1/1     Running     1 (53m ago)   151m
kube-system       rke2-ingress-nginx-controller-x5kgm                     1/1     Running     1 (63m ago)   82m
kube-system       rke2-metrics-server-5c9768ff67-gqsbj                    1/1     Running     1 (53m ago)   153m
kube-system       rke2-snapshot-controller-7d6476d7cb-ldcng               1/1     Running     7 (35m ago)   152m
kube-system       rke2-snapshot-validation-webhook-5649fbd66c-lcchb       1/1     Running     2 (52m ago)   153m
tigera-operator   tigera-operator-5b8fcdd5f6-d6j6t                        1/1     Running     5 (35m ago)   3h17m

It took some effort, but the cluster seems to be up.

rancher_rke2_downstream_cluster_on_raspberrypi_05.png