Installing Highly Available K8s + KubeSphere on ARM64, with Troubleshooting
2023-03-18
Member Cluster Installation
Host Plan
- kubesphere: 4c4g, Ubuntu 22.04
- master-1: 4c4g, Ubuntu 22.04
- worker-1: 4c8g, Ubuntu 22.04

vim /etc/hosts

```
10.211.55.51 master-1
10.211.55.41 worker-1
10.211.55.50 kubesphere
```
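The /etc/hosts entries above can also be appended with a small script instead of hand-editing each node. A sketch based on the plan above (the `add_planned_hosts` helper name is mine, not from any tool):

```shell
# Append the planned host entries to a hosts file, skipping names
# that are already present so the script can be re-run safely.
add_planned_hosts() {
  hosts_file="$1"
  printf '%s\n' \
    "10.211.55.51 master-1" \
    "10.211.55.41 worker-1" \
    "10.211.55.50 kubesphere" |
  while read -r entry; do
    name="${entry##* }"                      # hostname is the last field
    grep -qw "$name" "$hosts_file" \
      || printf '%s\n' "$entry" >> "$hosts_file"
  done
}

# On each node you would run:
#   add_planned_hosts /etc/hosts
```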
Template Image Info
I already built a base image locally with Parallels Desktop, so cloning it only requires changing the network settings. NIC config file: /etc/netplan/enp0s5-config.yaml

```yaml
network:
  renderer: networkd
  ethernets:
    enp0s5:
      dhcp4: no
      optional: true
      addresses: [10.211.55.10/24]
      nameservers:
        addresses: [223.5.5.5, 223.6.6.6]
      routes:
        - to: default
          via: 10.211.55.1
  version: 2
```
Initialize the Hosts
```shell
# master-1
hostnamectl set-hostname master-1 \
  && sed -i "s/10.211.55.10/10.211.55.51/g" /etc/netplan/*.yaml \
  && netplan apply

# worker-1
hostnamectl set-hostname worker-1 \
  && sed -i "s/10.211.55.10/10.211.55.41/g" /etc/netplan/*.yaml \
  && netplan apply

# kubesphere
hostnamectl set-hostname kubesphere \
  && sed -i "s/10.211.55.10/10.211.55.50/g" /etc/netplan/*.yaml \
  && netplan apply
```
Install the ARM Build of sealos
```shell
wget https://oss.forwl.com/sealos_4.1.4_linux_arm64.tar.gz \
  && tar zxvf sealos_4.1.4_linux_arm64.tar.gz sealos \
  && chmod +x sealos \
  && mv sealos /usr/bin
```
Write the Clusterfile
This time I chose containerd; the images used are as follows:
```yaml
apiVersion: apps.sealos.io/v1beta1
kind: Cluster
metadata:
  name: k8s-dev
spec:
  hosts:
    - ips:
        - 10.211.55.51:22
      roles:
        - master
        - arm64
    - ips:
        - 10.211.55.41:22
      roles:
        - node
        - arm64
  image:
    - labring/kubernetes:v1.24.0
    - labring/helm:v3.8.2
    - labring/calico:v3.24.1
  ssh:
    passwd: Admin@8080
    pk: /root/.ssh/id_rsa
    port: 22
    user: root
```
Run the Initialization
```shell
sealos apply -f Clusterfile
```
Subsequent Changes
After the first apply, the generated Clusterfile lives at .sealos/<cluster-name>/Clusterfile. For example, to add or remove nodes, edit .sealos/k8s-dev/Clusterfile and run sealos apply -f Clusterfile again.
```shell
# Remove metadata.creationTimestamp before re-applying
sed -i '/creationTimestamp/d' /root/.sealos/k8s-dev/Clusterfile
sealos apply -f /root/.sealos/k8s-dev/Clusterfile
```
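As a quick self-contained illustration of what the `sed` cleanup does (run on a throwaway file, not the real Clusterfile):

```shell
# Show the creationTimestamp cleanup on a throwaway snippet.
tmp_cf=$(mktemp)
cat > "$tmp_cf" <<'EOF'
apiVersion: apps.sealos.io/v1beta1
kind: Cluster
metadata:
  name: k8s-dev
  creationTimestamp: "2023-03-18T00:00:00Z"
EOF

# Delete every line mentioning creationTimestamp, in place.
sed -i '/creationTimestamp/d' "$tmp_cf"
cat "$tmp_cf"   # the timestamp line is gone; everything else is intact
```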
Check the Output

```shell
watch -d -n1 kubectl get pod -A
```

Set Up Command Auto-Completion
```shell
apt install -y bash-completion   # the hosts run Ubuntu, so apt rather than yum
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
```
Management Cluster Installation
In production, a single-node cluster has limited resources and compute, so it cannot meet most needs and is not recommended for handling large-scale data; with only one node, it also offers no high availability. For application deployment and distribution, a multi-node architecture is the usual first choice. However, KubeSphere's multi-node installation requires x86_64 CPUs and does not yet support ARM, so here I use KubeKey to do an all-in-one installation.
Install KubeKey and Initialize
KubeKey is a new installer written in Go that replaces the previous Ansible-based installer. It offers flexible installation options: you can install KubeSphere and Kubernetes separately or together, which is both convenient and efficient.
```shell
# Install required dependencies
apt update -y
apt-get install socat conntrack ebtables ipset -y

# Mainland China mirror
export KKZONE=cn
curl -sfL https://get-kk.kubesphere.io | VERSION=v3.0.2 bash -

# Install
chmod +x kk
./kk create cluster --with-kubernetes v1.22.12 --with-kubesphere v3.3.1

# Watch the installer logs
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f
```
Verify
```shell
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f
```

Use `kubectl get pod -A` to check how each component is doing.


Troubleshooting: Missing Image
Here default-http-backend uses an amd64 image, but our machines are ARM, so a replacement image is needed.

```shell
$ kubectl describe pod default-http-backend-5f56fb595-nbqcm -n kubesphere-controls-system
---
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  11m                  default-scheduler  Successfully assigned kubesphere-controls-system/default-http-backend-5f56fb595-nbqcm to kubesphere
  Normal   Pulled     9m24s (x5 over 11m)  kubelet            Container image "registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4" already present on machine
  Normal   Created    9m24s (x5 over 11m)  kubelet            Created container default-http-backend
  Normal   Started    9m22s (x5 over 11m)  kubelet            Started container default-http-backend
  Warning  BackOff    62s (x49 over 11m)   kubelet            Back-off restarting failed container
```

I eventually found a replacement in the community: mirrorgooglecontainers/defaultbackend-arm64:1.4
```shell
$ kubectl get deployments.apps -n kubesphere-controls-system
NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
default-http-backend   0/1     1            0           57m
kubectl-admin          1/1     1            1           52m

$ kubectl edit deployments.apps default-http-backend -n kubesphere-controls-system
# Find registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4
# and replace it with mirrorgooglecontainers/defaultbackend-arm64:1.4
```

Check again; problem solved.
```shell
$ kubectl get pod -n kubesphere-controls-system
NAME                                   READY   STATUS    RESTARTS      AGE
default-http-backend-6f5479966-xnzzh   1/1     Running   0             41s
kubectl-admin-7685cdd85b-zglqw         1/1     Running   2 (23m ago)   54m
```
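The interactive `kubectl edit` step can also be scripted: dump the manifest, swap the image with `sed`, and re-apply. Below is a self-contained sketch of the substitution step on a sample snippet; the live-cluster command at the end is an assumption on my part (it presumes the container is named `default-http-backend`, as the events above suggest):

```shell
AMD64_IMG="registry.cn-beijing.aliyuncs.com/kubesphereio/defaultbackend-amd64:1.4"
ARM64_IMG="mirrorgooglecontainers/defaultbackend-arm64:1.4"

manifest=$(mktemp)
cat > "$manifest" <<EOF
spec:
  containers:
    - name: default-http-backend
      image: ${AMD64_IMG}
EOF

# '|' as the sed delimiter avoids escaping the slashes in the image refs.
sed -i "s|${AMD64_IMG}|${ARM64_IMG}|" "$manifest"
cat "$manifest"

# Against a live cluster the same change can be made with:
#   kubectl -n kubesphere-controls-system set image \
#     deployment/default-http-backend default-http-backend=${ARM64_IMG}
```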
Troubleshooting: Insufficient Resources
```shell
kubectl describe pod prometheus-k8s-0 -n kubesphere-monitoring-system
kubectl describe pod alertmanager-main-0 -n kubesphere-monitoring-system
```

Or check through the console.




Confirm the Status


Configure DingTalk Notifications
Reference: https://kubesphere.io/zh/docs/v3.3/cluster-administration/platform-settings/notification-management/configure-dingtalk/ Enter the app Key, Secret, and ChatID (obtained from the POST response as described in the DingTalk docs).
Test the Notification


Set the KubeSphere Host Cluster
We already have a standalone KubeSphere cluster installed; edit the cluster configuration and set clusterRole to host.
Open the web console, log in as admin, click CRDs on the cluster management page, and search for ClusterConfiguration.

Edit the YAML
Search for multicluster, change none to host, and add hostClusterName to set the host cluster's name.

Check the Workloads

Log In Again and Test
Visit http://10.211.55.50:30880/clusters: the add-cluster feature now appears, and the host cluster has been renamed to kubesphere-0.
Adding Member Clusters
Install the Minimal KubeSphere Package on the Member Cluster
Run on the master node of the member cluster you want to add:

```shell
kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.1/kubesphere-installer.yaml
kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.1/cluster-configuration.yaml
```
Troubleshooting: Missing StorageClass
```shell
$ kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f
---
TASK [preinstall : KubeSphere | Stopping if default StorageClass was not found] ***
fatal: [localhost]: FAILED! => {
    "assertion": "\"(default)\" in default_storage_class_check.stdout",
    "changed": false,
    "evaluated_to": false,
    "msg": "Default StorageClass was not found !"
}
```

Create a StorageClass:
```shell
vim default-storage-class.yaml
```
```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local
  annotations:
    cas.openebs.io/config: |
      - name: StorageType
        value: "hostpath"
      - name: BasePath
        value: "/var/openebs/local/"
    openebs.io/cas-type: local
    storageclass.beta.kubernetes.io/is-default-class: 'true'
    storageclass.kubesphere.io/supported-access-modes: '["ReadWriteOnce"]'
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```

Create the StorageClass:
```shell
kubectl apply -f default-storage-class.yaml
```

Reinstall KubeSphere:
```shell
kubectl delete -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.1/kubesphere-installer.yaml
kubectl delete -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.1/cluster-configuration.yaml
kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.1/kubesphere-installer.yaml
kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.1/cluster-configuration.yaml
```

Check the installation logs:
```shell
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f
```

Redeploy:
```shell
kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.1/kubesphere-installer.yaml
kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.1/cluster-configuration.yaml
```
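After the redeploy it is worth confirming the default StorageClass is actually marked as such: the installer's failed assertion earlier was just a check for the literal `(default)` tag in the StorageClass listing. A sketch of the same check over sample output (the sample listing is hypothetical; on a live cluster you would pipe `kubectl get sc` instead):

```shell
# kubectl marks the default StorageClass with "(default)" after its name.
sample_sc='NAME              PROVISIONER        RECLAIMPOLICY   VOLUMEBINDINGMODE      AGE
local (default)   openebs.io/local   Delete          WaitForFirstConsumer   5m'

if printf '%s\n' "$sample_sc" | grep -q '(default)'; then
  echo "default StorageClass present"
else
  echo "Default StorageClass was not found !"
fi
```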
Test via the Web UI


Set the Cluster Name and Environment
Since this is a self-hosted setup, there is no need to pick a provider.
Get the Host Cluster's jwtSecret
Run on the KubeSphere host cluster:

```shell
$ kubectl -n kubesphere-system get cm kubesphere-config -o yaml | grep -v "apiVersion" | grep jwtSecret
  jwtSecret: tNvhoEQ0PPhnHs1etlvGp5xUkV75te7g
```
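The pipeline above only isolates the `jwtSecret:` line; to capture the bare value in a shell variable, one extra `awk` field split is enough. A sketch over sample output (the surrounding keys are illustrative, not from a real dump):

```shell
# Simulated kubesphere-config dump; only the jwtSecret line matters here.
sample_cfg='authentication:
  jwtSecret: tNvhoEQ0PPhnHs1etlvGp5xUkV75te7g
  loginHistoryRetentionPeriod: 168h'

# grep picks the line, awk prints the second whitespace-separated field.
secret=$(printf '%s\n' "$sample_cfg" | grep jwtSecret | awk '{print $2}')
echo "$secret"
```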
Edit ks-installer in the Same Way
On the member cluster's web console or master node, run:

```shell
kubectl edit cc ks-installer -n kubesphere-system
```
```yaml
# Modify the following two parameters
authentication:
  jwtSecret: tNvhoEQ0PPhnHs1etlvGp5xUkV75te7g
---
multicluster:
  clusterRole: member
```
Add the Member Cluster
On the master node of the cluster to be managed, run:

```shell
kubectl config view --minify --raw
```

In the output, change the apiserver.cluster.local server address to the master's externally reachable IP, then paste the result in.
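The `apiserver.cluster.local` rewrite can also be done with `sed` instead of by hand. A self-contained sketch on a sample kubeconfig fragment (10.211.55.51 is this guide's master-1 address, and the 6443 port is an assumed default):

```shell
MASTER_IP="10.211.55.51"    # the master's externally reachable IP

kubeconfig=$(mktemp)
cat > "$kubeconfig" <<'EOF'
clusters:
  - cluster:
      server: https://apiserver.cluster.local:6443
    name: k8s-dev
EOF

# Point the server entry at the real IP instead of the internal name.
sed -i "s/apiserver.cluster.local/${MASTER_IP}/" "$kubeconfig"
cat "$kubeconfig"
```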

