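This article records the process of deploying KubeEdge on an armv7l platform.

Docker environment
The ARM system needs a Docker environment, which in turn requires adding driver support. The process is fairly involved and is not covered here.

Cross compilation
Only the edge side needs to be compiled. Install the cross compiler: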
    sudo apt-get install gcc-arm-linux-gnueabihf
Set the environment variables and build:
    export GOARCH=arm
    export GOOS="linux"
    export GOARM=7
    export CGO_ENABLED=1
    export CC=arm-linux-gnueabihf-gcc
    export GO111MODULE=off

    make all WHAT=edgecore
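Note: if the build machine has too little memory, the Go build fails with a Killed error. (Memory usage observed during the build peaked at about 2 GB, which is also why the project advises against building on the target board.)
The non-hf version of the cross compiler can be used as well: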
    sudo apt-get install gcc-arm-linux-gnueabi
    export CC=arm-linux-gnueabi-gcc
    # everything else is the same
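Note: the cross compiler should ideally be the same version as the one used to build the target board's system. If the versions differ, the result needs to be tested. The target board in this article was built with gcc 8, but binaries built with the compiler shipped with Ubuntu also run on it.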
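
Deployment
Deployment works the same way as on x86, with a few differences. The Docker setup needs changes, and the contents of edge.yaml need to be modified. The main point is the node name: wherever it appears (for example in url, node-id, and hostname-override) it must be identical, and it must differ from the names of the other nodes in the cluster. In addition, podsandbox-image must point to an ARM image.
Checking from the cloud side: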
    # kubectl get nodes
    NAME            STATUS   ROLES    AGE     VERSION
    edge-node-arm   Ready    edge     46h     v1.15.3-kubeedge-v1.1.0-beta.0.323+52dd841358b292
    ubuntu          Ready    master   7d23h   v1.17.0
On the edge side:
    # docker ps
    CONTAINER ID   IMAGE                    COMMAND                  CREATED          STATUS          PORTS                NAMES
    70fa72886761   nginx                    "nginx -g 'daemon ..."   23 seconds ago   Up 17 seconds                        k8s_nginx_nginx-deployment-77698bff7d-q5shs_default_236ddc3e-17de-4d61-92bc-4dd86d9dad92_0
    4a4013a1a488   kubeedge/pause-arm:3.1   "/pause"                 3 minutes ago    Up 3 minutes    0.0.0.0:80->80/tcp   k8s_POD_nginx-deployment-77698bff7d-q5shs_default_236ddc3e-17de-4d61-92bc-4dd86d9dad92_0
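
Problems encountered

Officially built ARM binary
The officially built ARM binary does not run on the board; it crashes with a core dump. This looks like a compatibility problem with the compiler or linked library versions, so edgecore has to be cross-compiled locally.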
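
Cross compilation
By default GO111MODULE=auto, so the build tries to download dependency packages. Some of them could not be downloaded and the build failed. With module mode turned off (GO111MODULE=off) the build succeeds. (Open question: no dependencies appear to be downloaded in this case, yet the build still passes. The reason has not been confirmed. A likely explanation is that the repository vendors its dependencies, as the vendor/ paths in the stack traces below suggest, and in GOPATH mode the Go toolchain resolves imports from the vendor directory without downloading anything.)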

Runtime problems
Build of December 31. Output on the edge side:
    RoundRobin.
    W1231 17:01:33.736283     625 proxy.go:78] [L4 Proxy] create Device is failed : operation not supported
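Cause and fix: judging from the log this looks like a permission problem, but edgecore was already running as root. The resolution is described further below.
Building from the master branch as of January 6 gives the following error: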
    [L4 Proxy] Add ip is failed : [L4 Proxy] Device edge0 is not exist!! please checkout the env
    panic: runtime error: index out of range

    goroutine 119 [running]:
    github.com/kubeedge/kubeedge/edgemesh/pkg/proxy.addServer(0x3f61120, 0x14, 0x427e300, 0x2, 0x2)
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edgemesh/pkg/proxy/proxy.go:388 +0x8f4
    github.com/kubeedge/kubeedge/edgemesh/pkg/proxy.updateServer(0x3f61120, 0x14, 0x427e300, 0x2, 0x2, 0x427e308, 0x0)
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edgemesh/pkg/proxy/proxy.go:443 +0x508
    github.com/kubeedge/kubeedge/edgemesh/pkg/proxy.MsgProcess(0x3e96ba0, 0x24, 0x0, 0x0, 0x7a370c50, 0x16f, 0x0, 0x3ef3b50, 0xe, 0x3ef3b60, ...)
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edgemesh/pkg/proxy/proxy.go:359 +0x520
    github.com/kubeedge/kubeedge/edgemesh/pkg.(*EdgeMesh).Start(0x3b0f934)
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edgemesh/pkg/module.go:51 +0x190
    created by github.com/kubeedge/kubeedge/vendor/github.com/kubeedge/beehive/pkg/core.StartModules
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/github.com/kubeedge/beehive/pkg/core/core.go:23 +0x11c
Cause and fix: on the Ubuntu edge node, searching for the edge0 device gives the following:
    # find / -name "edge0"
    /sys/devices/virtual/net/edge0
    /sys/class/net/edge0
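The guess is that the error comes from the edge0 device failing to be created on the ARM board.
As for the out-of-range panic, the relevant code is in addServer in proxy.go: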
    	} else {
    		if len(ports) == 0 {
    			return
    		}
    		if len(unused) == 0 {
    			expandIpPool()
    		}
    		ip = unused[0]
    		unused = unused[1:]
    	}
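This fails because the unused slice still has length 0 when it is indexed, so unused[0] is out of range. The fix is to read from it only when its length is non-zero.
After this change the out-of-range panic is gone, but an error remains: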
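The length check described above can be sketched as follows. This is a self-contained illustration rather than the actual KubeEdge patch: `takeIP`, `errPoolEmpty`, and the sample addresses are hypothetical, and in the real code the check would sit after the call to `expandIpPool()`.

```go
package main

import (
	"errors"
	"fmt"
)

// errPoolEmpty is a hypothetical error value used only in this sketch.
var errPoolEmpty = errors.New("ip pool is empty")

// takeIP removes and returns the first address of unused.
// It checks the length first, so an empty pool yields an error
// instead of an "index out of range" panic.
func takeIP(unused []string) (ip string, rest []string, err error) {
	if len(unused) == 0 {
		return "", unused, errPoolEmpty
	}
	return unused[0], unused[1:], nil
}

func main() {
	// Sample addresses, chosen arbitrarily for the example.
	pool := []string{"10.0.0.1", "10.0.0.2"}

	ip, pool, err := takeIP(pool)
	fmt.Println(ip, len(pool), err)

	// An empty pool is reported as an error, not a crash.
	_, _, err = takeIP(nil)
	fmt.Println(err)
}
```

Returning an error lets the caller decide whether to skip the service or retry later, which is friendlier than a panic inside a module goroutine that takes the whole edgecore process down.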
    I0106 22:44:28.682395    3562 generic.go:81] GenericLifecycle: Relisting
    W0106 22:44:29.026038    3562 proxy.go:76] [L4 Proxy] create Device is failed : operation not supported
    I0106 22:44:29.689864    3562 generic.go:81] GenericLifecycle: Relisting
    I0106 22:44:30.697860    3562 generic.go:81] GenericLifecycle: Relisting
    W0106 22:44:31.068954    3562 proxy.go:76] [L4 Proxy] create Device is failed : operation not supported
    I0106 22:44:31.705060    3562 generic.go:81] GenericLifecycle: Relisting
    I0106 22:44:32.724408    3562 generic.go:81] GenericLifecycle: Relisting
    W0106 22:44:33.117475    3562 proxy.go:76] [L4 Proxy] create Device is failed : operation not supported
    I0106 22:44:33.307057    3562 communicate.go:148] CheckConfirm
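The cause is the same as before: what looks like a permission problem prevents the virtual network device edge0 from being created.
The node is nevertheless reported as Ready, so something else must still be wrong: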
    # kubectl get nodes
    NAME            STATUS   ROLES    AGE    VERSION
    edge-node       Ready    edge     6d5h   v1.15.3-kubeedge-v1.1.0-beta.0.323+52dd841358b292
    edge-node-arm   Ready    edge     6h6m   v1.15.3-kubeedge-v1.1.0-beta.0.323+52dd841358b292-dirty
    ubuntu          Ready    master   6d7h   v1.17.0
Log analysis:
    cni.go:213] Unable to update cni config: No networks found in /etc/cni/net.d
    hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
    --> http://conntrack-tools.netfilter.org/downloads.html
    container_manager_linux.go:295] Creating device plugin manager: false
    cpu_manager.go:135] [cpumanager] Unknown policy "", falling back to default policy "none"
    csi_plugin.go:222] kubernetes.io/csi: kubeclient not set, assuming standalone kubelet
    plugin_watcher.go:81] failed to traverse deprecated plugin socket path "/var/lib/edged/plugins", err: error accessing path: /var/lib/edged/plugins error: lstat /var/lib/edged/plugins: no such file or directory
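Reading the vishvananda source code shows what is missing: /dev/net/tun does not exist on the board, whereas on Ubuntu it does:

    crw-rw-rw- 1 root root 10, 200 Jan  6 16:25 /dev/net/tun

Cause: CONFIG_TUN is not enabled in the board's kernel configuration.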
    Device Drivers  --->
      [*] Network device support  --->
        --- Network device support
        [*]   Network core driver support
        <M>     Bonding driver support
        <M>     Dummy net driver support
        <M>     EQL (serial line load balancing) support
        <M>     Ethernet team driver support  --->   // note: everything below it is set to M
        <M>     MAC-VLAN support
        <M>       MAC-VLAN based tap driver
        <M>     Virtual eXtensible Local Area Network (VXLAN)
        <M>     Generic Network Virtualization Encapsulation
        <M>     GPRS Tunneling Protocol datapath (GTP-U)
        <M>     IEEE 802.1AE MAC-level encryption (MACsec)
        <M>     Network console logging support
        [*]       Dynamic reconfiguration of logging targets
        <*>     Universal TUN/TAP device driver support   // !! this is CONFIG_TUN
        [ ]     Support for cross-endian vnet headers on little-endian kernels
        <*>     Virtual ethernet pair device
        <M>     Virtual netlink monitoring device
3.17 segmentation fault:
    I0130 05:44:31.614058    9380 client.go:143] finish hub-client pub
    I0130 05:44:31.659321    9380 eventbus.go:61] Init Sub And Pub Client for externel mqtt broker tcp://127.0.0.1:1883 successfully
    I0130 05:44:31.658978    9380 client.go:86] edge-hub-cli subscribe topic to $hw/events/device/+/twin/+
    I0130 05:44:31.693166    9380 client.go:86] edge-hub-cli subscribe topic to $hw/events/node/+/membership/get
    I0130 05:44:31.739451    9380 client.go:86] edge-hub-cli subscribe topic to SYS/dis/upload_records
    I0130 05:44:31.781856    9380 proxy.go:92] [L4 Proxy] proxy is running now
    I0130 05:44:31.868529    9380 cpu_manager.go:173] [cpumanager] starting with none policy
    I0130 05:44:31.868783    9380 cpu_manager.go:174] [cpumanager] reconciling every 0s
    I0130 05:44:31.868938    9380 policy_none.go:43] [cpumanager] none policy: Start
    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x14 pc=0x2050dd4]

    goroutine 93 [running]:
    github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm.(*containerManagerImpl).enforceNodeAllocatableCgroups(0x44c23c0, 0x6, 0x0)
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm/node_container_manager_linux.go:78 +0x144
    github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm.(*containerManagerImpl).setupNode(0x44c23c0, 0x47165d0, 0x0, 0x11)
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm/container_manager_linux.go:452 +0xa8
    github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm.(*containerManagerImpl).Start(0x44c23c0, 0x0, 0x47165d0, 0x2c64348, 0x4144500, 0xa5334f80, 0x4827c80, 0x2cb2be8, 0x4664720, 0x46cf960, ...)
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm/container_manager_linux.go:600 +0xd4
    github.com/kubeedge/kubeedge/edge/pkg/edged.(*edged).initializeModules(0x44dc600, 0x4827c80, 0x461c440)
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edge/pkg/edged/edged.go:564 +0xf4
    github.com/kubeedge/kubeedge/edge/pkg/edged.(*edged).Start(0x44dc600)
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edge/pkg/edged/edged.go:264 +0x1f8
    created by github.com/kubeedge/kubeedge/vendor/github.com/kubeedge/beehive/pkg/core.StartModules
        /home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/github.com/kubeedge/beehive/pkg/core/core.go:23 +0x11c
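Fix: add a default gateway.
Without a default gateway, the output of edgecore --minconfig contains no IP address, and the segmentation fault occurs. The program obtains its IP address through the default gateway. When there is none, no IP can be obtained and the node structure is left empty, but the code does not check for this, hence the crash.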
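Whether the board has a default route can be checked before starting edgecore. The sketch below reads the kernel routing table from /proc/net/route and looks for an entry whose destination and mask are both 0.0.0.0. The function name `hasDefaultRoute` is an assumption for this illustration, and this is not how edgecore itself looks up its address.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// hasDefaultRoute scans text in /proc/net/route format and reports
// whether any entry has destination 0.0.0.0 and mask 0.0.0.0.
// (Hypothetical helper for this sketch.)
func hasDefaultRoute(table string) bool {
	sc := bufio.NewScanner(strings.NewReader(table))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		// Columns: Iface Destination Gateway Flags RefCnt Use Metric Mask ...
		// Addresses are printed as 8 hexadecimal digits.
		if len(f) >= 8 && f[1] == "00000000" && f[7] == "00000000" {
			return true
		}
	}
	return false
}

func main() {
	data, err := os.ReadFile("/proc/net/route")
	if err != nil {
		fmt.Println("cannot read routing table:", err)
		return
	}
	if hasDefaultRoute(string(data)) {
		fmt.Println("default route found")
	} else {
		fmt.Println("no default route: add one before starting edgecore")
	}
}
```

The parsing is kept separate from the file read so that the check can be exercised with sample routing tables on a development machine.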
    E0129 00:02:12.582898     520 edged_status.go:371] register node failed, error: <nil>
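The host name is invalid. It may contain only lowercase letters and digits, and the only other characters allowed are - and . (an underscore is not allowed either). The name must also begin and end with a lowercase letter or a digit. (Note: this is the Kubernetes DNS naming rule.)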
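The naming rule can be checked up front with a regular expression. The sketch below follows the DNS subdomain rule as documented by Kubernetes (lowercase alphanumerics, - and ., alphanumeric at both ends, at most 253 characters). `isValidNodeName` is a hypothetical helper, not a KubeEdge function.

```go
package main

import (
	"fmt"
	"regexp"
)

// dns1123Subdomain matches one or more dot-separated labels, each made of
// lowercase letters, digits, and '-', starting and ending with a letter or digit.
var dns1123Subdomain = regexp.MustCompile(
	`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// isValidNodeName reports whether name satisfies the rule above.
// (Hypothetical helper for this sketch.)
func isValidNodeName(name string) bool {
	return len(name) <= 253 && dns1123Subdomain.MatchString(name)
}

func main() {
	// A mix of acceptable and unacceptable sample names.
	for _, name := range []string{"edge-node-arm", "edge_node", "Edge-Node", "-edge"} {
		fmt.Printf("%-15s valid=%v\n", name, isValidNodeName(name))
	}
}
```

Running such a check against the board's host name, or the value given to hostname-override, before starting edgecore avoids the rather unhelpful "register node failed, error: <nil>" message.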

References