openebs-zfs-plugin zfs-driver uses wrong lib versions

johnhainline · Cadet · Joined: Dec 31, 2020 · Messages: 2
Hi! I'm new here (and new to FreeNAS/TrueNAS), and I recently installed TrueNAS SCALE on a Dell PowerEdge R730. I've been experimenting with Helm/Kubernetes on this OS and ran into a problem I'm not sure how to resolve.

My ZFS setup is currently a single disk, with a zpool mounted at /mnt/main.

Upon startup I noticed the following:

$ kubectl get pods -A
NAMESPACE     NAME                       READY   STATUS              RESTARTS   AGE
kube-system   openebs-zfs-controller-0   5/5     Running             15         22h
kube-system   coredns-66c464876b-lh8pz   1/1     Running             7          3d3h
kube-system   openebs-zfs-node-xrks6     0/2     ContainerCreating   0          5m1s


As this pod was stuck in ContainerCreating, it was impossible to make a PersistentVolumeClaim and have a volume provisioned for it. I looked into it and got this:

$ kubectl describe pod openebs-zfs-node-xrks6 -n kube-system
Name: openebs-zfs-node-xrks6
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: ix-truenas/192.168.2.58
Start Time: Fri, 01 Jan 2021 14:08:19 -0800
Labels: app=openebs-zfs-node
controller-revision-hash=66fb8b8786
pod-template-generation=17
role=openebs-zfs
Annotations: kubectl.kubernetes.io/restartedAt: 2021-01-01T12:57:58-08:00
Status: Pending
IP: 192.168.2.58
IPs:
IP: 192.168.2.58
Controlled By: DaemonSet/openebs-zfs-node
Containers:
csi-node-driver-registrar:
Container ID:
Image: quay.io/k8scsi/csi-node-driver-registrar:v1.2.0
Image ID:
Port: <none>
Host Port: <none>
Args:
--v=5
--csi-address=$(ADDRESS)
--kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
ADDRESS: /plugin/csi.sock
DRIVER_REG_SOCK_PATH: /var/lib/kubelet/plugins/zfs-localpv/csi.sock
KUBE_NODE_NAME: (v1:spec.nodeName)
NODE_DRIVER: openebs-zfs
Mounts:
/plugin from plugin-dir (rw)
/registration from registration-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from openebs-zfs-node-sa-token-s2sgv (ro)
openebs-zfs-plugin:
Container ID:
Image: quay.io/openebs/zfs-driver:ci
Image ID:
Port: <none>
Host Port: <none>
Args:
--nodeid=$(OPENEBS_NODE_ID)
--endpoint=$(OPENEBS_CSI_ENDPOINT)
--plugin=$(OPENEBS_NODE_DRIVER)
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
OPENEBS_NODE_ID: (v1:spec.nodeName)
OPENEBS_CSI_ENDPOINT: unix:///plugin/csi.sock
OPENEBS_NODE_DRIVER: agent
OPENEBS_NAMESPACE: openebs
Mounts:
/dev from device-dir (rw)
/home/keys from encr-keys (rw)
/lib/libnvpair.so.1 from libnvpair (rw)
/lib/libuutil.so.1 from libuutil (rw)
/lib/libzfs.so.2 from libzfs (rw)
/lib/libzfs_core.so.1 from libzfscore (rw)
/lib/libzpool.so.2 from libzpool (rw)
/plugin from plugin-dir (rw)
/sbin/zfs from zfs-bin (rw)
/var/lib/kubelet/ from pods-mount-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from openebs-zfs-node-sa-token-s2sgv (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
device-dir:
Type: HostPath (bare host directory volume)
Path: /dev
HostPathType: Directory
encr-keys:
Type: HostPath (bare host directory volume)
Path: /home/keys
HostPathType: DirectoryOrCreate
zfs-bin:
Type: HostPath (bare host directory volume)
Path: /sbin/zfs
HostPathType: File
libzpool:
Type: HostPath (bare host directory volume)
Path: /usr/lib/x86_64-linux-gnu/libzpool.so.2.0.0
HostPathType: File
libzfscore:
Type: HostPath (bare host directory volume)
Path: /usr/lib/x86_64-linux-gnu/libzfs_core.so.1.0.0
HostPathType: File
libzfs:
Type: HostPath (bare host directory volume)
Path: /usr/lib/x86_64-linux-gnu/libzfs.so.2.0.0
HostPathType: File
libuutil:
Type: HostPath (bare host directory volume)
Path: /usr/lib/x86_64-linux-gnu/libuutil.so.1.0.1
HostPathType: File
libnvpair:
Type: HostPath (bare host directory volume)
Path: /usr/lib/x86_64-linux-gnu/libnvpair.so.1.0.1
HostPathType: File
registration-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/plugins_registry/
HostPathType: DirectoryOrCreate
plugin-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/plugins/zfs-localpv/
HostPathType: DirectoryOrCreate
pods-mount-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/
HostPathType: Directory
openebs-zfs-node-sa-token-s2sgv:
Type: Secret (a volume populated by a Secret)
SecretName: openebs-zfs-node-sa-token-s2sgv
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 69s Successfully assigned kube-system/openebs-zfs-node-xrks6 to ix-truenas
Warning FailedMount 61s (x5 over 69s) kubelet, ix-truenas MountVolume.SetUp failed for volume "libnvpair" : hostPath type check failed: /usr/lib/x86_64-linux-gnu/libnvpair.so.1.0.1 is not a file
Warning FailedMount 61s (x5 over 69s) kubelet, ix-truenas MountVolume.SetUp failed for volume "libzfscore" : hostPath type check failed: /usr/lib/x86_64-linux-gnu/libzfs_core.so.1.0.0 is not a file
Warning FailedMount 61s (x5 over 69s) kubelet, ix-truenas MountVolume.SetUp failed for volume "libzfs" : hostPath type check failed: /usr/lib/x86_64-linux-gnu/libzfs.so.2.0.0 is not a file
Warning FailedMount 61s (x5 over 69s) kubelet, ix-truenas MountVolume.SetUp failed for volume "libuutil" : hostPath type check failed: /usr/lib/x86_64-linux-gnu/libuutil.so.1.0.1 is not a file
Warning FailedMount 61s (x5 over 69s) kubelet, ix-truenas MountVolume.SetUp failed for volume "libzpool" : hostPath type check failed: /usr/lib/x86_64-linux-gnu/libzpool.so.2.0.0 is not a file

Looking at which lib versions the system actually has, I got:
$ ls -lh /usr/lib/x86_64-linux-gnu | grep 'libzpool\|libzfs\|libuutil\|libnvpair'
lrwxrwxrwx 1 root root 18 Sep 14 19:01 libnvpair.so.3 -> libnvpair.so.3.0.0
-rwxr-xr-x 1 root root 408K Sep 14 19:01 libnvpair.so.3.0.0
lrwxrwxrwx 1 root root 17 Sep 14 19:01 libuutil.so.3 -> libuutil.so.3.0.0
-rwxr-xr-x 1 root root 242K Sep 14 19:01 libuutil.so.3.0.0
lrwxrwxrwx 1 root root 15 Sep 14 19:01 libzfs.so.4 -> libzfs.so.4.0.0
-rwxr-xr-x 1 root root 1.7M Sep 14 19:01 libzfs.so.4.0.0
lrwxrwxrwx 1 root root 20 Sep 14 19:01 libzfs_core.so.3 -> libzfs_core.so.3.0.0
-rwxr-xr-x 1 root root 493K Sep 14 19:01 libzfs_core.so.3.0.0
lrwxrwxrwx 1 root root 22 Sep 14 19:01 libzfsbootenv.so.1 -> libzfsbootenv.so.1.0.0
-rwxr-xr-x 1 root root 56K Sep 14 19:01 libzfsbootenv.so.1.0.0
lrwxrwxrwx 1 root root 17 Sep 14 19:01 libzpool.so.4 -> libzpool.so.4.0.0
-rwxr-xr-x 1 root root 17M Sep 14 19:01 libzpool.so.4.0.0

Clearly the driver expects older library versions than the host provides! My first attempt was to install the older libraries and restart the DaemonSet:

sudo apt install libzfs2linux libzpool2linux
kubectl rollout restart daemonset/openebs-zfs-node -n kube-system

However, with that in place I get an infinite Pending status when attempting to create a PV! (I waited a very long time and restarted various Kubernetes pods numerous times, etc.)

# kubectl describe pod blah-6c8cd954c8-cttbt
Name: blah-6c8cd954c8-cttbt
Namespace: default
Priority: 0
Node: ix-truenas/192.168.2.58
Start Time: Fri, 01 Jan 2021 14:23:18 -0800
Labels: app=blah
pod-template-hash=6c8cd954c8
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/blah-6c8cd954c8
Containers:
... unimportant stuff here ...
Mounts:
/data from datadir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-rgp55 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
datadir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: blah-datadir
ReadOnly: false
default-token-rgp55:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-rgp55
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 4m22s 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Warning FailedScheduling 4m22s 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Normal Scheduled 4m19s Successfully assigned default/blah-6c8cd954c8-cttbt to ix-truenas
Warning FailedMount 2m17s kubelet, ix-truenas Unable to attach or mount volumes: unmounted volumes=[datadir], unattached volumes=[datadir default-token-rgp55]: timed out waiting for the condition
Warning FailedMount 10s (x10 over 4m20s) kubelet, ix-truenas MountVolume.SetUp failed for volume "pvc-b5e21c97-4ca9-4b04-a0c4-d2e8f0bedc19" : rpc error: code = Internal desc = rpc error: code = Internal desc = verifyMount: volume is not ready to be mounted
Warning FailedMount 1s kubelet, ix-truenas Unable to attach or mount volumes: unmounted volumes=[datadir], unattached volumes=[default-token-rgp55 datadir]: timed out waiting for the condition

# kubectl get zv -n openebs
NAME                                       ZPOOL                                  NODE         SIZE          STATUS    FILESYSTEM   AGE
pvc-b5e21c97-4ca9-4b04-a0c4-d2e8f0bedc19   main/ix-applications/default_volumes   ix-truenas   21474836480   Pending   zfs          2m19s

So instead, I took the opposite approach and told the operator to use the newer library versions that ship with the system. I edited
/mnt/main/ix-applications/k3s/server/manifests/zfs-operator.yaml
and (around line 1600) replaced every reference to the older lib versions with the newer ones that actually exist on the host. After restarting the service, I can successfully create PVCs and get PersistentVolumes assigned to my deployments.
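For reference, the substitutions I made look roughly like this sketch (the old and new version strings are the ones from the library listing above; the full filenames are replaced before the bare SONAMEs so a name like libnvpair.so.1.0.1 doesn't get half-rewritten; the demo runs on a sample line rather than the real manifest):

```shell
# Sketch of the version bumps applied to zfs-operator.yaml (old -> new
# versions taken from the host library listing above). Longest patterns
# are replaced first so e.g. libnvpair.so.1.0.1 isn't half-rewritten.
fixlibs() {
  sed -e 's|libnvpair\.so\.1\.0\.1|libnvpair.so.3.0.0|g'     -e 's|libnvpair\.so\.1|libnvpair.so.3|g' \
      -e 's|libuutil\.so\.1\.0\.1|libuutil.so.3.0.0|g'       -e 's|libuutil\.so\.1|libuutil.so.3|g' \
      -e 's|libzfs_core\.so\.1\.0\.0|libzfs_core.so.3.0.0|g' -e 's|libzfs_core\.so\.1|libzfs_core.so.3|g' \
      -e 's|libzfs\.so\.2\.0\.0|libzfs.so.4.0.0|g'           -e 's|libzfs\.so\.2|libzfs.so.4|g' \
      -e 's|libzpool\.so\.2\.0\.0|libzpool.so.4.0.0|g'       -e 's|libzpool\.so\.2|libzpool.so.4|g'
}

# Demo on one hostPath line from the manifest:
echo 'path: /usr/lib/x86_64-linux-gnu/libzfs.so.2.0.0' | fixlibs
# prints: path: /usr/lib/x86_64-linux-gnu/libzfs.so.4.0.0
```

On the real system I did the equivalent in-place (sed -i against /mnt/main/ix-applications/k3s/server/manifests/zfs-operator.yaml), updating both the hostPath `path:` entries and the container `mountPath:` names, then restarted the DaemonSet.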

Unfortunately, the
/mnt/main/ix-applications/k3s/server/manifests/zfs-operator.yaml
file is regenerated at every boot, so my edit doesn't survive a reboot!

Can anyone help me with a better fix for this? Is there something else I should be upgrading or doing to make this work?
Also, which part of the system is responsible for re-creating the zfs-operator.yaml file, and why does that even happen?

Thanks for your time and help!