SOLVED openebs/zfs-driver removal in progress for 4 weeks

impovich

Explorer
Joined
May 12, 2021
Messages
72
Hi all, a few weeks ago I noticed my logs being spammed because a container removal keeps failing due to a missing dataset. Could you suggest what I can do about it?

Code:
truenas# docker ps -a
CONTAINER ID   IMAGE                         COMMAND                  CREATED       STATUS                     PORTS     NAMES
d39f9d831680   openebs/zfs-driver            "/usr/local/bin/zfs-…"   4 weeks ago   Removal In Progress                  k8s_openebs-zfs-plugin_openebs-zfs-node-ds5sd_kube-system_d692ac39-ac25-4da6-adbb-93615d9cef3b_203
4f7053116229   rancher/pause:3.1             "/pause"                 4 weeks ago   Exited (255) 4 weeks ago             k8s_POD_openebs-zfs-node-ds5sd_kube-system_d692ac39-ac25-4da6-adbb-93615d9cef3b_27


Code:
truenas# k get pods -A
NAMESPACE          NAME                                    READY   STATUS        RESTARTS   AGE
kube-system        openebs-zfs-node-ds5sd                  0/2     Terminating   315        106d
kube-system        coredns-7448499f4d-hmsv9                1/1     Running       0          8h
kube-system        openebs-zfs-controller-0                5/5     Running       0          8h


logs:

Code:
Oct 27 01:36:10 truenas k3s[8629]: I1027 01:36:10.703743    8629 scope.go:111] "RemoveContainer" containerID="d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b"
Oct 27 01:36:10 truenas dockerd[8071]: time="2021-10-27T01:36:10.739172443+02:00" level=error msg="Error removing mounted layer d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b: exit status 1: \"/usr/sbin/zfs fs destroy -r storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24\" => cannot open 'storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24': dataset does not exist\n"
Oct 27 01:36:10 truenas dockerd[8071]: time="2021-10-27T01:36:10.739870694+02:00" level=error msg="Handler for DELETE /v1.41/containers/d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b returned error: container d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b: driver \"zfs\" failed to remove root filesystem: exit status 1: \"/usr/sbin/zfs fs destroy -r storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24\" => cannot open 'storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24': dataset does not exist\n"
Oct 27 01:36:10 truenas k3s[8629]: E1027 01:36:10.740889    8629 remote_runtime.go:296] "RemoveContainer from runtime service failed" err="rpc error: code = Unknown desc = failed to remove container \"d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b\": Error response from daemon: container d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b: driver \"zfs\" failed to remove root filesystem: exit status 1: \"/usr/sbin/zfs fs destroy -r storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24\" => cannot open 'storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24': dataset does not exist" containerID="d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b"
Oct 27 01:36:10 truenas k3s[8629]: E1027 01:36:10.741271    8629 kuberuntime_gc.go:146] "Failed to remove container" err="rpc error: code = Unknown desc = failed to remove container \"d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b\": Error response from daemon: container d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b: driver \"zfs\" failed to remove root filesystem: exit status 1: \"/usr/sbin/zfs fs destroy -r storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24\" => cannot open 'storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24': dataset does not exist" containerID="d39f9d83168090d640c8473d2311ec52be722e58951549aef32f187e59f90f8b"
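
To confirm the driver's complaint, the dataset from the error can be checked directly with standard ZFS commands (the dataset name below is taken from the log above; substitute whatever your own log reports):

Code:
# this should fail with "dataset does not exist" if the driver's complaint is accurate
zfs list storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24

# see which docker layer datasets are still present under the same parent
zfs list -r -d 1 storage_404/ix-applications/docker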
 

impovich

Explorer
Joined
May 12, 2021
Messages
72
Found a workaround:

1) remove stuck pod
Code:
k3s kubectl delete -n kube-system pod/openebs-zfs-node-ds5sd --grace-period=0 --force 

2) create a dummy dataset
Code:
zfs create storage_404/ix-applications/docker/3114cdddca0eb01f1de848925eba4d027796b424b91f12d843673e174667bd24

3) remove container
Code:
docker rm -f d39f9d831680
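
A possible refinement on step 2, in case the dataset name in the logs is hard to match up: docker itself may still report the dataset it expects for the stuck container. This is an untested sketch, assuming the zfs storage driver exposes the name under GraphDriver.Data and that inspect still works on a container stuck in removal:

Code:
# print the dataset docker expects for the stuck container's root filesystem
docker inspect -f '{{ .GraphDriver.Data.Dataset }}' d39f9d831680

# create exactly that dataset, then retry the removal
zfs create <dataset-printed-above>
docker rm -f d39f9d831680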
 

impovich

Explorer
Joined
May 12, 2021
Messages
72
In hindsight, the best workaround would probably have been to create the dummy dataset in the first place, so the issue could resolve itself, but unfortunately I removed openebs-zfs-node before trying that.
 

truecharts

Guru
Joined
Aug 19, 2021
Messages
788
We're deliberately necro'ing this thread, as we've had users referencing this old workaround.

DO NOT DO THIS.
PVCs are designed to be managed by the manager. When you insert yourself "in between" the manager and ZFS, you're only going to make matters worse in the long run.

The correct workflow to fix this:
  1. Please DO NOT MANUALLY CREATE datasets for apps

  2. After doing your due diligence by creating a bug report and attaching a debug (Settings > Advanced > Save Debug)
    NOTE: This is out of our support scope and should be used at your own risk
  3. Try running these commands:
    k3s kubectl get pods -n kube-system (and note the name of the zfs-node pod)
    k3s kubectl delete -n kube-system pod/openebs-zfs-controller-0 --force
    k3s kubectl delete -n kube-system pod/openebs-zfs-node-6pf9z --force (use the name of the zfs-node pod noted above)
  4. Then reboot. This should bring openebs back up after about 10-30 minutes (see the quick check below).
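
For anyone who wants to confirm the recovery after the reboot, a quick check (pod names will differ on your system; watch is assumed to be available on SCALE):

Code:
# after rebooting, check that the openebs pods come back to Running/READY
k3s kubectl get pods -n kube-system | grep openebs-zfs

# or keep watching until they settle (Ctrl+C to stop)
watch -n 10 'k3s kubectl get pods -n kube-system | grep openebs-zfs'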
 