Unable to import zpool, data is still intact when checked with readonly

pk7909

Cadet
Joined
Jul 11, 2022
Messages
5
I badly need help. I've been struggling for a few weeks trying to figure out what went wrong.

When I try to import the pool I get the following message and the system just hangs.

WARNING: Pool 'main-pool' has encountered an uncorrectable I/O failure and has been suspended.

After researching, I found I can use 'zpool import -f main-pool -o readonly=on' to import the pool read-only. In the GUI I can see the snapshots are still intact, but I get the following results when I try 'zpool status -v':


[Attached screenshot: zpool status -v output]


/proc/spl/kstat/zfs/dbgmsg:
Code:
timestamp    message
1657532534   spa.c:8356:spa_async_request(): spa=$import async request task=2048
1657532534   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADED
1657532534   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING
1657532534   spa.c:6096:spa_import(): spa_import: importing main-pool
1657532534   spa_misc.c:418:spa_load_note(): spa_load(main-pool, config trusted): LOADING
1657532534   vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/sde2': best uberblock found for spa main-pool. txg 2797077
1657532534   spa_misc.c:418:spa_load_note(): spa_load(main-pool, config untrusted): using uberblock with txg=2797077
1657532534   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 812649440180795162: path changed from '/dev/gptid/b710b131-b6c6-4d51-8396-033aceac77e1' to '/dev/sde2'
1657532534   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 15431760524353817270: path changed from '/dev/gptid/ad246f60-dfa8-47e2-a8dd-499fa7e26fbe' to '/dev/sda2'
1657532534   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 408973825519044983: path changed from '/dev/ada5p2' to '/dev/sdb2'
1657532534   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 5947065650119764947: path changed from '/dev/gptid/96e51a41-d652-4fa3-8875-b610340cd31c' to '/dev/sdf2'
1657532534   spa_misc.c:418:spa_load_note(): spa_load(main-pool, config trusted): spa_load_verify found 0 metadata errors and 6 data errors
1657532534   spa.c:8356:spa_async_request(): spa=main-pool async request task=2048
1657532534   spa_misc.c:418:spa_load_note(): spa_load(main-pool, config trusted): LOADED
1657532534   spa.c:8356:spa_async_request(): spa=main-pool async request task=32
1657532951   spa_misc.c:418:spa_load_note(): spa_load(main-pool, config trusted): UNLOADING
1657533074   spa.c:6240:spa_tryimport(): spa_tryimport: importing tank
1657533074   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADING
1657533074   vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/168e1285-00e1-11ed-8158-3cfdfe6bc5f0': best uberblock found for spa $import. txg 918
1657533074   spa_misc.c:418:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=918
1657533074   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 11779842602659169114: path changed from '/dev/gptid/169657af-00e1-11ed-8158-3cfdfe6bc5f0' to '/dev/disk/by-partuuid/169657af-0
1657533074   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 5281776993533731353: path changed from '/dev/gptid/168e1285-00e1-11ed-8158-3cfdfe6bc5f0' to '/dev/disk/by-partuuid/168e1285-00
1657533074   spa.c:8356:spa_async_request(): spa=$import async request task=2048
1657533074   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADED
1657533074   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING
1657533074   spa.c:6240:spa_tryimport(): spa_tryimport: importing main-pool
1657533074   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADING
1657533074   vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/b710b131-b6c6-4d51-8396-033aceac77e1': best uberblock found for spa $import. txg 2797077
1657533074   spa_misc.c:418:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=2797077
1657533074   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 812649440180795162: path changed from '/dev/gptid/b710b131-b6c6-4d51-8396-033aceac77e1' to '/dev/disk/by-partuuid/b710b131-b6c
1657533074   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 15431760524353817270: path changed from '/dev/gptid/ad246f60-dfa8-47e2-a8dd-499fa7e26fbe' to '/dev/disk/by-partuuid/ad246f60-d
1657533074   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 408973825519044983: path changed from '/dev/ada5p2' to '/dev/disk/by-partuuid/0fe8ae3b-ed23-4fbc-a98d-24f5a6a389ce'
1657533074   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 5947065650119764947: path changed from '/dev/gptid/96e51a41-d652-4fa3-8875-b610340cd31c' to '/dev/disk/by-partuuid/96e51a41-d6
1657533074   spa.c:8356:spa_async_request(): spa=$import async request task=2048
1657533074   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADED
1657533074   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING
1657533081   spa.c:6240:spa_tryimport(): spa_tryimport: importing tank
1657533081   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADING
1657533081   vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/169657af-00e1-11ed-8158-3cfdfe6bc5f0': best uberblock found for spa $import. txg 918
1657533081   spa_misc.c:418:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=918
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 11779842602659169114: path changed from '/dev/gptid/169657af-00e1-11ed-8158-3cfdfe6bc5f0' to '/dev/disk/by-partuuid/169657af-0
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 5281776993533731353: path changed from '/dev/gptid/168e1285-00e1-11ed-8158-3cfdfe6bc5f0' to '/dev/disk/by-partuuid/168e1285-00
1657533081   spa.c:8356:spa_async_request(): spa=$import async request task=2048
1657533081   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADED
1657533081   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING
1657533081   spa.c:6240:spa_tryimport(): spa_tryimport: importing main-pool
1657533081   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADING
1657533081   vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/b710b131-b6c6-4d51-8396-033aceac77e1': best uberblock found for spa $import. txg 2797077
1657533081   spa_misc.c:418:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=2797077
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 812649440180795162: path changed from '/dev/gptid/b710b131-b6c6-4d51-8396-033aceac77e1' to '/dev/disk/by-partuuid/b710b131-b6c
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 15431760524353817270: path changed from '/dev/gptid/ad246f60-dfa8-47e2-a8dd-499fa7e26fbe' to '/dev/disk/by-partuuid/ad246f60-d
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 408973825519044983: path changed from '/dev/ada5p2' to '/dev/disk/by-partuuid/0fe8ae3b-ed23-4fbc-a98d-24f5a6a389ce'
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 5947065650119764947: path changed from '/dev/gptid/96e51a41-d652-4fa3-8875-b610340cd31c' to '/dev/disk/by-partuuid/96e51a41-d6
1657533081   spa.c:8356:spa_async_request(): spa=$import async request task=2048
1657533081   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADED
1657533081   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING
1657533081   spa.c:6096:spa_import(): spa_import: importing main-pool
1657533081   spa_misc.c:418:spa_load_note(): spa_load(main-pool, config trusted): LOADING
1657533081   vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/b710b131-b6c6-4d51-8396-033aceac77e1': best uberblock found for spa main-pool. txg 2797077
1657533081   spa_misc.c:418:spa_load_note(): spa_load(main-pool, config untrusted): using uberblock with txg=2797077
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 812649440180795162: path changed from '/dev/gptid/b710b131-b6c6-4d51-8396-033aceac77e1' to '/dev/disk/by-partuuid/b710b131-b6c
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 15431760524353817270: path changed from '/dev/gptid/ad246f60-dfa8-47e2-a8dd-499fa7e26fbe' to '/dev/disk/by-partuuid/ad246f60-d
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 408973825519044983: path changed from '/dev/ada5p2' to '/dev/disk/by-partuuid/0fe8ae3b-ed23-4fbc-a98d-24f5a6a389ce'
1657533081   vdev.c:2375:vdev_copy_path_impl(): vdev_copy_path: vdev 5947065650119764947: path changed from '/dev/gptid/96e51a41-d652-4fa3-8875-b610340cd31c' to '/dev/disk/by-partuuid/96e51a41-d6
1657533081   spa_misc.c:418:spa_load_note(): spa_load(main-pool, config trusted): read 20 log space maps (20 total blocks - blksz = 131072 bytes) in 1 ms
1657533081   mmp.c:240:mmp_thread_start(): MMP thread started pool 'main-pool' gethrtime 970889996610
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 0, ms_id 13, smp_length 72968, unflushed_allocs 1564672, unflushed_frees 421888, freed 0, def099042816, max size error 17098960896, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 1, ms_id 5, smp_length 171008, unflushed_allocs 0, unflushed_frees 86016, freed 0, defer 0 + 84, max size error 13497503744, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 0, ms_id 14, smp_length 172560, unflushed_allocs 1388544, unflushed_frees 405504, freed 0, de6831516672, max size error 16831438848, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 1, ms_id 13, smp_length 257008, unflushed_allocs 1323008, unflushed_frees 53248, freed 0, def281754624, max size error 10281738240, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 0, ms_id 15, smp_length 192496, unflushed_allocs 36864, unflushed_frees 118784, freed 0, defe70699264, max size error 16770674688, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 0, ms_id 108, smp_length 379680, unflushed_allocs 90112, unflushed_frees 122880, freed 0, def3560823808, max size error 13560766464, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 1, ms_id 16, smp_length 105552, unflushed_allocs 24576, unflushed_frees 135168, freed 0, defe14406912, max size error 17114345472, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 0, ms_id 109, smp_length 127152, unflushed_allocs 159744, unflushed_frees 36864, freed 0, def939684352, max size error 11939667968, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 1, ms_id 67, smp_length 89648, unflushed_allocs 1208320, unflushed_frees 98304, freed 0, defe03343616, max size error 17103282176, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 0, ms_id 113, smp_length 385056, unflushed_allocs 413696, unflushed_frees 106496, freed 0, de3756493824, max size error 13756428288, old_weight 840000000000001, new_weight 840000000000001
1657533081   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 2797078, spa main-pool, vdev_id 1, ms_id 68, smp_length 105672, unflushed_allocs 1269760, unflushed_frees 143360, freed 0, de6615645184, max size error 16615571456, old_weight 840000000000001, new_weight 840000000000001


iostat -h 3 while 'zpool import -f main-pool' hangs:
Code:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.0%    0.0%    0.0%    6.2%    0.0%   93.7%

      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k loop0
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k nvme0n1
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k nvme1n1
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k sda
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k sdb
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k sdc
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k sdd
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k sde
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k sdf
     0.00         0.0k         0.0k         0.0k       0.0k       0.0k       0.0k sdg


I tried various methods after reading the forums, but nothing is working.

Should I just accept that there is no way of recovering the pool, or can it still be salvaged?
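For reference, this is the read-only rescue sequence I've been using to at least reach the data. It's a sketch: '/mnt/rescue' and the 'important' dataset path are placeholders for whatever actually needs copying off.

```shell
# Read-only import avoids the write that triggers the suspend
# ('/mnt/rescue' and the source path below are illustrative placeholders):
zpool import -f -o readonly=on main-pool

# Copy the irreplaceable data off the pool before attempting any repair:
rsync -a /mnt/main-pool/important/ /mnt/rescue/important/

# Detach cleanly afterwards:
zpool export main-pool
```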
 

pk7909
dmesg output when the import hangs:
Code:
[  972.792483] WARNING: Pool 'main-pool' has encountered an uncorrectable I/O failure and has been suspended.

[ 1211.168675] INFO: task middlewared (wo:2755 blocked for more than 120 seconds.
[ 1211.176212]       Tainted: P           OE     5.10.120+truenas #1
[ 1211.182560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1211.190749] task:middlewared (wo state:D stack:    0 pid: 2755 ppid:  2660 flags:0x00004002
[ 1211.199450] Call Trace:
[ 1211.201973]  __schedule+0x282/0x870
[ 1211.205651]  schedule+0x46/0xb0
[ 1211.208922]  io_schedule+0x42/0x70
[ 1211.212486]  cv_wait_common+0xac/0x130 [spl]
[ 1211.216947]  ? add_wait_queue_exclusive+0x70/0x70
[ 1211.221897]  txg_wait_synced_impl+0xc9/0x110 [zfs]
[ 1211.226924]  txg_wait_synced+0xc/0x40 [zfs]
[ 1211.231338]  spa_load+0x14dc/0x1760 [zfs]
[ 1211.235569]  spa_load_best+0x54/0x2d0 [zfs]
[ 1211.239961]  spa_import+0x1e9/0x690 [zfs]
[ 1211.244211]  zfs_ioc_pool_import+0x12f/0x150 [zfs]
[ 1211.249192]  zfsdev_ioctl_common+0x6bc/0x8e0 [zfs]
[ 1211.254113]  ? __kmalloc_node+0x22d/0x2b0
[ 1211.258225]  zfsdev_ioctl+0x53/0xe0 [zfs]
[ 1211.262259]  __x64_sys_ioctl+0x83/0xb0
[ 1211.266016]  do_syscall_64+0x33/0x80
[ 1211.269620]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1211.274762] RIP: 0033:0x7fea68cbdcc7
[ 1211.278391] RSP: 002b:00007ffd9dd41c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1211.286059] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fea68cbdcc7
[ 1211.293265] RDX: 00007ffd9dd41d10 RSI: 0000000000005a02 RDI: 0000000000000018
[ 1211.300515] RBP: 00007ffd9dd45c00 R08: 0000000000000002 R09: 00007fea68d87be0
[ 1211.307741] R10: 0000000020000000 R11: 0000000000000246 R12: 0000000004941d30
[ 1211.314920] R13: 00007ffd9dd41d10 R14: 0000000002d98d18 R15: 00007fea5a0e7ca0
[ 1211.322187] INFO: task middlewared (wo:5331 blocked for more than 120 seconds.
[ 1211.329494]       Tainted: P           OE     5.10.120+truenas #1
[ 1211.335639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1211.343620] task:middlewared (wo state:D stack:    0 pid: 5331 ppid:  2660 flags:0x00000000
[ 1211.352077] Call Trace:
[ 1211.354624]  __schedule+0x282/0x870
[ 1211.358132]  ? __kmalloc_node+0x141/0x2b0
[ 1211.362160]  schedule+0x46/0xb0
[ 1211.365323]  schedule_preempt_disabled+0xa/0x10
[ 1211.369899]  __mutex_lock.constprop.0+0x133/0x460
[ 1211.374660]  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
[ 1211.379951]  spa_all_configs+0x41/0x120 [zfs]
[ 1211.384366]  zfs_ioc_pool_configs+0x17/0x70 [zfs]
[ 1211.389147]  zfsdev_ioctl_common+0x6bc/0x8e0 [zfs]
[ 1211.393959]  ? __kmalloc_node+0x22d/0x2b0
[ 1211.398027]  ? recalibrate_cpu_khz+0x10/0x10
[ 1211.402357]  zfsdev_ioctl+0x53/0xe0 [zfs]
[ 1211.406407]  __x64_sys_ioctl+0x83/0xb0
[ 1211.410210]  do_syscall_64+0x33/0x80
[ 1211.413830]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1211.418934] RIP: 0033:0x7fc7b69e0cc7
[ 1211.422601] RSP: 002b:00007ffc531e4948 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1211.430251] RAX: ffffffffffffffda RBX: 0000000004ea1a20 RCX: 00007fc7b69e0cc7
[ 1211.437432] RDX: 00007ffc531e4970 RSI: 0000000000005a04 RDI: 0000000000000018
[ 1211.444632] RBP: 00007ffc531e7f60 R08: 0000000004eca340 R09: 00007fc7b6aaabe0
[ 1211.444633] R10: 0000000000010000 R11: 0000000000000246 R12: 0000000004ea1a20
[ 1211.444633] R13: 0000000000000000 R14: 00007ffc531e4970 R15: 00007fc7b59bcc30
[ 1211.444649] INFO: task txg_sync:16865 blocked for more than 121 seconds.
[ 1211.444649]       Tainted: P           OE     5.10.120+truenas #1
[ 1211.444649] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1211.444650] task:txg_sync        state:D stack:    0 pid:16865 ppid:     2 flags:0x00004000
[ 1211.444650] Call Trace:
[ 1211.444651]  __schedule+0x282/0x870
[ 1211.444652]  schedule+0x46/0xb0
[ 1211.444653]  schedule_timeout+0x8b/0x140
[ 1211.444655]  ? __next_timer_interrupt+0x110/0x110
[ 1211.444657]  io_schedule_timeout+0x4c/0x80
[ 1211.444659]  __cv_timedwait_common+0x12b/0x160 [spl]
[ 1211.444660]  ? add_wait_queue_exclusive+0x70/0x70
[ 1211.444661]  __cv_timedwait_io+0x15/0x20 [spl]
[ 1211.444679]  zio_wait+0x129/0x2b0 [zfs]
[ 1211.444693]  dmu_buf_will_dirty_impl+0xab/0x160 [zfs]
[ 1211.444707]  dmu_write_impl+0x3f/0xd0 [zfs]
[ 1211.444723]  dmu_write.part.0+0xa5/0x140 [zfs]
[ 1211.444741]  space_map_write+0x14f/0x8b0 [zfs]
[ 1211.444757]  ? dnode_rele_and_unlock+0x5c/0xc0 [zfs]
[ 1211.444772]  ? dmu_object_size_from_db+0x22/0x80 [zfs]
[ 1211.444788]  ? space_map_open+0xfb/0x120 [zfs]
[ 1211.444807]  metaslab_flush+0xf4/0x360 [zfs]
[ 1211.444824]  ? spa_generate_syncing_log_sm+0x147/0x240 [zfs]
[ 1211.444840]  spa_flush_metaslabs+0x19c/0x3f0 [zfs]
[ 1211.444856]  spa_sync+0x5da/0xfa0 [zfs]
[ 1211.444873]  ? spa_txg_history_init_io+0x101/0x110 [zfs]
[ 1211.444888]  txg_sync_thread+0x2e0/0x4a0 [zfs]
[ 1211.444903]  ? txg_fini+0x250/0x250 [zfs]
[ 1211.444907]  thread_generic_wrapper+0x6f/0x80 [spl]
[ 1211.444910]  ? __thread_exit+0x20/0x20 [spl]
[ 1211.444913]  kthread+0x11b/0x140
[ 1211.444914]  ? __kthread_bind_mask+0x60/0x60
[ 1211.444916]  ret_from_fork+0x22/0x30
[ 1211.444919] INFO: task middlewared (wo:17500 blocked for more than 121 seconds.
[ 1211.444921]       Tainted: P           OE     5.10.120+truenas #1
[ 1211.444921] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1211.444921] task:middlewared (wo state:D stack:    0 pid:17500 ppid:  2660 flags:0x00000000
[ 1211.444922] Call Trace:
[ 1211.444924]  __schedule+0x282/0x870
[ 1211.444925]  ? __kmalloc_node+0x141/0x2b0
[ 1211.444927]  schedule+0x46/0xb0
[ 1211.444929]  schedule_preempt_disabled+0xa/0x10
[ 1211.444930]  __mutex_lock.constprop.0+0x133/0x460
[ 1211.444932]  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
[ 1211.444948]  spa_all_configs+0x41/0x120 [zfs]
[ 1211.444969]  zfs_ioc_pool_configs+0x17/0x70 [zfs]
[ 1211.444988]  zfsdev_ioctl_common+0x6bc/0x8e0 [zfs]
[ 1211.444990]  ? __kmalloc_node+0x22d/0x2b0
[ 1211.445008]  zfsdev_ioctl+0x53/0xe0 [zfs]
[ 1211.445010]  __x64_sys_ioctl+0x83/0xb0
[ 1211.445011]  do_syscall_64+0x33/0x80
[ 1211.445013]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1211.445013] RIP: 0033:0x7efdc2440cc7
[ 1211.445013] RSP: 002b:00007ffffacc2268 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1211.445014] RAX: ffffffffffffffda RBX: 00000000043386c0 RCX: 00007efdc2440cc7
[ 1211.445015] RDX: 00007ffffacc2290 RSI: 0000000000005a04 RDI: 0000000000000018
[ 1211.445015] RBP: 00007ffffacc5880 R08: 00000000043799e0 R09: 00007efdc250abe0
[ 1211.445015] R10: 0000000000040030 R11: 0000000000000246 R12: 00000000043386c0
[ 1211.445016] R13: 0000000000000000 R14: 00007ffffacc2290 R15: 00007efdc141cc30
[ 1211.445018] INFO: task middlewared (wo:17721 blocked for more than 121 seconds.
[ 1211.445018]       Tainted: P           OE     5.10.120+truenas #1
[ 1211.445018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1211.445018] task:middlewared (wo state:D stack:    0 pid:17721 ppid:  2660 flags:0x00000000
[ 1211.445019] Call Trace:
[ 1211.445020]  __schedule+0x282/0x870
[ 1211.445022]  schedule+0x46/0xb0
[ 1211.445023]  schedule_preempt_disabled+0xa/0x10
[ 1211.445025]  __mutex_lock.constprop.0+0x133/0x460
[ 1211.445043]  spa_open_common+0x5e/0x450 [zfs]
[ 1211.445059]  spa_get_stats+0x54/0x460 [zfs]
[ 1211.445064]  ? __alloc_pages_nodemask+0x18f/0x340
[ 1211.823820]  ? __kmalloc_node+0x141/0x2b0
[ 1211.823822]  ? spl_kmem_alloc_impl+0xae/0xf0 [spl]
[ 1211.823841]  zfs_ioc_pool_stats+0x34/0x80 [zfs]
[ 1211.823858]  zfsdev_ioctl_common+0x6bc/0x8e0 [zfs]
[ 1211.823861]  ? __kmalloc_node+0x22d/0x2b0
[ 1211.846355]  zfsdev_ioctl+0x53/0xe0 [zfs]
[ 1211.846356]  __x64_sys_ioctl+0x83/0xb0
[ 1211.846357]  do_syscall_64+0x33/0x80
[ 1211.846358]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1211.846358] RIP: 0033:0x7f21efa3bcc7
[ 1211.846359] RSP: 002b:00007ffc4bed5038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1211.846359] RAX: ffffffffffffffda RBX: 0000000001da3770 RCX: 00007f21efa3bcc7
[ 1211.846360] RDX: 00007ffc4bed5060 RSI: 0000000000005a05 RDI: 0000000000000018
[ 1211.846360] RBP: 00007ffc4bed8650 R08: 000000000391f680 R09: 00007f21efb05be0
[ 1211.846360] R10: 0000000000010030 R11: 0000000000000246 R12: 00007ffc4bed5060
[ 1211.846361] R13: 00000000038fa440 R14: 0000000000000000 R15: 00007ffc4bed8664
[ 1211.846362] INFO: task middlewared (wo:17805 blocked for more than 121 seconds.
[ 1211.846363]       Tainted: P           OE     5.10.120+truenas #1
[ 1211.846363] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1211.846365] task:middlewared (wo state:D stack:    0 pid:17805 ppid:  2660 flags:0x00000000
[ 1211.932047] Call Trace:
[ 1211.932048]  __schedule+0x282/0x870
[ 1211.932049]  schedule+0x46/0xb0
[ 1211.932050]  schedule_preempt_disabled+0xa/0x10
[ 1211.932050]  __mutex_lock.constprop.0+0x133/0x460
[ 1211.932068]  spa_open_common+0x5e/0x450 [zfs]
[ 1211.932083]  spa_get_stats+0x54/0x460 [zfs]
[ 1211.932086]  ? __alloc_pages_nodemask+0x18f/0x340
[ 1211.972647]  ? __kmalloc_node+0x141/0x2b0
[ 1211.972649]  ? spl_kmem_alloc_impl+0xae/0xf0 [spl]
[ 1211.972667]  zfs_ioc_pool_stats+0x34/0x80 [zfs]
[ 1211.972683]  zfsdev_ioctl_common+0x6bc/0x8e0 [zfs]
[ 1211.972686]  ? __kmalloc_node+0x22d/0x2b0
[ 1211.995114]  zfsdev_ioctl+0x53/0xe0 [zfs]
[ 1211.995115]  __x64_sys_ioctl+0x83/0xb0
[ 1211.995116]  do_syscall_64+0x33/0x80
[ 1211.995117]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1211.995117] RIP: 0033:0x7f033d2c8cc7
[ 1211.995117] RSP: 002b:00007ffdeab53a48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1211.995118] RAX: ffffffffffffffda RBX: 0000000001fca5e0 RCX: 00007f033d2c8cc7
[ 1211.995118] RDX: 00007ffdeab53a70 RSI: 0000000000005a05 RDI: 0000000000000018
[ 1211.995118] RBP: 00007ffdeab57060 R08: 0000000003513350 R09: 00007f033d392be0
[ 1211.995119] R10: 0000000000010030 R11: 0000000000000246 R12: 00007ffdeab53a70
[ 1211.995119] R13: 00000000034ee4e0 R14: 0000000000000000 R15: 00007ffdeab57074
[ 1332.000895] INFO: task middlewared (wo:2755 blocked for more than 241 seconds.
[ 1332.008285]       Tainted: P           OE     5.10.120+truenas #1
[ 1332.014522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1332.022564] task:middlewared (wo state:D stack:    0 pid: 2755 ppid:  2660 flags:0x00004002
[ 1332.031108] Call Trace:
[ 1332.033656]  __schedule+0x282/0x870
[ 1332.037259]  schedule+0x46/0xb0
[ 1332.040527]  io_schedule+0x42/0x70
[ 1332.044009]  cv_wait_common+0xac/0x130 [spl]
[ 1332.048401]  ? add_wait_queue_exclusive+0x70/0x70
[ 1332.053259]  txg_wait_synced_impl+0xc9/0x110 [zfs]
[ 1332.058222]  txg_wait_synced+0xc/0x40 [zfs]
[ 1332.062574]  spa_load+0x14dc/0x1760 [zfs]
[ 1332.066711]  spa_load_best+0x54/0x2d0 [zfs]
[ 1332.071018]  spa_import+0x1e9/0x690 [zfs]
[ 1332.075160]  zfs_ioc_pool_import+0x12f/0x150 [zfs]
[ 1332.080092]  zfsdev_ioctl_common+0x6bc/0x8e0 [zfs]
[ 1332.085008]  ? __kmalloc_node+0x22d/0x2b0
[ 1332.089175]  zfsdev_ioctl+0x53/0xe0 [zfs]
[ 1332.093346]  __x64_sys_ioctl+0x83/0xb0
[ 1332.097225]  do_syscall_64+0x33/0x80
[ 1332.100960]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1332.106179] RIP: 0033:0x7fea68cbdcc7
[ 1332.109860] RSP: 002b:00007ffd9dd41c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1332.117644] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fea68cbdcc7
[ 1332.124982] RDX: 00007ffd9dd41d10 RSI: 0000000000005a02 RDI: 0000000000000018
[ 1332.132328] RBP: 00007ffd9dd45c00 R08: 0000000000000002 R09: 00007fea68d87be0
[ 1332.139641] R10: 0000000020000000 R11: 0000000000000246 R12: 0000000004941d30
[ 1332.146946] R13: 00007ffd9dd41d10 R14: 0000000002d98d18 R15: 00007fea5a0e7ca0
[ 1332.154332] INFO: task middlewared (wo:5331 blocked for more than 241 seconds.
[ 1332.161770]       Tainted: P           OE     5.10.120+truenas #1
[ 1332.168032] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1332.175997] task:middlewared (wo state:D stack:    0 pid: 5331 ppid:  2660 flags:0x00000000
[ 1332.184516] Call Trace:
[ 1332.187027]  __schedule+0x282/0x870
[ 1332.190614]  ? __kmalloc_node+0x141/0x2b0
[ 1332.194749]  schedule+0x46/0xb0
[ 1332.197996]  schedule_preempt_disabled+0xa/0x10
[ 1332.202678]  __mutex_lock.constprop.0+0x133/0x460
[ 1332.207561]  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
[ 1332.212941]  spa_all_configs+0x41/0x120 [zfs]
[ 1332.217427]  zfs_ioc_pool_configs+0x17/0x70 [zfs]
[ 1332.222303]  zfsdev_ioctl_common+0x6bc/0x8e0 [zfs]
[ 1332.227224]  ? __kmalloc_node+0x22d/0x2b0
[ 1332.231349]  ? recalibrate_cpu_khz+0x10/0x10
[ 1332.235769]  zfsdev_ioctl+0x53/0xe0 [zfs]
[ 1332.239851]  __x64_sys_ioctl+0x83/0xb0
[ 1332.243680]  do_syscall_64+0x33/0x80
[ 1332.247354]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1332.252549] RIP: 0033:0x7fc7b69e0cc7
[ 1332.256188] RSP: 002b:00007ffc531e4948 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1332.263969] RAX: ffffffffffffffda RBX: 0000000004ea1a20 RCX: 00007fc7b69e0cc7
[ 1332.271292] RDX: 00007ffc531e4970 RSI: 0000000000005a04 RDI: 0000000000000018
[ 1332.278571] RBP: 00007ffc531e7f60 R08: 0000000004eca340 R09: 00007fc7b6aaabe0
[ 1332.278571] R10: 0000000000010000 R11: 0000000000000246 R12: 0000000004ea1a20
[ 1332.278571] R13: 0000000000000000 R14: 00007ffc531e4970 R15: 00007fc7b59bcc30
[ 1332.278589] INFO: task middlewared (wo:17500 blocked for more than 241 seconds.
[ 1332.278589]       Tainted: P           OE     5.10.120+truenas #1
[ 1332.278590] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1332.278590] task:middlewared (wo state:D stack:    0 pid:17500 ppid:  2660 flags:0x00000000
[ 1332.278590] Call Trace:
[ 1332.278591]  __schedule+0x282/0x870
[ 1332.278592]  ? __kmalloc_node+0x141/0x2b0
[ 1332.278593]  schedule+0x46/0xb0
[ 1332.278593]  schedule_preempt_disabled+0xa/0x10
[ 1332.278594]  __mutex_lock.constprop.0+0x133/0x460
[ 1332.278596]  ? nvlist_xalloc.part.0+0x68/0xc0 [znvpair]
[ 1332.278612]  spa_all_configs+0x41/0x120 [zfs]
[ 1332.278628]  zfs_ioc_pool_configs+0x17/0x70 [zfs]
[ 1332.278643]  zfsdev_ioctl_common+0x6bc/0x8e0 [zfs]
[ 1332.278645]  ? __kmalloc_node+0x22d/0x2b0
[ 1332.278662]  zfsdev_ioctl+0x53/0xe0 [zfs]
[ 1332.278663]  __x64_sys_ioctl+0x83/0xb0
[ 1332.278663]  do_syscall_64+0x33/0x80
[ 1332.278664]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1332.278665] RIP: 0033:0x7efdc2440cc7
[ 1332.278665] RSP: 002b:00007ffffacc2268 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1332.278665] RAX: ffffffffffffffda RBX: 00000000043386c0 RCX: 00007efdc2440cc7
[ 1332.278666] RDX: 00007ffffacc2290 RSI: 0000000000005a04 RDI: 0000000000000018
[ 1332.278666] RBP: 00007ffffacc5880 R08: 00000000043799e0 R09: 00007efdc250abe0
[ 1332.278668] R10: 0000000000040030 R11: 0000000000000246 R12: 00000000043386c0
[ 1332.278668] R13: 0000000000000000 R14: 00007ffffacc2290 R15: 00007efdc141cc30
[ 1332.278669] INFO: task middlewared (wo:17721 blocked for more than 241 seconds.
[ 1332.278669]       Tainted: P           OE     5.10.120+truenas #1
[ 1332.278669] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1332.278669] task:middlewared (wo state:D stack:    0 pid:17721 ppid:  2660 flags:0x00000000
[ 1332.278670] Call Trace:
[ 1332.278671]  __schedule+0x282/0x870
[ 1332.278671]  schedule+0x46/0xb0
[ 1332.278672]  schedule_preempt_disabled+0xa/0x10
[ 1332.278673]  __mutex_lock.constprop.0+0x133/0x460
[ 1332.278689]  spa_open_common+0x5e/0x450 [zfs]
[ 1332.278703]  spa_get_stats+0x54/0x460 [zfs]
[ 1332.278705]  ? __alloc_pages_nodemask+0x18f/0x340
[ 1332.278706]  ? __kmalloc_node+0x141/0x2b0
[ 1332.278707]  ? spl_kmem_alloc_impl+0xae/0xf0 [spl]
[ 1332.278723]  zfs_ioc_pool_stats+0x34/0x80 [zfs]
[ 1332.278738]  zfsdev_ioctl_common+0x6bc/0x8e0 [zfs]
[ 1332.278739]  ? __kmalloc_node+0x22d/0x2b0
[ 1332.278756]  zfsdev_ioctl+0x53/0xe0 [zfs]
[ 1332.278757]  __x64_sys_ioctl+0x83/0xb0
[ 1332.278758]  do_syscall_64+0x33/0x80
[ 1332.278759]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1332.278760] RIP: 0033:0x7f21efa3bcc7
[ 1332.278760] RSP: 002b:00007ffc4bed5038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1332.278760] RAX: ffffffffffffffda RBX: 0000000001da3770 RCX: 00007f21efa3bcc7
[ 1332.278761] RDX: 00007ffc4bed5060 RSI: 0000000000005a05 RDI: 0000000000000018
[ 1332.278761] RBP: 00007ffc4bed8650 R08: 000000000391f680 R09: 00007f21efb05be0
[ 1332.278761] R10: 0000000000010030 R11: 0000000000000246 R12: 00007ffc4bed5060
[ 1332.278761] R13: 00000000038fa440 R14: 0000000000000000 R15: 00007ffc4bed8664


zpool events -vf output when it hangs:
Code:
Jul 11 2022 04:42:14.304262384 sysevent.fs.zfs.pool_import
        version = 0x0
        class = "sysevent.fs.zfs.pool_import"
        pool = "main-pool"
        pool_guid = 0xc7e79c3e06f95da7
        pool_state = 0x0
        pool_context = 0x0
        time = 0x62cbf076 0x1222acf0
        eid = 0x6

Jul 11 2022 04:49:11.946917318 sysevent.fs.zfs.pool_export
        version = 0x0
        class = "sysevent.fs.zfs.pool_export"
        pool = "main-pool"
        pool_guid = 0xc7e79c3e06f95da7
        pool_state = 0x0
        pool_context = 0x0
        time = 0x62cbf217 0x3870cfc6
        eid = 0x7

Jul 11 2022 04:49:11.954917237 sysevent.fs.zfs.config_sync
        version = 0x0
        class = "sysevent.fs.zfs.config_sync"
        pool = "main-pool"
        pool_guid = 0xc7e79c3e06f95da7
        pool_state = 0x5
        pool_context = 0x0
        time = 0x62cbf217 0x38eae175
        eid = 0x8

Jul 11 2022 04:51:21.709615018 ereport.fs.zfs.data
        class = "ereport.fs.zfs.data"
        ena = 0xe21036ed7300801
        detector = (embedded nvlist)
                version = 0x0
                scheme = "zfs"
                pool = 0xc7e79c3e06f95da7
        (end detector)
        pool = "main-pool"
        pool_guid = 0xc7e79c3e06f95da7
        pool_state = 0x0
        pool_context = 0x2
        pool_failmode = "continue"
        zio_err = 0x34
        zio_flags = 0x808801
        zio_stage = 0x1000000
        zio_pipeline = 0x1080000
        zio_delay = 0x0
        zio_timestamp = 0x0
        zio_delta = 0x0
        zio_priority = 0x0
        zio_objset = 0x0
        zio_object = 0x555
        zio_level = 0x0
        zio_blkid = 0x1
        time = 0x62cbf299 0x2a4bddaa
        eid = 0x9

Jul 11 2022 04:51:21.717614938 ereport.fs.zfs.io_failure
        class = "ereport.fs.zfs.io_failure"
        ena = 0xe21036ed7300801
        detector = (embedded nvlist)
                version = 0x0
                scheme = "zfs"
                pool = 0xc7e79c3e06f95da7
        (end detector)
        pool = "main-pool"
        pool_guid = 0xc7e79c3e06f95da7
        pool_state = 0x0
        pool_context = 0x2
        pool_failmode = "continue"
        time = 0x62cbf299 0x2ac5ef5a
        eid = 0xa
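For reference, the `time` fields in these events are two hex words (epoch seconds and nanoseconds), so they line up with the log timestamps; decoding the seconds word of the last event gives the same instant (the event log prints it in CDT, UTC-5). If I'm reading it right, `zio_err = 0x34` is decimal 52, which OpenZFS on Linux uses for checksum errors (ECKSUM).

```shell
# "time" fields are [epoch-seconds nanoseconds] in hex; decode the
# seconds word of eid 0xa (0x62cbf299). Shown in UTC here; the event
# log above prints the same instant as 04:51:21 CDT (UTC-5).
date -u -d @"$((16#62cbf299))" +%Y-%m-%dT%H:%M:%SZ   # → 2022-07-11T09:51:21Z
```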
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Can you show smartctl -a /dev/sdb2 and smartctl -a /dev/sdf2? This may indicate whether these disks have SMART errors, which can generate the "too many errors" VDEV member state.

Unfortunately, things may be too far gone to save this pool, which also appears to be your system dataset pool. The hang is occurring because the system can't open various corrupted files for writing. Try moving your system dataset to another pool. This may allow you to import your pool read-write.

Unfortunately, both members of VDEV mirror-1 are degraded, which is one of the drawbacks of a 2-way stripe of 2-way mirrors. If you have backups, you can try recreating this pool as a RAIDZ2, which will survive the loss of any 2 disks.
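If you go that route, the layout would be something along these lines. This is a sketch only: the device names are placeholders for your actual disks, and creating the pool destroys whatever is on them.

```shell
# Sketch only: recreate the pool as a 4-disk RAIDZ2 (placeholder device
# names -- substitute your real disks). RAIDZ2 carries two parity
# blocks per stripe, so any two disks can fail without data loss.
zpool create main-pool raidz2 /dev/sda /dev/sdb /dev/sde /dev/sdf
```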
 

pk7909

Cadet
Joined
Jul 11, 2022
Messages
5
Can you show smartctl -a /dev/sdb2 and smartctl -a /dev/sdf2? This may indicate whether these disks have SMART errors, which can generate the "too many errors" VDEV member state.

Unfortunately, things may be too far gone to save this pool, which also appears to be your system dataset pool. The hang is occurring because the system can't open various corrupted files for writing. Try moving your system dataset to another pool. This may allow you to import your pool read-write.

Unfortunately, both members of VDEV mirror-1 are degraded, which is one of the drawbacks of a 2-way stripe of 2-way mirrors. If you have backups, you can try recreating this pool as a RAIDZ2, which will survive the loss of any 2 disks.
smartctl -a /dev/sdb2
Code:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.120+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 870 QVO 4TB
Serial Number:    S5VYNJ0RA05947K
LU WWN Device Id: 5 002538 f31a26c2d
Firmware Version: SVQ02B6Q
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic, zeroed
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Jul 11 06:09:44 2022 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 320) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       5270
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       200
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       6
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   080   053   000    Old_age   Always       -       20
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       88
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       41602426442

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5268         -
# 2  Extended offline    Completed without error       00%      5203         -
# 3  Short offline       Completed without error       00%      5174         -
# 4  Short offline       Completed without error       00%      5058         -
# 5  Short offline       Completed without error       00%      5019         -
# 6  Extended offline    Aborted by host               20%      4998         -
# 7  Short offline       Completed without error       00%      4972         -
# 8  Extended offline    Completed without error       00%      4958         -
# 9  Short offline       Completed without error       00%      4925         -
#10  Short offline       Completed without error       00%      4900         -
#11  Short offline       Completed without error       00%      4877         -
#12  Short offline       Completed without error       00%      4853         -
#13  Short offline       Completed without error       00%      4830         -
#14  Short offline       Completed without error       00%      4829         -
#15  Short offline       Completed without error       00%      4805         -
#16  Short offline       Completed without error       00%      4781         -
#17  Short offline       Completed without error       00%      4757         -
#18  Short offline       Completed without error       00%      4733         -
#19  Short offline       Completed without error       00%      4709         -
#20  Short offline       Completed without error       00%      4687         -
#21  Extended offline    Completed without error       00%      4669         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  256        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



smartctl -a /dev/sdf2
Code:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.120+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 870 QVO 4TB
Serial Number:    S5VYNG0N700136Y
LU WWN Device Id: 5 002538 f7070747f
Firmware Version: SVQ02B6Q
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic, zeroed
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Jul 11 06:10:48 2022 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 320) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       14034
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       232
177 Wear_Leveling_Count     0x0013   098   098   000    Pre-fail  Always       -       19
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   081   045   000    Old_age   Always       -       19
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       109
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       94958918288

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     14032         -
# 2  Extended offline    Completed without error       00%     13967         -
# 3  Short offline       Completed without error       00%     13938         -
# 4  Short offline       Completed without error       00%     13822         -
# 5  Short offline       Completed without error       00%     13783         -
# 6  Extended offline    Aborted by host               20%     13762         -
# 7  Short offline       Completed without error       00%     13736         -
# 8  Extended offline    Completed without error       00%     13722         -
# 9  Short offline       Completed without error       00%     13689         -
#10  Short offline       Completed without error       00%     13664         -
#11  Short offline       Completed without error       00%     13641         -
#12  Short offline       Completed without error       00%     13617         -
#13  Short offline       Completed without error       00%     13594         -
#14  Short offline       Completed without error       00%     13592         -
#15  Short offline       Completed without error       00%     13568         -
#16  Short offline       Completed without error       00%     13544         -
#17  Short offline       Completed without error       00%     13520         -
#18  Short offline       Completed without error       00%     13496         -
#19  Short offline       Completed without error       00%     13473         -
#20  Short offline       Completed without error       00%     13451         -
#21  Extended offline    Completed without error       00%     13433         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  256        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Above are the two outputs from smartctl.


My system dataset is already in a different pool. Should I create another pool and move it to that instead?
1657538866074.png


Unfortunately, I lost my backups.

My biggest concern is losing the personal data I had in my zvol, which I used for an iSCSI share. I was able to back up the zvol, but I'm not sure where to go from here. Do I create a new zpool and transfer the zvol over, and would the TrueNAS GUI pick up the zvol?

Picture of the zvol backup currently stored on my Windows computer.
1657538930913.png


Thank you
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Since you already saved off the zvol image, this is just a RAW disk image file. The simplest recovery option is to convert that to a VHD disk image file, which Windows can mount natively, after which you can recover your personal data.
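If you have qemu-img available, the conversion can be done in one step. A sketch with hypothetical file names (and always run it against a copy of the image, never the only backup):

```shell
# Sketch with hypothetical file names; work on a copy of the backup.
# -f raw states the source format explicitly so qemu-img does not
# misdetect it, and "vpc" is qemu-img's driver name for the legacy
# VHD container.
qemu-img convert -f raw -O vpc zvol-backup.raw zvol-backup.vhd
```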

 

pk7909

Cadet
Joined
Jul 11, 2022
Messages
5
Thanks for that information; however, I have been having difficulty doing that. I initially converted the image to VHD and followed the directions in the link you posted. After I attached the VHD to my drives via Disk Management, I noticed it was split into 2 TB formatted and 1 TB unformatted. I was confused, so I did some research, read up on VHD formats, and learned that VHD has a 2 TB size limitation.
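That limit checks out, if I understand it right: the VHD format is commonly described as using 32-bit sector addressing, so with 512-byte sectors the largest image it can fully address is about 2 TiB (VHDX was introduced to raise this).

```shell
# Assumption: VHD's ~2 TB ceiling comes from 32-bit sector numbers.
# 2^32 sectors x 512 bytes per sector = 2 TiB.
echo $(( (2 ** 32) * 512 ))   # → 2199023255552 bytes = 2 TiB
```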

Afterwards, I downloaded qemu-img for Windows from https://cloudbase.it/qemu-img-windows/ and converted the raw image to VHDX. Even then, when I attach the VHDX to my drives, it shows up as 2 TB formatted and 1 TB unformatted. Note that it does attach, but the drive will not open or even allow me to assign a letter. I used the following command to convert to VHDX:
Code:
.\qemu-img.exe convert F:\iscsi -O vhdx -o subformat=dynamic F:\dest.vhdx


Am I doing something wrong?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Did you do the conversion on a copy? Otherwise, your backup may now be corrupted.
 

pk7909

Cadet
Joined
Jul 11, 2022
Messages
5
Did you do the conversion on a copy? Otherwise, your backup may now be corrupted.
Nope, I simply saved it.

Is there a way I can attach the zvol back to a ZFS pool if I create a new one?
 