TrueNAS SCALE: SSH/web login hangs after password, but I can SSH into a VM

Steasenburger

Explorer
Joined
Feb 12, 2020
Messages
52
Hi everyone,
I am having trouble connecting to my TrueNAS SCALE system.
It took me a while to notice, because all my Docker containers, which run in a VM, continued to function, but then I noticed that my cron jobs no longer get executed.
I tried to check what the problem was and log into my TrueNAS UI (both over the web and via SSH), but neither was possible.
Then I decided to restart the whole system in the hope of fixing the problem. Now I can successfully SSH into my VM, but SSH into TrueNAS itself hangs after this:
root@192.168.0.97's password:
Last login: Sat Dec 30 05:05:11 2023 from 192.168.0.81

TrueNAS (c) 2009-2023, iXsystems, Inc.
All rights reserved.
TrueNAS code is released under the modified BSD license with some
files copyrighted by (c) iXsystems, Inc.

For more information, documentation, help or support, go here:
http://truenas.com

Welcome to FreeNAS
As you can see, the password gets accepted and I see the first few messages, but then no prompt appears and my input is just ignored.
I also cannot use Ctrl+C, Ctrl+D, or anything else to close the shell.
The web UI doesn't load at all, which makes me think that something is wrong with the whole OS.

Unfortunately, I don't have direct physical access to the machine right now, so I cannot just hook up a display and keyboard to check what is going on.
However, I noticed that I can SCP files from the server to my local machine.
So I used this to copy the last syslog.
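For reference, the copy was just a plain scp pull from my laptop (same IP as in the login above; /var/log/syslog is the standard Debian location, so I assume it's the same on SCALE):

# pull the current syslog from the NAS to the local machine
scp root@192.168.0.97:/var/log/syslog ./truenas-syslog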

The whole log since the last reboot is too long to paste, so I uploaded it here: https://pastebin.com/7L2GT5k7
I think the most interesting parts are these:
Dec 29 17:46:16 truenas kernel: task:txg_sync state:D stack: 0 pid: 583 ppid: 2 flags:0x00004000
Dec 29 17:46:16 truenas kernel: Call Trace:
Dec 29 17:46:16 truenas kernel: <TASK>
Dec 29 17:46:16 truenas kernel: __schedule+0x2f0/0x950
Dec 29 17:46:16 truenas kernel: ? kvm_set_msr_common+0x7f8/0xf50 [kvm]
Dec 29 17:46:16 truenas kernel: schedule+0x5b/0xd0
Dec 29 17:46:16 truenas kernel: schedule_timeout+0x88/0x140
Dec 29 17:46:16 truenas kernel: ? __bpf_trace_tick_stop+0x10/0x10
Dec 29 17:46:16 truenas kernel: io_schedule_timeout+0x4c/0x80
Dec 29 17:46:16 truenas kernel: __cv_timedwait_common+0x128/0x160 [spl]
Dec 29 17:46:16 truenas kernel: ? finish_wait+0x90/0x90
Dec 29 17:46:16 truenas kernel: __cv_timedwait_io+0x15/0x20 [spl]
Dec 29 17:46:16 truenas kernel: zio_wait+0x109/0x220 [zfs]
Dec 29 17:46:16 truenas kernel: dsl_pool_sync+0xb3/0x410 [zfs]
Dec 29 17:46:16 truenas kernel: spa_sync_iterate_to_convergence+0xdb/0x1e0 [zfs]
Dec 29 17:46:16 truenas kernel: spa_sync+0x2e9/0x5d0 [zfs]
Dec 29 17:46:16 truenas kernel: txg_sync_thread+0x229/0x2a0 [zfs]
Dec 29 17:46:16 truenas kernel: ? txg_dispatch_callbacks+0xf0/0xf0 [zfs]
Dec 29 17:46:16 truenas kernel: thread_generic_wrapper+0x59/0x70 [spl]
Dec 29 17:46:16 truenas kernel: ? __thread_exit+0x20/0x20 [spl]
Dec 29 17:46:16 truenas kernel: kthread+0x127/0x150
Dec 29 17:46:16 truenas kernel: ? set_kthread_struct+0x50/0x50
Dec 29 17:46:16 truenas kernel: ret_from_fork+0x22/0x30
Dec 29 17:46:16 truenas kernel: </TASK>
Dec 29 17:46:16 truenas kernel: task:asyncio_loop state:D stack: 0 pid: 1856 ppid: 1 flags:0x00000000
Dec 29 17:46:16 truenas kernel: Call Trace:
Dec 29 17:46:16 truenas kernel: <TASK>
Dec 29 17:46:16 truenas kernel: __schedule+0x2f0/0x950
Dec 29 17:46:16 truenas kernel: ? dbuf_read_impl.constprop.0+0x284/0x380 [zfs]
Dec 29 17:46:16 truenas kernel: schedule+0x5b/0xd0
Dec 29 17:46:16 truenas kernel: schedule_timeout+0x88/0x140
Dec 29 17:46:16 truenas kernel: ? __bpf_trace_tick_stop+0x10/0x10
Dec 29 17:46:16 truenas kernel: io_schedule_timeout+0x4c/0x80
Dec 29 17:46:16 truenas kernel: __cv_timedwait_common+0x128/0x160 [spl]
Dec 29 17:46:16 truenas kernel: ? finish_wait+0x90/0x90
Dec 29 17:46:16 truenas kernel: __cv_timedwait_io+0x15/0x20 [spl]
Dec 29 17:46:16 truenas kernel: zio_wait+0x109/0x220 [zfs]
Dec 29 17:46:16 truenas kernel: dmu_buf_hold+0x5f/0x90 [zfs]
Dec 29 17:46:16 truenas kernel: ? spl_kmem_cache_free+0xbc/0x120 [spl]
Dec 29 17:46:16 truenas kernel: zap_lockdir+0x4e/0xc0 [zfs]
Dec 29 17:46:16 truenas kernel: zap_lookup_norm+0x59/0xd0 [zfs]
Dec 29 17:46:16 truenas kernel: ? zfs_zget+0x11a/0x280 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_match_find.constprop.0+0x75/0x100 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_dirent_lock+0x427/0x5a0 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_dirlook+0x88/0x2b0 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_lookup+0x24d/0x400 [zfs]
Dec 29 17:46:16 truenas kernel: zpl_lookup+0xc6/0x260 [zfs]
Dec 29 17:46:16 truenas kernel: ? newidle_balance+0x127/0x400
Dec 29 17:46:16 truenas kernel: ? __cond_resched+0x16/0x50
Dec 29 17:46:16 truenas kernel: __lookup_slow+0x88/0x150
Dec 29 17:46:16 truenas kernel: walk_component+0x158/0x1d0
Dec 29 17:46:16 truenas kernel: link_path_walk.part.0+0x253/0x3c0
Dec 29 17:46:16 truenas kernel: ? path_init+0x2c0/0x3f0
Dec 29 17:46:16 truenas kernel: path_lookupat+0x43/0x1c0
Dec 29 17:46:16 truenas kernel: filename_lookup+0xcb/0x1d0
Dec 29 17:46:16 truenas kernel: ? __cond_resched+0x16/0x50
Dec 29 17:46:16 truenas kernel: ? aa_sk_perm+0x3e/0x1b0
Dec 29 17:46:16 truenas kernel: ? __check_object_size+0x146/0x160
Dec 29 17:46:16 truenas kernel: ? strncpy_from_user+0x3f/0x150
Dec 29 17:46:16 truenas kernel: ? getname_flags.part.0+0x45/0x1b0
Dec 29 17:46:16 truenas kernel: user_path_at_empty+0x3a/0x60
Dec 29 17:46:16 truenas kernel: vfs_statx+0x74/0x130
Dec 29 17:46:16 truenas kernel: __do_sys_newstat+0x39/0x70
Dec 29 17:46:16 truenas kernel: ? handle_mm_fault+0xcf/0x2b0
Dec 29 17:46:16 truenas kernel: ? do_user_addr_fault+0x1db/0x670
Dec 29 17:46:16 truenas kernel: ? exit_to_user_mode_prepare+0x3b/0x1f0
Dec 29 17:46:16 truenas kernel: do_syscall_64+0x3b/0xc0
Dec 29 17:46:16 truenas kernel: entry_SYSCALL_64_after_hwframe+0x61/0xcb
Dec 29 17:46:16 truenas kernel: RIP: 0033:0x7fddd0736d66
Dec 29 17:46:16 truenas kernel: RSP: 002b:00007ffc6b5fe428 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
Dec 29 17:46:16 truenas kernel: RAX: ffffffffffffffda RBX: 0000000001bc56c0 RCX: 00007fddd0736d66
Dec 29 17:46:16 truenas kernel: RDX: 00007ffc6b5fe4c0 RSI: 00007ffc6b5fe4c0 RDI: 00007fdd7c154600
Dec 29 17:46:16 truenas kernel: RBP: 00007fddd0534e00 R08: 0000000000000001 R09: 0000000000000000
Dec 29 17:46:16 truenas kernel: R10: 0000000000512108 R11: 0000000000000246 R12: 00007fddd05a8810
Dec 29 17:46:16 truenas kernel: R13: 00000000ffffff9c R14: 00007fddd0534e50 R15: 0000000001bc3b40
Dec 29 17:46:16 truenas kernel: </TASK>
Dec 29 17:46:16 truenas kernel: task:loop_monitor state:D stack: 0 pid: 1886 ppid: 1 flags:0x00000000
Dec 29 17:46:16 truenas kernel: Call Trace:
Dec 29 17:46:16 truenas kernel: <TASK>
Dec 29 17:46:16 truenas kernel: __schedule+0x2f0/0x950
Dec 29 17:46:16 truenas kernel: ? dbuf_read_impl.constprop.0+0x284/0x380 [zfs]
Dec 29 17:46:16 truenas kernel: schedule+0x5b/0xd0
Dec 29 17:46:16 truenas kernel: schedule_timeout+0x88/0x140
Dec 29 17:46:16 truenas kernel: ? __bpf_trace_tick_stop+0x10/0x10
Dec 29 17:46:16 truenas kernel: io_schedule_timeout+0x4c/0x80
Dec 29 17:46:16 truenas kernel: __cv_timedwait_common+0x128/0x160 [spl]
Dec 29 17:46:16 truenas kernel: ? finish_wait+0x90/0x90
Dec 29 17:46:16 truenas kernel: __cv_timedwait_io+0x15/0x20 [spl]
Dec 29 17:46:16 truenas kernel: zio_wait+0x109/0x220 [zfs]
Dec 29 17:46:16 truenas kernel: dmu_buf_hold_array_by_dnode+0x38f/0x530 [zfs]
Dec 29 17:46:16 truenas kernel: dmu_read_uio_dnode+0x47/0x100 [zfs]
Dec 29 17:46:16 truenas kernel: ? avl_add+0x46/0x90 [zavl]
Dec 29 17:46:16 truenas kernel: dmu_read_uio_dbuf+0x42/0x60 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_read+0x13a/0x3d0 [zfs]
Dec 29 17:46:16 truenas kernel: zpl_iter_read+0xa3/0x110 [zfs]
Dec 29 17:46:16 truenas kernel: new_sync_read+0x119/0x1b0
Dec 29 17:46:16 truenas kernel: vfs_read+0xf6/0x190
Dec 29 17:46:16 truenas kernel: ksys_read+0x5f/0xe0
Dec 29 17:46:16 truenas kernel: do_syscall_64+0x3b/0xc0
Dec 29 17:46:16 truenas kernel: entry_SYSCALL_64_after_hwframe+0x61/0xcb
Dec 29 17:46:16 truenas kernel: RIP: 0033:0x7fddd09cf08c
Dec 29 17:46:16 truenas kernel: RSP: 002b:00007fddc1ec6b70 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Dec 29 17:46:16 truenas kernel: RAX: ffffffffffffffda RBX: 0000000003517640 RCX: 00007fddd09cf08c
Dec 29 17:46:16 truenas kernel: RDX: 0000000000001400 RSI: 00007fddbc000fd0 RDI: 0000000000000031
Dec 29 17:46:16 truenas kernel: RBP: 00007fddc1ec9680 R08: 0000000000000000 R09: 0000000000000000
Dec 29 17:46:16 truenas kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000001400
Dec 29 17:46:16 truenas kernel: R13: 0000000000000031 R14: 00007fddbc000fd0 R15: 0000000000918b20
Dec 29 17:46:16 truenas kernel: </TASK>
Dec 29 17:46:16 truenas kernel: task:rpc.mountd state:D stack: 0 pid: 4595 ppid: 4590 flags:0x00000000
Dec 29 17:46:16 truenas kernel: Call Trace:
Dec 29 17:46:16 truenas kernel: <TASK>
Dec 29 17:46:16 truenas kernel: __schedule+0x2f0/0x950
Dec 29 17:46:16 truenas kernel: schedule+0x5b/0xd0
Dec 29 17:46:16 truenas kernel: rwsem_down_write_slowpath+0x197/0x4e0
Dec 29 17:46:16 truenas kernel: path_openat+0x35e/0x12d0
Dec 29 17:46:16 truenas kernel: ? refcount_dec_and_lock+0x12/0x90
Dec 29 17:46:16 truenas kernel: do_filp_open+0xa9/0x150
Dec 29 17:46:16 truenas kernel: ? __fput+0x101/0x250
Dec 29 17:46:16 truenas kernel: ? __check_object_size+0x146/0x160
Dec 29 17:46:16 truenas kernel: do_sys_openat2+0x9b/0x160
Dec 29 17:46:16 truenas kernel: __x64_sys_openat+0x54/0xa0
Dec 29 17:46:16 truenas kernel: do_syscall_64+0x3b/0xc0
Dec 29 17:46:16 truenas kernel: entry_SYSCALL_64_after_hwframe+0x61/0xcb
Dec 29 17:46:16 truenas kernel: RIP: 0033:0x7fa6b822f5a7
Dec 29 17:46:16 truenas kernel: RSP: 002b:00007ffdfe947740 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
Dec 29 17:46:16 truenas kernel: RAX: ffffffffffffffda RBX: 000055ca41d35560 RCX: 00007fa6b822f5a7
Dec 29 17:46:16 truenas kernel: RDX: 0000000000000042 RSI: 000055ca41d2d06a RDI: 00000000ffffff9c
Dec 29 17:46:16 truenas kernel: RBP: 000055ca41d2d06a R08: 0000000000000000 R09: 0031382e302e3836
Dec 29 17:46:16 truenas kernel: R10: 0000000000000180 R11: 0000000000000246 R12: 0000000000000042
Dec 29 17:46:16 truenas kernel: R13: 000055ca41d2d06a R14: 00007ffdfe948330 R15: 00007ffdfe949420
Dec 29 17:46:16 truenas kernel: </TASK>
Dec 29 17:46:16 truenas kernel: task:rpc.mountd state:D stack: 0 pid: 4599 ppid: 4590 flags:0x00000000
Dec 29 17:46:16 truenas kernel: Call Trace:
Dec 29 17:46:16 truenas kernel: <TASK>
Dec 29 17:46:16 truenas kernel: __schedule+0x2f0/0x950
Dec 29 17:46:16 truenas kernel: ? dbuf_read_impl.constprop.0+0x284/0x380 [zfs]
Dec 29 17:46:16 truenas kernel: schedule+0x5b/0xd0
Dec 29 17:46:16 truenas kernel: schedule_timeout+0x88/0x140
Dec 29 17:46:16 truenas kernel: ? __bpf_trace_tick_stop+0x10/0x10
Dec 29 17:46:16 truenas kernel: io_schedule_timeout+0x4c/0x80
Dec 29 17:46:16 truenas kernel: __cv_timedwait_common+0x128/0x160 [spl]
Dec 29 17:46:16 truenas kernel: ? finish_wait+0x90/0x90
Dec 29 17:46:16 truenas kernel: __cv_timedwait_io+0x15/0x20 [spl]
Dec 29 17:46:16 truenas kernel: zio_wait+0x109/0x220 [zfs]
Dec 29 17:46:16 truenas kernel: dnode_hold_impl+0x10e/0xb80 [zfs]
Dec 29 17:46:16 truenas kernel: ? mutex_lock+0xe/0x30
Dec 29 17:46:16 truenas kernel: ? zfs_znode_hold_enter+0x113/0x160 [zfs]
Dec 29 17:46:16 truenas kernel: ? zap_lookup_impl+0x84/0x2c0 [zfs]
Dec 29 17:46:16 truenas kernel: ? __cond_resched+0x16/0x50
Dec 29 17:46:16 truenas kernel: dmu_bonus_hold+0x33/0x90 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_zget+0x56/0x280 [zfs]
Dec 29 17:46:16 truenas kernel: ? zfs_match_find.constprop.0+0x75/0x100 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_dirent_lock+0x453/0x5a0 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_dirlook+0x88/0x2b0 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_lookup+0x24d/0x400 [zfs]
Dec 29 17:46:16 truenas kernel: zpl_lookup+0xc6/0x260 [zfs]
Dec 29 17:46:16 truenas kernel: ? __d_lookup+0x73/0x130
Dec 29 17:46:16 truenas kernel: path_openat+0x611/0x12d0
Dec 29 17:46:16 truenas kernel: do_filp_open+0xa9/0x150
Dec 29 17:46:16 truenas kernel: ? fcntl_setlk+0x14d/0x2d0
Dec 29 17:46:16 truenas kernel: ? __check_object_size+0x146/0x160
Dec 29 17:46:16 truenas kernel: do_sys_openat2+0x9b/0x160
Dec 29 17:46:16 truenas kernel: __x64_sys_openat+0x54/0xa0
Dec 29 17:46:16 truenas kernel: do_syscall_64+0x3b/0xc0
Dec 29 17:46:16 truenas kernel: entry_SYSCALL_64_after_hwframe+0x61/0xcb
Dec 29 17:46:16 truenas kernel: RIP: 0033:0x7fa6b822f5a7
Dec 29 17:46:16 truenas kernel: RSP: 002b:00007ffdfe9476a0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
Dec 29 17:46:16 truenas kernel: RAX: ffffffffffffffda RBX: 000055ca42561940 RCX: 00007fa6b822f5a7
Dec 29 17:46:16 truenas kernel: RDX: 0000000000000002 RSI: 000055ca41d2db4f RDI: 00000000ffffff9c
Dec 29 17:46:16 truenas kernel: RBP: 000055ca41d2db4f R08: 0000000000000000 R09: 0000000000000001
Dec 29 17:46:16 truenas kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
Dec 29 17:46:16 truenas kernel: R13: 000055ca42561940 R14: 0000000000000001 R15: 000000000000000d
Dec 29 17:46:16 truenas kernel: </TASK>
Dec 29 17:46:16 truenas kernel: task:rpc.mountd state:D stack: 0 pid: 4601 ppid: 4590 flags:0x00000000
Dec 29 17:46:16 truenas kernel: Call Trace:
Dec 29 17:46:16 truenas kernel: <TASK>
Dec 29 17:46:16 truenas kernel: __schedule+0x2f0/0x950
Dec 29 17:46:16 truenas kernel: schedule+0x5b/0xd0
Dec 29 17:46:16 truenas kernel: rwsem_down_write_slowpath+0x197/0x4e0
Dec 29 17:46:16 truenas kernel: path_openat+0x35e/0x12d0
Dec 29 17:46:16 truenas kernel: ? refcount_dec_and_lock+0x12/0x90
Dec 29 17:46:16 truenas kernel: do_filp_open+0xa9/0x150
Dec 29 17:46:16 truenas kernel: ? __fput+0x101/0x250
Dec 29 17:46:16 truenas kernel: ? __check_object_size+0x146/0x160
Dec 29 17:46:16 truenas kernel: do_sys_openat2+0x9b/0x160
Dec 29 17:46:16 truenas kernel: __x64_sys_openat+0x54/0xa0
Dec 29 17:46:16 truenas kernel: do_syscall_64+0x3b/0xc0
Dec 29 17:46:16 truenas kernel: entry_SYSCALL_64_after_hwframe+0x61/0xcb
Dec 29 17:46:16 truenas kernel: RIP: 0033:0x7fa6b822f5a7
Dec 29 17:46:16 truenas kernel: RSP: 002b:00007ffdfe947740 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
Dec 29 17:46:16 truenas kernel: RAX: ffffffffffffffda RBX: 000055ca41d35560 RCX: 00007fa6b822f5a7
Dec 29 17:46:16 truenas kernel: RDX: 0000000000000042 RSI: 000055ca41d2d06a RDI: 00000000ffffff9c
Dec 29 17:46:16 truenas kernel: RBP: 000055ca41d2d06a R08: 0000000000000000 R09: 0031382e302e3836
Dec 29 17:46:16 truenas kernel: R10: 0000000000000180 R11: 0000000000000246 R12: 0000000000000042
Dec 29 17:46:16 truenas kernel: R13: 000055ca41d2d06a R14: 00007ffdfe948330 R15: 00007ffdfe949420
Dec 29 17:46:16 truenas kernel: </TASK>
Dec 29 17:46:16 truenas kernel: task:nginx state:D stack: 0 pid: 4865 ppid: 4862 flags:0x00000000
Dec 29 17:46:16 truenas kernel: Call Trace:
Dec 29 17:46:16 truenas kernel: <TASK>
Dec 29 17:46:16 truenas kernel: __schedule+0x2f0/0x950
Dec 29 17:46:16 truenas kernel: ? dbuf_read_impl.constprop.0+0x284/0x380 [zfs]
Dec 29 17:46:16 truenas kernel: schedule+0x5b/0xd0
Dec 29 17:46:16 truenas kernel: schedule_timeout+0x88/0x140
Dec 29 17:46:16 truenas kernel: ? __bpf_trace_tick_stop+0x10/0x10
Dec 29 17:46:16 truenas kernel: io_schedule_timeout+0x4c/0x80
Dec 29 17:46:16 truenas kernel: __cv_timedwait_common+0x128/0x160 [spl]
Dec 29 17:46:16 truenas kernel: ? finish_wait+0x90/0x90
Dec 29 17:46:16 truenas kernel: __cv_timedwait_io+0x15/0x20 [spl]
Dec 29 17:46:16 truenas kernel: zio_wait+0x109/0x220 [zfs]
Dec 29 17:46:16 truenas kernel: dnode_hold_impl+0x10e/0xb80 [zfs]
Dec 29 17:46:16 truenas kernel: ? mutex_lock+0xe/0x30
Dec 29 17:46:16 truenas kernel: ? zfs_znode_hold_enter+0x113/0x160 [zfs]
Dec 29 17:46:16 truenas kernel: ? zap_lookup_impl+0x84/0x2c0 [zfs]
Dec 29 17:46:16 truenas kernel: ? __cond_resched+0x16/0x50
Dec 29 17:46:16 truenas kernel: dmu_bonus_hold+0x33/0x90 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_zget+0x56/0x280 [zfs]
Dec 29 17:46:16 truenas kernel: ? zfs_match_find.constprop.0+0x75/0x100 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_dirent_lock+0x453/0x5a0 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_dirlook+0x88/0x2b0 [zfs]
Dec 29 17:46:16 truenas kernel: zfs_lookup+0x24d/0x400 [zfs]
Dec 29 17:46:16 truenas kernel: zpl_lookup+0xc6/0x260 [zfs]
Dec 29 17:46:16 truenas kernel: ? dmu_read_impl+0x11b/0x180 [zfs]
Dec 29 17:46:16 truenas kernel: __lookup_slow+0x88/0x150
Dec 29 17:46:16 truenas kernel: walk_component+0x158/0x1d0
Dec 29 17:46:16 truenas kernel: link_path_walk.part.0+0x253/0x3c0
Dec 29 17:46:16 truenas kernel: ? path_init+0x2c0/0x3f0
Dec 29 17:46:16 truenas kernel: path_lookupat+0x43/0x1c0
Dec 29 17:46:16 truenas kernel: filename_lookup+0xcb/0x1d0
Dec 29 17:46:16 truenas kernel: ? __check_object_size+0x146/0x160
Dec 29 17:46:16 truenas kernel: ? strncpy_from_user+0x3f/0x150
Dec 29 17:46:16 truenas kernel: ? getname_flags.part.0+0x45/0x1b0
Dec 29 17:46:16 truenas kernel: user_path_at_empty+0x3a/0x60
Dec 29 17:46:16 truenas kernel: vfs_statx+0x74/0x130
Dec 29 17:46:16 truenas kernel: __do_sys_newstat+0x39/0x70
Dec 29 17:46:16 truenas kernel: ? handle_mm_fault+0xcf/0x2b0
Dec 29 17:46:16 truenas kernel: ? do_user_addr_fault+0x1db/0x670
Dec 29 17:46:16 truenas kernel: ? exit_to_user_mode_prepare+0x3b/0x1f0
Dec 29 17:46:16 truenas kernel: do_syscall_64+0x3b/0xc0
Dec 29 17:46:16 truenas kernel: entry_SYSCALL_64_after_hwframe+0x61/0xcb
Dec 29 17:46:16 truenas kernel: RIP: 0033:0x7feb68bccd66
Dec 29 17:46:16 truenas kernel: RSP: 002b:00007ffe7c26fc18 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
Dec 29 17:46:16 truenas kernel: RAX: ffffffffffffffda RBX: 00007ffe7c26fe30 RCX: 00007feb68bccd66
Dec 29 17:46:16 truenas kernel: RDX: 00007ffe7c26fc90 RSI: 00007ffe7c26fc90 RDI: 000055a944a64238
Dec 29 17:46:16 truenas kernel: RBP: 00007ffe7c26fdb0 R08: 0000000000000001 R09: 00007ffe7c26fdb0
Dec 29 17:46:16 truenas kernel: R10: 000055a944a68bb8 R11: 0000000000000246 R12: 00007ffe7c26fc90
Dec 29 17:46:16 truenas kernel: R13: 000055a944a633c0 R14: 00007ffe7c26fe30 R15: 0000000000000003
Dec 29 17:46:16 truenas kernel: </TASK>
Dec 29 17:46:55 truenas kernel: sd 9:0:0:0: [sdc] tag#11 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=100s
Dec 29 17:46:55 truenas kernel: sd 9:0:0:0: [sdc] tag#11 Sense Key : Hardware Error [current]
Dec 29 17:46:55 truenas kernel: sd 9:0:0:0: [sdc] tag#11 ASC=0x44 <<vendor>>ASCQ=0x81
Dec 29 17:46:55 truenas kernel: sd 9:0:0:0: [sdc] tag#11 CDB: Read(10) 28 00 01 7b 76 f0 00 00 18 00
<more of those ...>
Dec 29 17:46:55 truenas kernel: zio pool=boot-pool vdev=/dev/sdc3 error=121 type=1 offset=12192702464 size=12288 flags=180880
Dec 29 17:46:55 truenas kernel: WARNING: Pool 'boot-pool' has encountered an uncorrectable I/O failure and has been suspended.
Dec 29 17:46:55 truenas kernel: WARNING: Pool 'boot-pool' has encountered an uncorrectable I/O failure and has been suspended.
<more of those...>
It looks like my boot device (an SSD in an M.2 USB enclosure, which I replaced only a few months ago) has failed.
The problem is that I'm not quite sure any more which device belongs to which pool.
Is there maybe a file I could copy to check this, since I also cannot execute commands via SSH (that hangs as well)?
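One idea I had (purely a guess on my part, since I can't run anything on the box): SCP the ZFS pool cache file off the machine and inspect it locally with zdb, which can print the cached pool configurations including the vdev device paths. On TrueNAS the cache file may live at /data/zfs/zpool.cache rather than the stock /etc/zfs/zpool.cache.

scp root@192.168.0.97:/data/zfs/zpool.cache .
# -U points zdb at the copied cachefile, -C dumps the cached pool configs
zdb -C -U ./zpool.cache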

And why is my VM still working, even though the NFS shares from TrueNAS, and especially their permissions, seem to be messed up somehow (everything is owned by root now)?
What would be the right way to fix this problem?
 

Steasenburger

Explorer
Joined
Feb 12, 2020
Messages
52
Any ideas? I would really appreciate some help, since this has made our NAS, and all the services that run on it, unusable for a few days now.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
With your boot pool effectively offline, you're lucky anything works at all...

No physical access and inability to use any commands in SSH (despite SSH copy somehow working) puts you in the SOL category, I suspect.

why is my VM still working?
If all the files needed for it are loaded into memory and/or in a data pool that isn't broken, it can still be getting what it needs... again, I'm surprised to see that the system hasn't crashed, but that's UNIXy behavior... just keep running and don't care.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
1. Do you have a config backup?
2. Please post your hardware as per forum rules
 

Steasenburger

Explorer
Joined
Feb 12, 2020
Messages
52
With your boot pool effectively offline, you're lucky anything works at all...

No physical access and inability to use any commands in SSH (despite SSH copy somehow working) puts you in the SOL category, I suspect.


If all the files needed for it are loaded into memory and/or in a data pool that isn't broken, it can still be getting what it needs... again, I'm surprised to see that the system hasn't crashed, but that's UNIXy behavior... just keep running and don't care.
Thanks for the explanation. I guess SOL means "shit outta luck", right? ^^
Actually, in the meantime, the SSH access has further degraded.
I don't see the password prompt any more, and the VM seems to be down as well now, although I can still ping the machine.

And one of my family members has offered to do some "remote-on-site" debugging in the next few days. Any advice on what to check first and how to proceed?
1. Do you have a config backup?
2. Please post your hardware as per forum rules
I have a config file backup on my laptop, but it is rather old (TrueNAS 13.0, from before I migrated to SCALE), so I suspect it won't help much.
There are probably more recent config files on another machine, but I don't really have access to that one right now.
I've definitely learnt my lesson about how important it is to regularly create and back up those config files.

If the SSD is indeed dead, is it somehow possible to restore the boot configuration from the system dataset, maybe?
I mean, somehow the system was able to boot. In a more or less corrupt state and not for a long time, but still...

I can only roughly describe my hardware right now, but I think the most important part is the Intenso boot SSD in an Eluteng M.2 enclosure,
both of which I bought together in May this year.
I have some slight hope that only the M.2 enclosure died, and the SSD is still fine, but I would probably need another SSD enclosure to test this :/

Apart from that, the NAS consists of four 2 TB SSDs, one Kingston 256 GB SSD where the system dataset is stored, and an AMD Ryzen 5600 CPU.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Assuming you are correct about the boot pool dying - which the evidence supports - you are not in a totally bad place.
  1. Replace the boot pool
  2. Reinstall TrueNAS and import the pool (see the sketch after this list) - you should still have all your data
  3. Recreate the VM - you still have the ZVOL, so that SHOULD be simple
  4. Recreate any shares - that's the part of the config you don't have
  5. Install (and run on a regular basis) @joeschmuck 's multi-report script and configure it to email you a disk report AND a copy of the config every so often (I do it every day). Multi-Report
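For step 2, the import can be done entirely from the UI (Storage > Import Pool), but from a shell it would be roughly this - illustrative only, "tank" stands in for whatever your pool is actually called:

# list pools that are visible to the system but not yet imported
zpool import
# import by name; -f may be needed since the pool was last used by the old install
zpool import -f tank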

Alternatively - and I am unsure of the exact details here (like the specific location) - once you complete stage 2 above, and depending on where the system dataset was (your 256GB SSD), you may be able to recover a config file from the pool, which you can then import. I think this depends on the system dataset location, which, from the sounds of it, you should still have.
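From memory - treat this as a rough pointer rather than gospel - the system dataset keeps dated copies of the config database under a configs-<uuid> directory, written on upgrades, so after the reinstall something like this might turn one up:

# the system dataset is normally mounted under /var/db/system once the pool is imported
find /var/db/system -name '*.db' 2>/dev/null
# if a dated .db copy shows up, upload it via System Settings > General > Manage Configuration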
 

Steasenburger

Explorer
Joined
Feb 12, 2020
Messages
52
Thanks for the detailed instructions about how to restore the system. Sounds doable indeed.
I just need to get a new device for my boot-pool (if the old SSD is indeed corrupt).
Are there any recommendations for boot-pool devices that are not too expensive (I'm not a business user)?

I had two USB sticks before, but one of them died rather quickly, and I thought an SSD + USB enclosure should work better.
But apparently it's not that simple.

If anybody can provide me some information about how to get the system configuration from the system dataset, I would be grateful :)

And as a last question: is it required/recommended to have the system dataset on a mirrored pool as well?
If I understand it correctly from the docs, the "only" thing that would get lost in case of a failure would be...
debugging core files, encryption keys for encrypted pools, and Samba4 metadata such as the user and group cache and share level permissions

Edit: And thanks for the tip about the multi-report script. I will definitely install it once everything is running again.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
An SSD with a USB bridge ought to be fine.
 

Steasenburger

Explorer
Joined
Feb 12, 2020
Messages
52
Okay, I see.
I will contact the manufacturer of the SSD/USB bridge and ask about a warranty claim.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Test it first on something else - with another drive
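If you can attach the enclosure to any Linux box, a quick sanity check would be something like this (the -d sat flag is often needed so smartctl can talk through a USB bridge; /dev/sdX is a placeholder for whatever device node the drive shows up as):

# check whether the drive behind the USB bridge answers SMART queries at all
smartctl -d sat -a /dev/sdX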
 