Our Setup
Hardware
BACKGROUND
We have a box with 60 hard drives running FreeNAS 9.10.2 U1 that has been working fine for a long time. Over the last few months, we noticed the Active Directory connection would become out of sync, but reconnecting it fixed the issues. I have a backup of the config I took on Saturday when the server was working and accessible.
BEHAVIOR WE'RE EXPERIENCING
We were seeing source code in the status log in the gui. Rebooted the box and everything appeared fine. Shares were accessible, etc. Had somebody connect in to check our firmware / drivers and install in general, and the server stopped responding after querying for the firmware of the HBA cards. When we reboot, it would go to kernal panic.
STEPS TAKEN
KERNAL PANIC MESSAGE
Panic: solaris assert: offset + size <= sm->sm_start + sm->sm_size (0x64060634802000 <= 0x230000000000), file: /freenas-9.10-releng/_BE/os/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c, line 119
cpuid = 0
KBD: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe201cde9560
kdb_backtrace () at kbd_backtrace+0x39/frame 0xfffffe201cde9610
vpanic() at vpanic+ox126/frame 0xfffffe201cde9650
panic() at panic+0x43/frame 0xfffffe201cde96b0
assfail3() at assfail3+0x43/frame 0xfffffe201cde96d0
space_map_load() at space_map_load+0x372/frame 0xfffffe201cde9750
metaslab_load() at metaslab_load+0x2e/frame 0xfffffe201cde9770
metaslab_alloc() at metaslab_alloc+0x857/frame 0xfffffe201cde98c0
zio_dva_allocate() at zio_dva_allocate+0x85/frame 0xfffffe201cde9990
zio_execute() at zio_execute+0x111/frame 0xfffffe201cde99f0
taskqueue_run_locked() at taskqueue_run_locked+0xe5/frame 0xfffffe201cde9a40
taskqueue_thread_loop() at taskqueue_thread_loop+0xa8/frame 0xfffffe201cde9a70
fork_exit() at fork_exit+0x9a/frame 0xfffffe201cde9ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe201cde9ab0
--- trap 0, rip = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 101016 ]
Stopped at kdb_enter+0x3e: movq $0, kdb_why
db>
Hardware
- 45 Drives Turbo 60 XL
- 60 SATA drives (6 10 disk vdevs raidz2)
- 8 cache drives
- 2 ssds for config
- 1 flash drive with backup config and cache
- FreeNAS 9.10.2 U1
BACKGROUND
We have a box with 60 hard drives running FreeNAS 9.10.2 U1 that has been working fine for a long time. Over the last few months, we noticed the Active Directory connection would become out of sync, but reconnecting it fixed the issues. I have a backup of the config I took on Saturday when the server was working and accessible.
BEHAVIOR WE'RE EXPERIENCING
We were seeing source code in the status log in the gui. Rebooted the box and everything appeared fine. Shares were accessible, etc. Had somebody connect in to check our firmware / drivers and install in general, and the server stopped responding after querying for the firmware of the HBA cards. When we reboot, it would go to kernal panic.
STEPS TAKEN
- After trying to reboot, same symptoms persist
- Unplugged one of the SATA boot drives (to leave it intact)
- Installed new copy of FreeNAS using 9.10.2 U1 ISO
- Configured IP address and logged into the GUI
- Imported volume
- Applied backup config from Saturday
- System kernal panics
KERNAL PANIC MESSAGE
Panic: solaris assert: offset + size <= sm->sm_start + sm->sm_size (0x64060634802000 <= 0x230000000000), file: /freenas-9.10-releng/_BE/os/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c, line 119
cpuid = 0
KBD: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe201cde9560
kdb_backtrace () at kbd_backtrace+0x39/frame 0xfffffe201cde9610
vpanic() at vpanic+ox126/frame 0xfffffe201cde9650
panic() at panic+0x43/frame 0xfffffe201cde96b0
assfail3() at assfail3+0x43/frame 0xfffffe201cde96d0
space_map_load() at space_map_load+0x372/frame 0xfffffe201cde9750
metaslab_load() at metaslab_load+0x2e/frame 0xfffffe201cde9770
metaslab_alloc() at metaslab_alloc+0x857/frame 0xfffffe201cde98c0
zio_dva_allocate() at zio_dva_allocate+0x85/frame 0xfffffe201cde9990
zio_execute() at zio_execute+0x111/frame 0xfffffe201cde99f0
taskqueue_run_locked() at taskqueue_run_locked+0xe5/frame 0xfffffe201cde9a40
taskqueue_thread_loop() at taskqueue_thread_loop+0xa8/frame 0xfffffe201cde9a70
fork_exit() at fork_exit+0x9a/frame 0xfffffe201cde9ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe201cde9ab0
--- trap 0, rip = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 101016 ]
Stopped at kdb_enter+0x3e: movq $0, kdb_why
db>