I have an NFS server that ran well for the past year. A couple of days ago the server crashed and got stuck in a reboot cycle: every time the pool was imported, the kernel panicked and the machine rebooted.
Here are the details of my installation:
Server Details
Server : Supermicro SuperChassis 826BA-R1K28WB 2U Server W/ X9DRW-3F
Memory : 64GB
Storage Controller : ADAPTEC 71605
Data Drives: 12 X 12TB 3.5 7.2K 6Gbps SAS
OS Drives: 2 X 256GB SSD
2 x RAID6 volumes, 6 drives each. The RAID controller shows all drives as Optimal, and a verification of both volumes completed without errors.
TrueNAS:
Version : 12.0-U1
Pools : Two pools (Vol-01 and Vol-02), each built on a single 43TB RAID volume.
I had to re-install TrueNAS (12.0-U3), and the system came up with neither pool imported. In the UI, under Storage -> Pools, I did Add Pool and selected "Import Existing Pool"; the system lists both Vol-01 and Vol-02. Vol-02 imports fine, with no issues and no data corruption.
When I select Vol-01 and perform the import, the kernel panics again, which means I have to redo the whole process from the start.
I then tried the following:
Boot, drop to the boot loader prompt, and set the following flags:
set zfs:zfs_debug=1
set zfs:zfs_recover=1
set aok=1
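Since TrueNAS 12 is FreeBSD-based, I suspect the FreeBSD tunable names are actually vfs.zfs.* rather than the illumos-style zfs:* above; to make them survive the reboot loop they could go into /boot/loader.conf. The names below are my assumption of the FreeBSD equivalents, not something I have confirmed on 12.0-U3 (and I am not aware of a FreeBSD equivalent of aok=1):

```
# /boot/loader.conf -- persistent loader tunables (FreeBSD / TrueNAS CORE)
# Assumed FreeBSD equivalents of the illumos-style zfs:* settings;
# names unverified on 12.0-U3.
vfs.zfs.debug=1
vfs.zfs.recover=1
```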
Boot into multi-user mode (not single-user), import Vol-02, and run the following command against the still-exported Vol-01 (-e reads the exported pool, -b/-c traverse blocks and verify checksums, -L skips leak detection):
zdb -e -bcsvL Vol-01
Here is the output:
root@truenas[~]# zdb -e -bcsvL Vol-01
Traversing all blocks to verify checksums ...
385G completed ( 23MB/s) estimated time remaining: 442642hr 38min 50sec
bp count: 3790779
ganged count: 0
bp logical: 494650703872 avg: 130487
bp physical: 412161922048 avg: 108727 compression: 1.20
bp allocated: 412962242560 avg: 108938 compression: 1.20
bp deduped: 0 ref>1: 0 deduplication: 1.00
Normal class: 412947922944 used: 0.86%
additional, non-pointer bps of type 0: 6909
Dittoed blocks on same vdev: 27987
Blocks LSIZE PSIZE ASIZE avg comp %Total Type
- - - - - - - unallocated
2 32K 8K 24K 12K 4.00 0.00 object directory
45 54K 26K 540K 12K 2.08 0.00 object array
1 16K 4K 12K 12K 4.00 0.00 packed nvlist
- - - - - - - packed nvlist size
- - - - - - - bpobj
- - - - - - - bpobj header
- - - - - - - SPA space map header
7.42K 853M 277M 830M 112K 3.08 0.21 SPA space map
1 36K 36K 36K 36K 1.00 0.00 ZIL intent log
1.74K 33.5M 7.07M 20.3M 11.6K 4.73 0.01 DMU dnode
10 40K 40K 84K 8.40K 1.00 0.00 DMU objset
- - - - - - - DSL directory
12 6K 512 12K 1K 12.00 0.00 DSL directory child map
- - - - - - - DSL dataset snap map
21 290K 72K 216K 10.3K 4.02 0.00 DSL props
- - - - - - - DSL dataset
- - - - - - - ZFS znode
- - - - - - - ZFS V0 ACL
3.61M 460G 384G 384G 106K 1.20 99.78 ZFS plain file
272 2.60M 544K 1.45M 5.47K 4.89 0.00 ZFS directory
9 9K 9K 72K 8K 1.00 0.00 ZFS master node
- - - - - - - ZFS delete queue
- - - - - - - zvol object
- - - - - - - zvol prop
- - - - - - - other uint8[]
- - - - - - - other uint64[]
- - - - - - - other ZAP
- - - - - - - persistent error log
1 128K 8K 24K 24K 16.00 0.00 SPA history
- - - - - - - SPA history offsets
- - - - - - - Pool properties
- - - - - - - DSL permissions
- - - - - - - ZFS ACL
- - - - - - - ZFS SYSACL
- - - - - - - FUID table
- - - - - - - FUID table size
1 1K 1K 12K 12K 1.00 0.00 DSL dataset next clones
- - - - - - - scan work queue
- - - - - - - ZFS user/group/project used
- - - - - - - ZFS user/group/project quota
- - - - - - - snapshot refcount tags
- - - - - - - DDT ZAP algorithm
- - - - - - - DDT statistics
- - - - - - - System attributes
- - - - - - - SA master node
9 13.5K 13.5K 72K 8K 1.00 0.00 SA attr registration
18 288K 72K 144K 8K 4.00 0.00 SA attr layouts
- - - - - - - scan translations
- - - - - - - deduplicated block
- - - - - - - DSL deadlist map
- - - - - - - DSL deadlist map hdr
1 1K 1K 12K 12K 1.00 0.00 DSL dir clones
- - - - - - - bpobj subobj
- - - - - - - deferred free
- - - - - - - dedup ditto
36 475K 168K 576K 16K 2.84 0.00 other
3.62M 461G 384G 385G 106K 1.20 100.00 Total
Block Size Histogram
block psize lsize asize
size Count Size Cum. Count Size Cum. Count Size Cum.
512: 313 156K 156K 313 156K 156K 0 0 0
1K: 105 114K 270K 105 114K 270K 0 0 0
2K: 42 134K 405K 42 134K 405K 0 0 0
4K: 22.8K 91.3M 91.7M 73 330K 735K 17.2K 69.0M 69.0M
8K: 39.8K 415M 507M 24 204K 939K 29.6K 294M 363M
16K: 99.8K 2.04G 2.53G 2.70K 43.2M 44.1M 114K 2.38G 2.73G
32K: 248K 11.4G 13.9G 17.9K 573M 617M 249K 11.4G 14.1G
64K: 1.23M 117G 131G 30 3.48M 621M 1.23M 117G 131G
128K: 1.98M 253G 384G 3.59M 459G 460G 1.98M 253G 384G
256K: 0 0 384G 0 0 460G 1.33K 386M 385G
512K: 0 0 384G 0 0 460G 0 0 385G
1M: 0 0 384G 0 0 460G 0 0 385G
2M: 0 0 384G 0 0 460G 0 0 385G
4M: 0 0 384G 0 0 460G 0 0 385G
8M: 0 0 384G 0 0 460G 0 0 385G
16M: 0 0 384G 0 0 460G 0 0 385G
capacity operations bandwidth ---- errors ----
description used avail read write read write read write cksum
Vol-01 385G 43.2T 441 0 46.4M 0 0 0 0
/dev/gptid/19708e9b-6e31-11eb-9230-0cc47a17595c 385G 43.2T 441 0 46.4M 0 0 0 0
I then tried to run a scrub:
# zpool scrub Vol-01
but it reports that there is no such pool, presumably because Vol-01 is not imported. And when I try to import the pool, the system panics again.
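One thing I am considering, based on general OpenZFS recovery advice rather than anything I have verified on this system, is a read-only import so nothing is written to or replayed on the pool during import:

```
# Attempt a read-only import under an alternate root so the pool stays
# untouched; -f forces import of a pool that was not cleanly exported.
# Flags per zpool(8); whether this avoids the panic here is untested.
zpool import -o readonly=on -f -R /mnt Vol-01
```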
Any help would be welcome, as we have data on this pool that needs to be recovered.