TrueNAS Scale Pool Status : Finished

arkaxe

Cadet
Joined
Apr 25, 2021
Messages
4
Hello, this morning I looked at my NAS and saw one drive "FAULTED", so I decided to back up before something happened. Now I regret it so much...
[screenshot of the pool status]


Is it completely dead?

Sorry, it's my first time with TrueNAS. I'm a noob.
 

arkaxe

Cadet
Joined
Apr 25, 2021
Messages
4
I forgot to give all the info:
CPU: Ryzen 5 1400
RAM: 16GB (not ECC)
Motherboard: GIGABYTE B450 AORUS M AM4 AMD B450 SATA 6Gb/s Micro ATX
Drives:
Pool 1: 4x 3TB SAS RAIDZ1
Pool 2: 2x 1TB SSD mirror
Pool 3: (nothing important there) 1x 12TB

 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
What is the output of "zpool status"? (In CODE tags please)

One odd / good thing about ZFS is that you can have faults on every single disk, and no spare sectors left on any of those disks, and still not lose data. As long as there is no more than 1 bad block per RAID-Z1 stripe, you may not have lost anything. Even when a bad RAID-Z1 stripe does occur, if it contained metadata, that metadata is duplicated by default.

I once lost a block on my single-disk media pool, but I could not figure out which file was affected. Later I realized it must have occurred in metadata, which automatically has a duplicate copy, even on single-disk pools. (Note: my media pool has multiple backups; only the on-line copy is a single disk.)
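
If you want to check something similar on your own pool, the commands look roughly like this sketch (the pool name "tank" is just a placeholder):

Code:
# List any files affected by permanent errors; if nothing is listed,
# the damage was likely confined to metadata, which ZFS keeps extra
# copies of by default.
zpool status -v tank

# Show the metadata duplication setting; the default is "all".
zfs get redundant_metadata tank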
 

arkaxe

Cadet
Joined
Apr 25, 2021
Messages
4
I just restarted, and no drive shows "FAULTED" anymore, pretty weird though.

Code:
truenas# zpool status
  pool: Arkaxe
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: resilvered 3.25M in 00:00:17 with 0 errors on Sun Apr 25 17:24:37 2021
config:

        NAME                                      STATE     READ WRITE CKSUM
        Arkaxe                                    DEGRADED     0     0     0
          raidz1-0                                DEGRADED     0     0     0
            0f058f0f-439c-4d1f-9ad1-34347707bc3f  FAULTED      3     9     8  too many errors
            b4cfcd41-e1bc-4cb2-94ac-78696640c1f8  DEGRADED     0     0     0  too many errors
            b0727707-e3ef-477e-a985-f7506aafac6c  ONLINE       0     0     1
            dd91d675-9be2-43dc-a552-fc8f87389ec1  DEGRADED     0     0     0  too many errors

errors: No known data errors
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
@arkaxe If you want your topic to be seen, maybe post in the "SCALE" section and not in the "legacy" (FreeNAS 11.3 and earlier) section? ;-)
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Good news, this means you don't have any data loss:

errors: No known data errors

The tricky thing with multiple disks having problems in a RAID-Z1 is that you have to install the replacement disk without removing an existing disk, then use a "replace in place". I don't know what the GUI calls it or how to do it from the GUI, but ZFS will let you replace one disk with another while both are still in the server. Once the replacement is done, the "bad" disk is removed from the pool; THEN you can physically pull it. Repeat until you have a healthy pool.
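
From the shell, a replace in place would look roughly like the sketch below; the faulted disk's GUID is taken from your zpool status output above, and the new device name (/dev/sde) is only an example for whatever the replacement disk shows up as.

Code:
# With the replacement disk installed alongside the faulted one, start
# the replace in place; the old disk stays in the pool until the
# resilver completes, then ZFS detaches it automatically.
zpool replace Arkaxe 0f058f0f-439c-4d1f-9ad1-34347707bc3f /dev/sde

# Watch the resilver progress; only pull the old disk once it is done.
zpool status Arkaxe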

You will also have to figure out WHY you have so many bad disks. The power supply is a possible cause. Or possibly a lack of cooling on the hard drives, so they overheated. There have also been some bad batches of hard drives in the past.
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
The tricky thing with multiple disks having problems in a RAID-Z1 is that you have to install the replacement disk without removing an existing disk, then use a "replace in place". I don't know what the GUI calls it or how to do it from the GUI, but ZFS will let you replace one disk with another while both are still in the server. Once the replacement is done, the "bad" disk is removed from the pool; THEN you can physically pull it. Repeat until you have a healthy pool.
Are you sure that is allowed for RAID-Z1? :O Never knew!

You will also have to figure out WHY you have so many bad disks. The power supply is a possible cause. Or possibly a lack of cooling on the hard drives, so they overheated. There have also been some bad batches of hard drives in the past.
Yeah, that's likely...
Though those disk names also raise some red flags for me. Is OP sure there is no abstraction layer (hardware RAID or otherwise) going on here?
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Are you sure that is allowed for RAID-Z1? :O Never knew!
...
Yes, it's one of the nice features of ZFS. You can even replace a disk in a striped pool (meaning no redundancy). That said, if the failing disk is in really bad shape, the replace in place can be a lot slower than rebuilding from redundancy. However, with a RAID-Z1 pool where multiple disks are reporting issues, it's more or less your only way to restore the pool.
 

marshalleq

Explorer
Joined
Mar 12, 2016
Messages
88
With multiple errors like that across multiple disks, I'd suspect a cabling problem or some other chipset-related issue, unless you know some event happened or the disks are rather old. It could be a failed batch of new disks too, of course, but history tells me that if you cabled this up yourself, swapping out the cables (if you can) is the most likely fix. The thing about ZFS is that it tells you about errors; with whatever you had before, you may never have known....
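
To help tell failing disks from a bad cable or controller, it's worth looking at the SMART data for each drive, something along these lines (the device name is just an example, and SAS drives report a slightly different set of fields than SATA ones):

Code:
# Overall health, temperature and error counters for one drive.
smartctl -a /dev/sda

# On SATA drives, a growing UDMA_CRC_Error_Count with otherwise clean
# attributes usually points at cabling rather than the disk itself.
smartctl -A /dev/sda | grep -i crc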
 