Hello guys,
I set up this FreeNAS system that I have been using to replicate another system.
Hardware: a Dell PowerEdge T420
8 disks of 4 TB each, attached to a SATA controller.
I created a Z2 pool using all disks.
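To put numbers on that layout, here is a rough sketch of the capacity arithmetic in Python. It assumes a single RAIDZ2 vdev where two disks' worth of space goes to parity, and it ignores metadata, padding and the TB vs TiB difference, so the real figures reported by the system will be lower. The function name raidz_usable_tb is just something made up for this illustration.

```python
def raidz_usable_tb(disks: int, disk_tb: float, parity: int) -> float:
    """Approximate usable capacity of one RAIDZ vdev (assumption: no overhead)."""
    if disks <= parity:
        raise ValueError("need more disks than parity devices")
    # Only the non-parity disks contribute to usable space.
    return (disks - parity) * disk_tb


# 8 disks of 4 TB each in RAIDZ2 (parity = 2), as described above.
usable = raidz_usable_tb(disks=8, disk_tb=4.0, parity=2)
warn_at = 0.8 * usable  # the 80% usage mark

print(f"usable ~ {usable:.1f} TB, 80% mark ~ {warn_at:.1f} TB")
```

Under those assumptions the pool has roughly 24 TB usable, and the 80% mark sits at roughly 19.2 TB.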
Two things: one of the disks had had SMART errors for a while.
I only noticed this recently, because email monitoring was not set up.
The other: I noticed the other day that there was a message that more than 80% of the storage was used.
80% seems low to me, but I removed some files. Still, usage stayed at around 80%. I wanted to correct this yesterday, but:
At some point (I think yesterday) replication stopped. The system was very slow. I rebooted, but afterwards could not log into this system. I went to the location, and had to reboot the system "the brute force way": power off.
Afterwards, messages like this appeared:
Shortening read at 656867807 from 16 to 10
gptzfsboot: error 16 lba 49
And after a lot of messages:
BIOS drive M: is disk10
read 350 from 34 to 0x59680, error: 0x10
(a few like these)
panic: free: guard1 fail @ 0x1 from unknown:0
The system would not boot with the "guilty" drive attached.
So I removed that disk, and after that the system booted.
It showed this error: pool data0 status Unknown.
I tried to import it:
zpool import -fF data0
cannot import 'data0': no such pool or dataset
Destroy and re-create the pool from a backup source.
I thought that with RAIDZ2, TWO disks could fail and data would still be recoverable....
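My understanding, written out as a tiny Python sketch: a RAIDZ vdev with N parity devices is expected to stay readable as long as no more than N of its disks are lost. This only models whole-disk failures in a single vdev. It says nothing about other causes such as damaged pool metadata or an unclean power-off, and the function name is hypothetical.

```python
def raidz_survives(parity: int, failed_disks: int) -> bool:
    """True if a single RAIDZ vdev should still be readable (whole-disk failures only)."""
    if parity < 1 or failed_disks < 0:
        raise ValueError("parity must be >= 1 and failed_disks >= 0")
    return failed_disks <= parity


# RAIDZ2 has parity = 2; in my case one disk was removed.
print(raidz_survives(parity=2, failed_disks=1))  # one disk missing
print(raidz_survives(parity=2, failed_disks=3))  # more failures than parity
```

By that reasoning, one missing disk out of eight should leave the pool degraded but importable, which is why the import error surprises me.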
No other disks have SMART errors.
One thing I noticed: the device name of the disk I removed (/dev/da2) is present again, and new disks that I attach get higher numbers.
Is that important? Will FreeBSD re-enumerate disks at boot and still know how to reassemble the Z2 pool?
Losing this data is not serious, but I want to find out what went wrong.
Yes, I know I should have replaced the drive immediately.
Yes, running a pool at more than 80% used is not best practice.
BUT... why would this result in total data loss?