Hi,
The system I just built is basically:
Mobo: AX4-SPE-N AOpen
RAM: 2 GB
HDDs:
Bank #1: 4x WD EARS
Bank #2: 4 assorted (2x 1TB, 1x 2TB Seagate, 1x 2TB WD EARS)
2x PCI to SATA controller: Sil 3114 based.
I created two raidz1 as follows:
tank1: using Bank #1
tank2: using Bank #2
What I am experiencing is low-level corruption on scrubs, but not during normal FreeNAS operation. That is, during use I don't see a single ZFS read, write or cksum error.
When I scrub either tank, I get anywhere between 5 and 25 cksum errors on *all* HDDs, plus the odd unrecoverable error (i.e. the odd file that has to be restored from backup).
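For anyone who wants to compare the per-disk counts from one scrub to the next, here is a rough sketch of how the CKSUM column could be tallied from saved `zpool status` output. It assumes the usual NAME / STATE / READ / WRITE / CKSUM table layout; the function name `cksum_counts` is just something made up for illustration, not part of any ZFS tool.

```python
def cksum_counts(status_text):
    """Return {device_name: cksum_count} parsed from `zpool status` text.

    Assumes the standard device table headed by
    NAME STATE READ WRITE CKSUM, ended by a blank line.
    """
    counts = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if not in_table:
            # The device table begins right after this header row.
            if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
                in_table = True
            continue
        if not fields or fields[0] == "errors:":
            # A blank line or the "errors:" line closes the table.
            in_table = False
            continue
        # Rows are: name, state, read, write, cksum (plus optional notes).
        # Rows whose CKSUM field is not a plain integer are skipped.
        if len(fields) >= 5 and fields[4].isdigit():
            counts[fields[0]] = int(fields[4])
    return counts
```

Saving the output of `zpool status tank1` to a file after each scrub and feeding the text to this function would give one dictionary per scrub, which makes it easier to see whether the same disks keep accumulating errors. The pool and raidz1 rows are included in the result along with the individual disks.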
I tried the following troubleshooting steps:
1 - Checked for thermal issues: the CPU max temp under heavy load is 47 Celsius, the hottest HDD under heavy load is at 37 Celsius
2 - Disabled anything that is not in use in the BIOS
3 - Ran a RAM memtest (passed OK without a single error)
4 - Re-seated the SATA controllers
No difference - same error levels.
So then I replaced one SATA controller (tank2) with a Promise S300 TX4. I ran a scrub (which found and corrected some issues) and then ran a scrub a second time. It found more cksum errors on all HDDs!!!
I did the same with the remaining Sil 3114 controller (tank1): ran a scrub, fixed the errors, ran a scrub a second time. It found more cksum errors on all HDDs.
This indicates that the issue is not the controller.
So now I have replaced the remaining Sil 3114 with another Promise S300 TX4 and scrubbed. Same issues.
This indicates that the issue is not about mismatched controllers.
This would indicate that the errors are:
a - not I/O related
b - created on the fly, but by what??? I haven't the faintest idea. It can't be the HDDs, since they are half new / half old and of assorted sizes and brands.
However, now not only do I have cksum errors, but on shutdown (after 30 min - I have spin-down set to 30 min) I also get the following error message (the HDDs named vary from shutdown to shutdown):
(ada4:ata6:0:0:0) Spin-down disk failed.
If I shut down quite soon after turning on (presumably while all the HDDs are still spinning) I don't get the error.
WTF!
I mean, if I use the cheapo controller I get *FEWER* problems than if I use the expensive one? The one that is widely recommended????
Also, does anybody have any ideas about what else I can test? I have just run out....
Any help would be appreciated.