I have a strange issue with a single-disk volume: it is showing as degraded, but the SMART stats aren't reporting any errors.
My system consists of:
- Lian Li PC-Q26
- Seasonic SS-660XP2
- SUPERMICRO MBD-X10SDV-TLN4F-O
- SAMSUNG 32GB 288-Pin DDR4 SDRAM Registered DDR4 2133 (M393A4K40BB0-CPB0) x2
- LSI LSI00301 (9207-8i) PCI-Express 3.0 x8 Low Profile SATA / SAS Host Controller Card
- WD40EFRX (4 disks in RAIDz1 - imported from my old N36L NAS4Free box that has been retired - 2 attached to LSI HBA and 2 attached to Intel SATA)
- ST8000AS0002 (6 disks in a RAIDz2 - attached to LSI HBA)
- SanDisk Ultra Fit USB (single drive FreeNAS Boot)
- WD7500BPVT (1 disk - temporary storage, VBox jail, Plex jail, nothing I can't recreate - the disk with the issue - attached to Intel SATA)
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   183   151   021    Pre-fail  Always       -       1816
  4 Start_Stop_Count        0x0032   091   091   000    Old_age   Always       -       9388
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4060
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1461
191 G-Sense_Error_Rate      0x0032   001   001   000    Old_age   Always       -       261377
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       50
193 Load_Cycle_Count        0x0032   138   138   000    Old_age   Always       -       186554
194 Temperature_Celsius     0x0022   113   103   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      4057         -
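For anyone who wants to compare these counters before and after a cable swap, here is a minimal Python sketch that pulls the raw values out of an attribute table like the one above. The function names and the choice of watched IDs (5, 197, 198, 199) are my own assumptions for illustration, and the parser only expects the ten-column row layout shown in this post.

```python
import re

# Attribute IDs picked for illustration (an assumption, not a smartctl default):
# reallocated, pending and offline-uncorrectable sectors, plus interface CRC errors.
WATCHED_IDS = {5, 197, 198, 199}

# One attribute row: ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED,
# WHEN_FAILED, RAW_VALUE (only the leading integer of RAW_VALUE is captured).
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}"
    r"\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def parse_smart_attributes(text):
    """Return {attribute_id: (name, raw_value)} for every attribute row found."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            attrs[int(match.group(1))] = (match.group(2), int(match.group(3)))
    return attrs


def nonzero_watched(attrs, watched=WATCHED_IDS):
    """Return the watched attributes whose raw value is not zero."""
    return {i: attrs[i] for i in sorted(watched) if i in attrs and attrs[i][1] != 0}


# A few rows copied from the output above.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
193 Load_Cycle_Count        0x0032   138   138   000    Old_age   Always       -       186554
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""

attrs = parse_smart_attributes(sample)
print(nonzero_watched(attrs))
```

With the rows from this drive, none of the watched counters is nonzero, which matches what I described: the disk itself reports nothing wrong.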
The latest results from a scrub on the volume are:
Code:
[root@freenas] ~# zpool status -v vm
  pool: vm
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 1h29m with 9 errors on Sun Jan 29 14:31:33 2017
config:

        NAME                                          STATE     READ WRITE CKSUM
        vm                                            DEGRADED     0     0 7.17K
          gptid/b058e8a8-86fc-11e6-893a-0cc47aca83c4  DEGRADED     0     0 14.3K  too many errors

errors: Permanent errors have been detected in the following files:

        /var/db/system/rrd-7f4d67ae16c94917b949456bb9f364ad/localhost/df-mnt-vm/df_complex-used.rrd
        /mnt/vm/jails/VirtualMachines/usr/home/vbox/VirtualBox VMs/SQL Server/SQL Server.vdi
        /mnt/vm/jails/VirtualMachines/usr/home/vbox/VirtualBox VMs/Windows 10 Evaluation/Windows 10 Evaluation.vdi
        /mnt/vm/jails/VirtualMachines/usr/home/vbox/VirtualBox VMs/GitLab/GitLab.vdi
        /mnt/vm/jails/VirtualMachines/usr/home/vbox/VirtualBox VMs/Visual Studio 2017 Test/Visual Studio 2017 Test.vdi
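The pattern in that output is that READ and WRITE are zero while CKSUM is not. As a rough sketch, and assuming the `NAME STATE READ WRITE CKSUM` table layout shown above, the Python below splits the config table into per-device counters and lists the entries with checksum errors only. The helper names are hypothetical, and the counters are kept as strings exactly as `zpool status` prints them (for example `14.3K`) rather than converted to numbers.

```python
def parse_error_counters(status_text):
    """Return one dict per row of the config table in `zpool status` output."""
    rows = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if not in_table:
            # The table starts right after its header row.
            in_table = fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]
            continue
        if not fields:
            break  # a blank line ends the config table
        if len(fields) >= 5:
            name, state, read, write, cksum = fields[:5]
            rows.append({"name": name, "state": state,
                         "read": read, "write": write, "cksum": cksum})
    return rows


def checksum_only(rows):
    """Names of entries with checksum errors but no read or write errors."""
    return [r["name"] for r in rows
            if r["read"] == "0" and r["write"] == "0" and r["cksum"] != "0"]


# The config section copied from the output above.
sample_status = """\
config:

        NAME                                          STATE     READ WRITE CKSUM
        vm                                            DEGRADED     0     0 7.17K
          gptid/b058e8a8-86fc-11e6-893a-0cc47aca83c4  DEGRADED     0     0 14.3K  too many errors

errors: Permanent errors have been detected in the following files:
"""

rows = parse_error_counters(sample_status)
print(checksum_only(rows))
```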
I want to stress that I don't care about the data on this drive, as I can easily re-create it. I know a single-disk volume is bad for this exact reason. I didn't have the money for an extra two decent SSDs to use as my VM store at the time I built this machine, so I used a spare laptop disk that hadn't really been used much, as it was swapped out for an SSD years ago. I have offline backups of my other two volumes.
What I am trying to figure out is what is causing the corruption. When I have had scrub errors in the past, I also had SMART stats showing reallocated and pending sectors, but in this case, from what I can tell, the SMART stats are fine and the drive passed the long test. My memory is ECC and the BIOS is not reporting any errors. Could it be the Intel SATA controller? Two of my 4TB drives are also attached to it, though, and they aren't showing any errors. I'm now suspecting the SATA cable; is there any way to prove or disprove this other than changing the cable and assuming it is fixed until the issue comes back?
My case has room for ten 3.5" drives and a single 2.5" drive. I'm almost in a position where I can replace this drive, and I am thinking of getting a 500GB 850 EVO SATA and a 500GB 850 EVO M.2, as my board does have an M.2 slot. Would there be any issues in mirroring a SATA drive and an M.2 drive, considering they are of the same generation?
Any ideas would be appreciated.