Hey everyone, it's been a while since I've needed any assistance; my FreeNAS build has been humming along problem-free for quite a while. Recently I purchased two New* (I'll get to that in a minute) HPE/HGST HUH728060ALE604 drives from a seller on Newegg. These drives were "New" with 2017 manufacture dates, but factory sealed and with zero hours on the clock. I ran some burn-in tests and a full SMART test suite. One drive has a couple of errors logged but nothing that's alarming; neither has any reallocated sectors, and both have otherwise clean SMART reports. The issue I'm having is that randomly, usually at night when the scrub runs, one drive (the same drive each time) drops from the array and then gets re-added within 5-10 minutes. Errors are logged in FreeNAS but clear on their own. The only reason I even know about it is my email alerts for SMART status. Here is the content of the latest email. I can't really make heads or tails of it, but maybe you guys can:
Code:
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <MB6000GEQUT HPG7> s/n 2RG9YSHX detached
GEOM_MIRROR: Device swap1: provider ada3p1 disconnected.
(ada3:ahcich3:0:0:0): Periph destroyed
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <MB6000GEQUT HPG7> ACS-2 ATA SATA 3.x device
ada3: Serial Number 2RG9YSHX
ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 5723166MB (11721045168 512 byte sectors)
GEOM_ELI: Device mirror/swap1.eli destroyed.
GEOM_MIRROR: Device swap1: provider destroyed.
GEOM_MIRROR: Device swap1 destroyed.
GEOM_MIRROR: Device mirror/swap1 launched (2/2).
GEOM_ELI: Device mirror/swap1.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI: Crypto: hardware
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <MB6000GEQUT HPG7> s/n 2RG9YSHX detached
GEOM_MIRROR: Device swap1: provider ada3p1 disconnected.
(ada3:ahcich3:0:0:0): Periph destroyed
GEOM_ELI: Device mirror/swap1.eli destroyed.
GEOM_MIRROR: Device swap1: provider destroyed.
GEOM_MIRROR: Device swap1 destroyed.
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <MB6000GEQUT HPG7> ACS-2 ATA SATA 3.x device
ada3: Serial Number 2RG9YSHX
ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 5723166MB (11721045168 512 byte sectors)
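In case it helps anyone dig through a longer log, here is a rough Python sketch for counting how often each device shows up as "detached". It assumes the kernel messages are available as plain text with one message per line, in the same shape as the excerpt above; the function name `count_detaches` and the `sample` text are made up for illustration and are not part of FreeNAS.

```python
import re
from collections import Counter

# Matches kernel lines shaped like the ones in the alert email, e.g.
#   ada3: <MB6000GEQUT HPG7> s/n 2RG9YSHX detached
# (assumption: other FreeBSD versions may word this message differently)
DETACH_RE = re.compile(
    r"^(?P<dev>[a-z]+\d+): <[^>]*> s/n (?P<serial>\S+) detached$"
)


def count_detaches(log_text):
    """Return a Counter mapping (device, serial) to the number of detach lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DETACH_RE.match(line.strip())
        if match:
            counts[(match.group("dev"), match.group("serial"))] += 1
    return counts


# Hypothetical sample, trimmed from the log excerpt in this post.
sample = """\
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <MB6000GEQUT HPG7> s/n 2RG9YSHX detached
GEOM_MIRROR: Device swap1: provider ada3p1 disconnected.
(ada3:ahcich3:0:0:0): Periph destroyed
ada3: <MB6000GEQUT HPG7> ACS-2 ATA SATA 3.x device
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <MB6000GEQUT HPG7> s/n 2RG9YSHX detached
"""

if __name__ == "__main__":
    for (dev, serial), n in count_detaches(sample).items():
        print(f"{dev} (s/n {serial}): {n} detach event(s)")
```

Running it over a saved copy of the full message log would show whether it really is only the one drive dropping, and how often.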
And here is the SMARTCTL output:
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0027   133   100   054    Pre-fail  Always       -       107
  3 Spin_Up_Time            0x0023   253   100   024    Pre-fail  Always       -       48 (Average 48)
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002f   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   128   100   020    Pre-fail  Offline      -       18
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       910
 10 Spin_Retry_Count        0x0033   100   100   060    Pre-fail  Always       -       0
 22 Unknown_Attribute       0x0023   100   100   025    Pre-fail  Always       -       100
180 Unknown_HDD_Attribute   0x002b   100   100   098    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   181   176   000    Old_age   Always       -       33 (Min/Max 21/35)
196 Reallocated_Event_Count 0x0033   100   100   000    Pre-fail  Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error        00%              776  -
# 2  Short offline       Completed without error        00%              608  -
# 3  Short offline       Completed without error        00%              440  -
# 4  Extended offline    Completed without error        00%              396  -
# 5  Short offline       Completed without error        00%              273  -
# 6  Extended offline    Completed without error        00%               48  -
# 7  Short offline       Completed without error        00%                0  -
As best I know, these drives are NOT SMR (I did extensive research before purchasing them). Is it a bad drive? It seems fine, with no issues aside from this odd behavior. These drives replaced older 2TB drives, using the same SATA ports and the same cables. Could it be something as stupid as a bad cable? Does anything worrisome jump out at anyone here?
Server specs:
P8Z77-V LK
i5-3450
16GB RAM
6x SATA drives
750W Corsair PSU
8GB Cruzer Fit boot drive
This setup has worked flawlessly for about 5 years, so while it's old, I don't think it's to blame here. I welcome everyone's thoughts on this. Cheers!