Unsure if disk is failing (local daily security run)

Status
Not open for further replies.

runnin17

Dabbler
Joined
May 24, 2014
Messages
13
I got this email this morning. I'm running a raidz3 array of seven 3TB drives on an Areca 1882 card (I know about the Areca issues; I just don't have the money to change out cards right now).

Code:
freenas.local kernel log messages:
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 c4 18 00 00 08 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 9e d0 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 9f 50 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 b1 d8 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 b2 58 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 b3 58 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 ba 98 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 bb 98 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 bb 18 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 c3 d8 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 c4 58 00 00 48 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error
> (da0:arcmsr0:0:1:0): READ(10). CDB: 28 00 b7 59 c4 a0 00 00 80 00
> (da0:arcmsr0:0:1:0): CAM status: SCSI Status Error
> (da0:arcmsr0:0:1:0): SCSI status: Check Condition
> (da0:arcmsr0:0:1:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:arcmsr0:0:1:0): Error 5, Unretryable error

-- End of security output --
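
For anyone reading the log: each READ(10) CDB carries the starting block address and the transfer length, so the failing reads can be located on the disk. The sketch below decodes three of the CDBs from the log (the first entry, plus the lowest and highest addresses). It assumes the standard READ(10) layout and 512-byte logical blocks; the helper name `decode_read10` is made up for illustration.

```python
# Sketch: decode READ(10) CDBs copied from the kernel log above.
# Assumed layout: byte 0 = opcode (0x28), bytes 2-5 = LBA (big-endian),
# bytes 7-8 = transfer length in blocks (big-endian).
BLOCK_SIZE = 512  # assumption: 512-byte logical blocks


def decode_read10(cdb_hex):
    """Return (lba, block_count) for a READ(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("expected a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return lba, blocks


FAILED_CDBS = [
    "28 00 b7 59 c4 18 00 00 08 00",  # first entry in the log
    "28 00 b7 59 9e d0 00 00 80 00",  # lowest LBA in the log
    "28 00 b7 59 c4 a0 00 00 80 00",  # highest LBA in the log
]

reads = [decode_read10(c) for c in FAILED_CDBS]
first = min(lba for lba, _ in reads)
last = max(lba + n for lba, n in reads)
span_bytes = (last - first) * BLOCK_SIZE

for lba, n in reads:
    print(f"LBA {lba}: {n} blocks, byte offset {lba * BLOCK_SIZE}")
print(f"span covered by these reads: {span_bytes} bytes")
```

All of the failed reads in the log fall inside that span, which is only a few megabytes wide. That is consistent with one localized bad region on da0, though it says nothing about the rest of the disk.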


So then I started to read a bit and ran zpool status, along with smartctl -a on all the drives. This is what I got. Sadly, the controller is an Areca 1882, so I have a feeling the SMART data is not all that useful. I am in the process of doing my weekly backup of the data. I guess I need to think about replacing da0.

zpool status and SMART data are below.
 

runnin17

Dabbler
Joined
May 24, 2014
Messages
13
ZPOOL STATUS

Code:
  pool: wdredz3
 state: ONLINE
  scan: scrub repaired 1000K in 8h11m with 0 errors on Sun Aug 17 08:11:19 2014
config:
                                                                                                                                   
        NAME                                            STATE     READ WRITE CKSUM                                                 
        wdredz3                                         ONLINE       0     0     0                                                 
          raidz3-0                                      ONLINE       0     0     0                                                 
            gptid/384b7f43-09e7-11e4-9d88-b8975a0084f0  ONLINE       0     0     0                                                 
            gptid/38f0cbe2-09e7-11e4-9d88-b8975a0084f0  ONLINE       0     0     0                                                 
            gptid/3926f110-09e7-11e4-9d88-b8975a0084f0  ONLINE       0     0     0                                                 
            gptid/39645b06-09e7-11e4-9d88-b8975a0084f0  ONLINE       0     0     0                                                 
            gptid/39980ecd-09e7-11e4-9d88-b8975a0084f0  ONLINE       0     0     0                                                 
            gptid/39ccb45f-09e7-11e4-9d88-b8975a0084f0  ONLINE       0     0     0                                                 
            gptid/3a0263f4-09e7-11e4-9d88-b8975a0084f0  ONLINE       0     0     0                                                 
                                                                                                                                   
errors: No known data errors          
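
One thing worth noticing in that output: every READ/WRITE/CKSUM counter is 0, yet the scan line says the scrub repaired 1000K. A rough sketch of pulling both facts out of saved `zpool status` text follows. The function names are made up for illustration, and the parser assumes plain integer counters (large counts that zpool prints with a suffix such as `1.2K` would be skipped).

```python
import re

# Abbreviated copy of the zpool status output posted above.
STATUS = """\
  pool: wdredz3
 state: ONLINE
  scan: scrub repaired 1000K in 8h11m with 0 errors on Sun Aug 17 08:11:19 2014
config:

        NAME                                            STATE     READ WRITE CKSUM
        wdredz3                                         ONLINE       0     0     0
          raidz3-0                                      ONLINE       0     0     0
            gptid/384b7f43-09e7-11e4-9d88-b8975a0084f0  ONLINE       0     0     0
            gptid/38f0cbe2-09e7-11e4-9d88-b8975a0084f0  ONLINE       0     0     0

errors: No known data errors
"""


def scrub_repaired(status_text):
    """Return the amount the last scrub repaired (e.g. '1000K'), or None."""
    match = re.search(r"scrub repaired (\S+) in", status_text)
    return match.group(1) if match else None


def devices_with_errors(status_text):
    """Return config rows that are not ONLINE or have a nonzero counter."""
    flagged = []
    for line in status_text.splitlines():
        parts = line.split()
        # Config rows look like: NAME STATE READ WRITE CKSUM
        if len(parts) == 5 and all(p.isdigit() for p in parts[2:]):
            name, state = parts[0], parts[1]
            counts = [int(p) for p in parts[2:]]
            if state != "ONLINE" or any(counts):
                flagged.append((name, state, *counts))
    return flagged


print("scrub repaired:", scrub_repaired(STATUS))
print("devices with errors:", devices_with_errors(STATUS))
```

So a clean-looking counter table on its own is not the whole story here; the scan line is where the repair shows up.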


SMARTCTL DATA

Code:
[root@freenas ~]# smartctl -a /dev/da0                                                                                    
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p4 amd64] (local build)                                                         
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org                                                        
                                                                                                                                   
=== START OF INFORMATION SECTION ===                                                                                               
Vendor:               WDC                                                                                                          
Product:              WD30EFRX-68AX9N0                                                                                             
Revision:             R001                                                                                                         
User Capacity:        3,000,592,982,016 bytes [3.00 TB]                                                                            
Logical block size:   512 bytes                                                                                                    
Rotation Rate:        10000 rpm                                                                                                    
Logical Unit id:      0x001b4d2021866207                                                                                           
Serial number:        WD-WMC1T1268267                                                                                              
Device type:          disk                                                                                                         
Transport protocol:   Fibre channel (FCP-2)                                                                                        
Local Time is:        Mon Aug 18 13:35:32 2014 PDT                                                                                 
SMART support is:     Available - device has SMART capability.                                                                     
SMART support is:     Enabled                                                                                                      
Temperature Warning:  Disabled or Not Supported      
 

Any help is appreciated.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
If you got no other output than that from smartctl -a /dev/da0, then you are right: Areca has just fscked you. I know two people who used Areca 1882s, and both contacted me because they lost their pools. They probably wouldn't have lost their pools had SMART been able to function.

So what's my advice?

1. Replace da0 and resilver.
2. Ditch that Areca card ASAP and get something appropriate for FreeNAS and ZFS.
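
For step 1, the pool lists its members by gptid while the kernel log names the disk da0, so the two have to be matched up before anything is replaced. On FreeBSD, `glabel status` prints that mapping. Below is a minimal sketch of reading it, assuming the usual three-column Name/Status/Components layout. The sample text is hypothetical: the gptid labels are taken from the zpool status in this thread, but which label sits on which disk is invented for the example, and `labels_for_disk` is an illustrative name.

```python
import re

# Hypothetical `glabel status` output. The labels are from this thread's
# zpool status; the label-to-disk pairing is an assumption for illustration.
SAMPLE_GLABEL = """\
                                      Name  Status  Components
gptid/384b7f43-09e7-11e4-9d88-b8975a0084f0     N/A  da0p2
gptid/38f0cbe2-09e7-11e4-9d88-b8975a0084f0     N/A  da1p2
"""


def labels_for_disk(glabel_text, disk):
    """Return the labels whose component is `disk` or one of its partitions."""
    pattern = re.compile(re.escape(disk) + r"(p\d+)?")
    labels = []
    for line in glabel_text.splitlines():
        parts = line.split()
        # Data rows have exactly three columns: Name, Status, Components.
        if len(parts) == 3 and pattern.fullmatch(parts[2]):
            labels.append(parts[0])
    return labels


print(labels_for_disk(SAMPLE_GLABEL, "da0"))
```

The full match on the component keeps da0 from also matching da10 and so on. On FreeNAS the replacement itself is normally done through the GUI; this only helps identify which pool member is the suspect disk.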

You aren't alone with the "I don't have the money to replace it". What surprises me is that when you lose your pool your priorities change... rapidly. One person paid for data recovery (which isn't cheap) and he could have fixed *every* one of his issues with his server for less than the cost of recovery.

Now the crappy part is that you don't know if 4 other disks are failing or not. You might resilver and find you now have corrupted files. Not fun at all.

Good luck!
 

runnin17

Dabbler
Joined
May 24, 2014
Messages
13
Well, the data gets weekly backups, and it is just a home file server. I learned my lesson about backups in the past.

Anyone have any recommendations on decent controller cards that are not the M1015? I could do that, but I would probably have to add on an SAS expander too.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
The M1015 is pretty much the best bang for your buck you're going to find. The M1115 works just as well and is about the same price.
 

runnin17

Dabbler
Joined
May 24, 2014
Messages
13
This is a noobish question, but are people just using multiple M1015s? It seems like I can either use 2-3 M1015s or try my luck with a SAS expander.
After some quick googling, though, it does seem like the M1015 is the best option for the price. Since it is a simple file server, I am not too concerned with blazing speed; it just has to work and give me the correct data :D
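
The card-count question is just arithmetic. An M1015 has two SFF-8087 connectors with four lanes each, so eight drives per card when attached directly. A small sketch, assuming one drive per lane and no expander (`hbas_needed` is an illustrative name):

```python
import math

PORTS_PER_M1015 = 8  # two SFF-8087 connectors, four lanes each


def hbas_needed(drives, ports_per_hba=PORTS_PER_M1015):
    """Cards needed to attach every drive directly, with no SAS expander."""
    if drives < 0:
        raise ValueError("drive count cannot be negative")
    return math.ceil(drives / ports_per_hba)


for count in (7, 16, 24):
    print(f"{count} drives -> {hbas_needed(count)} card(s) without an expander")
```

So the seven drives in this pool fit on a single card; multiple cards or an expander only come into play past eight drives.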
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I use an M1015 + Intel SAS expander. I've got 24 WD Green drives and it's been amazing for over 2 years.
 