ASRock C2550D4I drive detaching


Codo

Cadet
Joined: May 20, 2012
Messages: 8
A few months ago I bought an ASRock C2550D4I for use in my FreeNAS server. On paper it looked like everything I wanted in a NAS board, but it has turned out to be the biggest headache of my life. Not knowing any better, I originally used the six Marvell-controlled SATA ports for my six HDDs and immediately had all sorts of problems. After a bunch of research here I discovered that the Marvell controllers don't play well with FreeNAS, so I switched my drives to the Intel SATA ports. That resolved most of the problems, except one: whenever heavy load hits the array (scrubs and resilvers cause it every time), the drive connected to SATA_5 (the last blue SATA port on the I/O side of the board) detaches and is listed as REMOVED in zpool status. The drive is still there; I can run SMART tests on it while the pool says it's gone, and it comes back if I reboot, but as soon as I run a scrub to bring it back in sync with the array it detaches again. I've swapped it with another drive in the array and the problem follows the port, not the drive, and changing the cable made no difference.
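For reference, the checks described above were roughly along these lines (a sketch only; the pool name and device node are taken from the logs in this thread, so adjust for your own system):
Code:
# pool health; the dropped drive shows up as REMOVED
zpool status -v datas

# short SMART self-test on the drive the pool says is gone,
# then dump the results (ada3 is the device from the log below)
smartctl -t short /dev/ada3
smartctl -a /dev/ada3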

This is what I get in the log when it happens:
Code:
Apr  7 07:54:25 flatline kernel: ada3 at ahcich13 bus 0 scbus13 target 0 lun 0
Apr  7 07:54:25 flatline kernel: ada3: <WDC WD5000BPKT-00PK4T0 01.01A01> s/n WD-WXB1A7212857 detached
Apr  7 07:54:25 flatline kernel: GEOM_ELI: g_eli_read_done() failed gptid/36d468dc-dcdf-11e4-a681-d050995bd858.eli[READ(offset=201052160, length=32768)]
Apr  7 07:54:25 flatline kernel: GEOM_ELI: g_eli_read_done() failed gptid/36d468dc-dcdf-11e4-a681-d050995bd858.eli[READ(offset=201084928, length=32768)]
Apr  7 07:54:25 flatline kernel: GEOM_ELI: Device gptid/36d468dc-dcdf-11e4-a681-d050995bd858.eli destroyed.
Apr  7 07:54:25 flatline kernel: GEOM_ELI: Detached gptid/36d468dc-dcdf-11e4-a681-d050995bd858.eli on last close.
Apr  7 07:54:32 flatline zfsd: Replace vdev(datas/4755131315324341910) by physical path: Unable to allocate spare target data.
Apr  7 07:54:32 flatline kernel: <118>Apr  7 07:54:32 flatline zfsd: Replace vdev(datas/4755131315324341910) by physical path: Unable to allocate spare target data.
Apr  7 07:55:15 flatline kernel: cam_periph_alloc: attempt to re-allocate valid device ada3 rejected flags 0x118 refcount 2
Apr  7 07:55:15 flatline kernel: adaasync: Unable to attach to new device due to status 0x6
Apr  7 07:55:53 flatline kernel: cam_periph_alloc: attempt to re-allocate valid device ada3 rejected flags 0x118 refcount 2
Apr  7 07:55:53 flatline kernel: adaasync: Unable to attach to new device due to status 0x6
Apr  7 07:56:29 flatline kernel: cam_periph_alloc: attempt to re-allocate valid device ada3 rejected flags 0x118 refcount 2


My question is: is there some setting I need to change that can resolve this, or do I have a bad board that needs to be RMA'd?
 

Codo

Cadet
Joined: May 20, 2012
Messages: 8
I moved the drive from SATA_5 to SATA3_M0 (one of the ports on the Marvell SE9172 controller, which my research suggested should work fine with FreeNAS). As soon as I booted the system and unlocked the array, a resilver started and both the drive on SATA_4 (which had never given me trouble before) and the drive on SATA3_M0 detached, so now I'm really at a loss as to what's going on.
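In case it helps anyone map device names to ports, something along these lines shows which AHCI channel each adaX disk is attached to (standard FreeBSD tools; exact output varies by system):
Code:
# list detected disks together with the controller/channel they sit on
camcontrol devlist -v

# or pull the attachment lines straight from the kernel messages
dmesg | grep -E 'ada[0-9]+ at ahcich'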
 

Codo

Cadet
Joined: May 20, 2012
Messages: 8
Another update: I disabled autotune and removed the tunables it had added. I thought I was getting somewhere, because I was able to "replace" the drive on SATA_4 through the GUI and the system completed the resilver without any other drives detaching. Once that finished, I used the GUI to "replace" the SATA3_M0 drive, but as soon as the resilver started it detached again.
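For what it's worth, my understanding is that the GUI "replace" step boils down to roughly the following on the ZFS side (a sketch only; on an encrypted pool FreeNAS also sets up the geli provider first, and the gptid here is just an example pulled from my logs):
Code:
# re-add the disk in place; with no new device given, ZFS rebuilds
# the same vdev slot and kicks off a resilver
zpool replace datas gptid/f038470f-dd35-11e4-828d-d050995bd858.eli

# watch resilver progress
zpool status -v datas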

This is what I got in the log:
Code:
Apr  7 12:35:59 flatline manage.py: [middleware.exceptions:38] [MiddlewareError: Unable to geli attach gptid/f038470f-dd35-11e4-828d-d050995bd858: geli: Cannot access gptid/f038470f-dd35-11e4-828d-d050995bd858 (error=1).
]
Apr  7 12:36:01 flatline manage.py: [middleware.exceptions:38] [MiddlewareError: Unable to geli attach gptid/6b5733d7-cecd-11e4-955c-d050995bd858: geli: Cannot access gptid/6b5733d7-cecd-11e4-955c-d050995bd858 (error=1).
]
Apr  7 12:36:03 flatline manage.py: [middleware.exceptions:38] [MiddlewareError: Unable to geli attach gptid/6e323c74-dd49-11e4-88fc-d050995bd858: geli: Cannot access gptid/6e323c74-dd49-11e4-88fc-d050995bd858 (error=1).
]
Apr  7 12:36:05 flatline manage.py: [middleware.exceptions:38] [MiddlewareError: Unable to geli attach gptid/6d42c73d-cecd-11e4-955c-d050995bd858: geli: Cannot access gptid/6d42c73d-cecd-11e4-955c-d050995bd858 (error=1).
]
Apr  7 12:36:07 flatline manage.py: [middleware.exceptions:38] [MiddlewareError: Unable to geli attach gptid/6de5e0e5-cecd-11e4-955c-d050995bd858: geli: Cannot access gptid/6de5e0e5-cecd-11e4-955c-d050995bd858 (error=1).
]
Apr  7 12:36:07 flatline notifier: swapoff: /dev/ada0p1.eli: No such file or directory
Apr  7 12:36:07 flatline notifier: geli: No such device: /dev/ada0p1.
Apr  7 12:36:08 flatline notifier: 1+0 records in
Apr  7 12:36:08 flatline notifier: 1+0 records out
Apr  7 12:36:08 flatline notifier: 1048576 bytes transferred in 0.084310 secs (12437141 bytes/sec)
Apr  7 12:36:08 flatline notifier: dd: /dev/ada0: short write on character device
Apr  7 12:36:08 flatline notifier: dd: /dev/ada0: end of device
Apr  7 12:36:08 flatline notifier: 5+0 records in
Apr  7 12:36:08 flatline notifier: 4+1 records out
Apr  7 12:36:08 flatline notifier: 4218880 bytes transferred in 0.079327 secs (53183334 bytes/sec)
Apr  7 12:36:20 flatline kernel: GEOM_ELI: Device gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli created.
Apr  7 12:36:20 flatline kernel: GEOM_ELI: Encryption: AES-XTS 128
Apr  7 12:36:20 flatline kernel: GEOM_ELI:     Crypto: hardware
Apr  7 12:36:25 flatline kernel: GEOM_ELI: Device ada0p1.eli created.
Apr  7 12:36:25 flatline kernel: GEOM_ELI: Encryption: AES-XTS 128
Apr  7 12:36:25 flatline kernel: GEOM_ELI:     Crypto: hardware
Apr  7 12:36:30 flatline kernel: ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
Apr  7 12:36:30 flatline kernel: ada0: <WDC WD5000BPKT-00PK4T0 01.01A01> s/n WD-WXB1A7212857 detached
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34468978688, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34468945920, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34468913152, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34469142528, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34469109760, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34469076992, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34469273600, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34469044224, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34469240832, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Crypto WRITE request failed (error=6). gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli[WRITE(offset=34469208064, length=32768)]
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Device ada0p1.eli destroyed.
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Detached ada0p1.eli on last close.
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Device gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli destroyed.
Apr  7 12:36:30 flatline kernel: GEOM_ELI: Detached gptid/56b5bb38-dd5d-11e4-88fc-d050995bd858.eli on last close.
Apr  7 12:36:30 flatline kernel: (ada0:ahcich0:0:0:0): Periph destroyed
Apr  7 12:36:33 flatline kernel: ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
Apr  7 12:36:33 flatline kernel: ada0: <WDC WD5000BPKT-00PK4T0 01.01A01> ATA-8 SATA 2.x device
Apr  7 12:36:33 flatline kernel: ada0: Serial Number WD-WXB1A7212857
Apr  7 12:36:33 flatline kernel: ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Apr  7 12:36:33 flatline kernel: ada0: Command Queueing enabled
Apr  7 12:36:33 flatline kernel: ada0: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
Apr  7 12:36:33 flatline kernel: ada0: quirks=0x1<4K>
Apr  7 12:36:33 flatline kernel: ada0: Previously was known as ad4
Apr  7 12:36:34 flatline zfsd: Replace vdev(datas/12839381435165279523) by physical path: Unable to allocate spare target data.
Apr  7 12:36:35 flatline kernel: <118>Apr  7 12:36:34 flatline zfsd: Replace vdev(datas/12839381435165279523) by physical path: Unable to allocate spare target data.


HELP!!
 

cyberjock

Inactive Account
Joined: Mar 25, 2012
Messages: 19,525
You have something very wrong. No clue what, but I'd try a BIOS and IPMI update. Barring that, try a spare PSU to make sure this isn't because of crappy power.
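(If it helps, the currently installed BIOS revision can usually be checked from a running FreeBSD system without rebooting, assuming the SMBIOS kenv variables are populated:)
Code:
# show the BIOS vendor, version and release date the board booted with
kenv | grep smbios.bios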
 

Codo

Cadet
Joined: May 20, 2012
Messages: 8
Thanks for the reply. After spending the night backing up my data, I replaced the power supply with one rated for 50% more wattage, and so far so good. I'm rebuilding one of the problem drives now, but I've been here before, so I'm withholding judgement until both are rebuilt and problem-free for a while. I believe I already have the latest BIOS/BMC versions; do you think it would be worth reflashing anyway, or does it only matter that I'm on the latest versions?
 

cyberjock

Inactive Account
Joined: Mar 25, 2012
Messages: 19,525
I never try to reflash. They have their own CRCs that fail if things are really broken.
 

Codo

Cadet
Joined: May 20, 2012
Messages: 8
Both problem drives have been rebuilt, and the array has survived a scrub and a performance test and has been running for about six hours now, so if nothing else I feel like I'm on to something with the replacement power supply. Thank you again for your attention; I'll reply to this thread again if the problems resurface.
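For completeness, the load test was essentially a manual scrub plus a rough sequential write, along these lines (the /mnt/datas path assumes the default FreeNAS mountpoint for my pool):
Code:
# sustained read load across every disk in the pool
zpool scrub datas
zpool status -v datas

# rough sequential write test, then clean up
# (zeroes compress on an lz4 pool, so treat this as a smoke test only)
dd if=/dev/zero of=/mnt/datas/ddtest bs=1m count=8192
rm /mnt/datas/ddtest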
 