JayG30
Hello,
Today I noticed that one of my FreeNAS servers was in a degraded state. I found out a bit late, it seems, because my email client moved the alert messages to "Clutter" (sigh). Anyway, I'm just trying to determine whether anyone sees something other than a disk issue.
When I logged into the web GUI (and initially in zpool status on the CLI), the disk showed a few hundred write errors.
dmesg showed:
Code:
(da5:mps0:0:13:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 555 command timeout cm 0xffffff8000b02718 ccb 0xfffffe004101f000
(noperiph:mps0:0:4294967295:0): SMID 1 Aborting command 0xffffff8000b02718
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 93 50 00 00 40 00 length 32768 SMID 337 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 93 10 00 00 40 00 length 32768 SMID 363 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 92 d0 00 00 40 00 length 32768 SMID 841 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 92 90 00 00 40 00 length 32768 SMID 220 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 92 50 00 00 40 00 length 32768 SMID 748 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 92 10 00 00 40 00 length 32768 SMID 321 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 91 d0 00 00 40 00 length 32768 SMID 515 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 91 90 00 00 40 00 length 32768 SMID 745 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 91 50 00 00 40 00 length 32768 SMID 868 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 10 74 8e a0 00 00 40 00 length 32768 SMID 632 terminated ioc 804b scsi 0 state c xfer 0
(da5:mps0:0:13:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 466 terminated ioc 804b scsi 0 state c xfer 0
mps0: IOCStatus = 0x4b while resetting device 0xf
(da5:mps0:0:13:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
(da5:mps0:0:13:0): CAM status: Command timeout
(da5:mps0:0:13:0): Retrying command
da5 at mps0 bus 0 scbus0 target 13 lun 0
da5: <ATA TOSHIBA MG03ACA3 FL1A> s/n 53K7K7JPF detached
(da5:mps0:0:13:0): Periph destroyed
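For anyone who wants to tally messages like these rather than count them by eye, here is a minimal Python sketch. It assumes the dmesg output has been saved to a text file with one message per line (for example with `dmesg > dmesg.txt`) and that the lines follow the same layout as the excerpt above. The names `summarize_cam_errors` and `detached_devices` are hypothetical helpers, not part of FreeNAS or any library.

```python
import re
from collections import Counter

# Matches failed CAM commands in the layout seen above, e.g.
# "(da5:mps0:0:13:0): WRITE(10). CDB: 2a 00 ... terminated ioc 804b ..."
CAM_LINE = re.compile(
    r"^\((?P<dev>[a-z]+\d+):[a-z]+\d+:\d+:\d+:\d+\):\s+"
    r"(?P<op>[A-Z ]+\(\d+\))\.\s+CDB:.*\b(?P<result>terminated|command timeout)\b"
)

# Matches detach notices, e.g.
# "da5: <ATA TOSHIBA MG03ACA3 FL1A> s/n 53K7K7JPF detached"
DETACH_LINE = re.compile(
    r"^(?P<dev>[a-z]+\d+): <[^>]*> s/n (?P<serial>\S+) detached"
)


def summarize_cam_errors(dmesg_text):
    """Count failed commands per (device, operation, result)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = CAM_LINE.match(line.strip())
        if match:
            key = (match.group("dev"), match.group("op"), match.group("result"))
            counts[key] += 1
    return counts


def detached_devices(dmesg_text):
    """Return (device, serial) for every device the kernel reported as detached."""
    found = []
    for line in dmesg_text.splitlines():
        match = DETACH_LINE.match(line.strip())
        if match:
            found.append((match.group("dev"), match.group("serial")))
    return found
```

Passing the contents of the saved file to `summarize_cam_errors` gives a per-device count of terminated and timed-out commands, and `detached_devices` lists which disks dropped off the bus. This only summarizes the log; it does not say whether the disk, cable, backplane, or controller is at fault.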
The volume initially showed the disk as unavailable:
Code:
[root@freenas] ~# zpool status -v store
  pool: store
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0h26m with 0 errors on Sun Jul 19 00:26:39 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        store                                           DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/1c383e96-d315-11e4-98c7-0cc47a335ac4  ONLINE       0     0     0
            gptid/90b50eaf-d315-11e4-98c7-0cc47a335ac4  ONLINE       0     0     0
            gptid/284a6fc3-d316-11e4-98c7-0cc47a335ac4  ONLINE       0     0     0
            gptid/c66e0391-d317-11e4-98c7-0cc47a335ac4  ONLINE       0     0     0
            gptid/14a02475-d318-11e4-98c7-0cc47a335ac4  ONLINE       0     0     0
            559548462891584750                          UNAVAIL      3   246     0  was /dev/gptid/5178ef38-d319-11e4-98c7-0cc47a335ac4
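As a rough way to pull the problem rows out of output like this, for instance from a cron job instead of relying on alert emails, here is a minimal Python sketch. It assumes the text of `zpool status` has been captured with its original line breaks and column layout (NAME, STATE, READ, WRITE, CKSUM). `unhealthy_vdevs` is a hypothetical helper name, and the counters are kept as strings because some ZFS versions abbreviate large values.

```python
def unhealthy_vdevs(status_text):
    """Return (name, state, read, write, cksum) for each config row not ONLINE.

    Assumes the usual 'zpool status' layout: a 'config:' line, a header row
    starting with NAME, then one row per pool/vdev/device.
    """
    rows = []
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("config:"):
            in_config = True
            continue
        if not in_config or not stripped:
            continue
        if stripped.startswith("errors:"):
            break  # end of the config table
        fields = stripped.split()
        # Skip the header row and section labels such as "spares" or "logs".
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        name, state, read, write, cksum = fields[:5]
        if state != "ONLINE":
            rows.append((name, state, read, write, cksum))
    return rows
```

For the output above, this would report the pool, the raidz2-0 vdev, and the missing member, since all three are in a non-ONLINE state. Filtering further (for example, leaf devices only) is left out to keep the sketch short.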
I rebooted the server, but nothing changed.
So I had someone on site remove the disk for me, first to get the S/N and second to see whether I could online it and have it rebuild itself. After it was removed, the status of the disk changed to "removed". Subsequent reboots of the server have made the volume show as "resilvering", but the disk never came online, even after I tried to force it online with the zpool online command.
The disk is now back to showing "unavailable" as shown above.
EDIT: it seems the disk is now back to showing "removed". It seems to be unavailable on initial reboot while it resilvers and then goes to "removed".
Furthermore, I can't even see the disk in smartctl. It seems to be getting removed, per the dmesg output shown above: "(da5:mps0:0:13:0): Periph destroyed". I had hoped to check the smartctl readings, but I can't since the disk isn't showing up at all.
My gut says the disk went bad. I filed an RMA for it and will go down to check it tomorrow. But perhaps someone might have an idea.