Disk became unavailable after offline/online operation

Blai Bonet · Apr 29, 2015

Hi all,

I wanted to find out the physical location of all my drives in FreeNAS, since I didn't record them when I built the box. I decided to do the following for each disk in the raidz2 volume:

1. offline the device
2. run dd on the resulting (degraded) system and watch the disk LEDs to see which one is inactive
3. pull out the disk (it is a hot-swappable system) and write down the serial number
4. online the device again and wait for resilvering

Everything was working OK. However, unexpectedly, after step 4 (onlining the device), one disk became unavailable. I got the following message:

warning: device 'gptid/562624fb-e3ec-11e4-b1f3-d0509951786c' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present

Since the system and disks are new (less than 1 month old), I think they are OK and this is a software issue. What can I do? Can I re-format/re-label the disk, pretend it is a brand new disk, and install it? Is there a way to test whether the disk is really bad? Should I reboot? Any suggestions are welcome.

Cheers

Blai

SweetAndLow · Apr 29, 2015

Why didn't you just shut down and look at the drives to get the serial numbers?

It looks like you might have an issue with one of your drives. Did you burn them in? What does the SMART data report for the drive?
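For readers following along, SMART identity data (including the serial number) can usually be read without pulling a disk. The sketch below is only an illustration: it assumes the ATA-style `smartctl -i` output, where the serial appears on a line starting with `Serial Number:`, and the device name `/dev/ada0` in the usage comment is a hypothetical example.

```shell
#!/bin/sh
# Extract the serial number from `smartctl -i` text supplied on stdin.
# Assumption: the output contains a line of the form "Serial Number:    <value>".
serial_of() {
    awk -F': *' '/^Serial Number:/ { print $2; exit }'
}

# Typical use on a live system (device name is a hypothetical example):
#   smartctl -i /dev/ada0 | serial_of
```

If smartctl asks for a device type with -d, as happens later in this thread, the function simply prints nothing because no serial line is present.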

Blai Bonet · Apr 29, 2015

Smartctl didn't recognise the device. It asked me to specify a type with -d, but I didn't know which type to fill in.

HOWEVER, I just rebooted the box and voila! glabel status shows the device again, smartctl is also working, and zpool online successfully joined the disk to the existing pool. It was something weird, but everything seems back to normal.

Any thoughts?

SweetAndLow · Apr 29, 2015

Why are you offlining the disk in the first place? How is that different from just looking at the serial number printed on the disk?

Blai Bonet · Apr 29, 2015

If I don't offline, I cannot extract the disks safely. The other option would be to shut down the box and read the serials one by one. I wanted to test the hot-swap capability along the way. However, something went wrong. Could it be that the SATA controller is not handling hot swapping correctly? Is there a way to see information about this in FreeNAS?

Apollo · Apr 29, 2015

We need details on your FreeNAS box.
Offlining one drive, putting it back, and repeating this on all the other drives might corrupt your most recently saved files if they were written while a disk was offline.
You need to run a scrub to resynchronize and check your pool. You should expect some resilvering to take place, but only for the files written after the offlining.

Hot swapping is dependent on the SATA settings, i.e. AHCI, IDE or RAID.
You should have it set to AHCI.
Are you using an HBA controller? It may not be supported by FreeNAS, or it may have other issues.

Blai Bonet · Apr 29, 2015

The motherboard is an AsRock C2550D4I. I checked the BIOS and all the controllers are in AHCI mode except the Marvell SE9230, which I believe is in JBOD mode (there is no option to set it to AHCI mode). I have 6 1TB WD REDs. I am now running the scrub on the volume tank with all 6 disks (I was able to online the faulty disk). However, the scrub shows some cksum errors:

  pool: tank
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub in progress since Wed Apr 29 22:11:45 2015
        355G scanned out of 574G at 302M/s, 0h12m to go
        2.76M repaired, 61.85% done

config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/5531e3b3-e3ec-11e4-b1f3-d0509951786c  ONLINE       0     0     0
            gptid/55ac0db0-e3ec-11e4-b1f3-d0509951786c  ONLINE       0     0     0
            gptid/562624fb-e3ec-11e4-b1f3-d0509951786c  DEGRADED     0     0   323  too many errors (repairing)
            gptid/56ab1b29-e3ec-11e4-b1f3-d0509951786c  ONLINE       0     0     0
            gptid/572600d3-e3ec-11e4-b1f3-d0509951786c  ONLINE       0     0     0
            gptid/57a0fe7c-e3ec-11e4-b1f3-d0509951786c  ONLINE       0     0     0

errors: No known data errors

I had the problem with the device that is now in degraded state.
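When a pool has many members, the status output above can be filtered down to just the devices with a non-zero CKSUM counter. This is a minimal sketch, assuming the usual `NAME STATE READ WRITE CKSUM` column layout; the function name `cksum_errors` is made up for illustration.

```shell
#!/bin/sh
# List pool members whose CKSUM counter is non-zero.
# Reads `zpool status` text on stdin and prints "<device> <cksum>" per line.
cksum_errors() {
    awk '
        # Device rows follow the "NAME STATE READ WRITE CKSUM" header.
        $1 == "NAME" && $5 == "CKSUM" { in_config = 1; next }
        # The "errors:" line ends the config section.
        $1 == "errors:" { in_config = 0 }
        # Keep rows whose fifth column is a counter other than "0".
        in_config && NF >= 5 && $5 ~ /^[0-9]/ && $5 != "0" { print $1, $5 }
    '
}

# Typical use on a live system (pool name taken from this thread):
#   zpool status tank | cksum_errors
```

Run against the output in this post, it would print only the gptid/562624fb... device with its count of 323.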

Any advice?

Thanks!

Apollo · Apr 29, 2015

Just as I expected. When you offline the disk it is not part of the redundancy anymore, but ZFS will try to maintain the data. So it is my understanding that the 2.76M repaired belongs to files that were saved on the pool while the drive was offline, and so never reached that drive.
Just don't offline the drives one at a time, as it can lead to more serious issues in the future. You take the chance of losing your pool and all the data on it.
JBOD mode may be the reason you are not getting SMART data.
Resilvering should now be done, or nearly so.
To clear the pool status, do:

Code:

 zpool clear tank

To be safe, run a scrub once more.
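Put together, the clear-then-scrub sequence might look like the sketch below. The `run` helper and the `DRY_RUN` switch are assumptions added here so the commands can be previewed before touching a pool; only `zpool clear`, `zpool scrub` and `zpool status` come from ZFS itself.

```shell
#!/bin/sh
# Clear error counters, start a fresh scrub, then show the pool status.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

clear_and_scrub() {
    pool=$1
    run zpool clear "$pool" &&
    run zpool scrub "$pool" &&
    run zpool status "$pool"
}

# Preview only (prints the three commands without touching the pool):
#   DRY_RUN=1; clear_and_scrub tank
```

Note that `zpool scrub` returns as soon as the scrub has started, so the final status call shows progress rather than the result; check the status again once the scrub has finished.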

Blai Bonet · Apr 29, 2015

The previous scrub ended with 934 cksum errors and 21M repaired. I did the clear and now another scrub is running. Hopefully everything will be OK.

Did this happen because of my hardware? Was it something likely to happen with any hardware?

BTW, I read elsewhere that one strategy to increase the size of your volume when you don't have room for new disks is to replace each disk in the volume with bigger disks, one by one, waiting for resilvering to complete before moving to the next one. Supposedly, at the end, the whole volume grows to the capacity of the new disks. What do you think about this?

Apollo · Apr 29, 2015

Blai Bonet said:
The previous scrub ended with 934 cksum errors and 21M repaired. I did the clear and now another scrub is running. Hopefully everything will be OK.

Did this happen because of my hardware? Was it something likely to happen with any hardware?

Read my previous post, the answer is there.
Offlining as you did is bad. Hardware is not at fault.
Just a user-inflicted punishment.

BTW, I read elsewhere that one strategy to increase the size of your volume when you don't have room for new disks is to replace each disk in the volume with bigger disks, one by one, waiting for resilvering to complete before moving to the next one. Supposedly, at the end, the whole volume grows to the capacity of the new disks. What do you think about this?

I have never tried this myself, but I have read about it. One of the conditions is that the ZFS option allowing pool expansion must be set.
The reason you need to resilver (via a forced scrub) is so that ZFS can reconstruct the data on the new drive from the data and parity blocks on the remaining pool drives.
In your case, the data was intact except for files that were being written during, or after, the offlining. So when you scrub the pool, ZFS has no need to replace all the data on the reinserted drive, as it validates each block against the corresponding parity on the other drives. Only blocks that fail the check are updated. With hardware RAID, by contrast, a newly reinserted drive undergoes a full RAID synchronization, which means the whole drive has to be written all over again.
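As a rough illustration of the one-by-one replacement strategy asked about above, the sketch below only prints a plan and executes nothing. It assumes the pool-expansion option referred to is the `autoexpand` pool property, and that replacements are done with `zpool replace`; on FreeNAS, disk replacement is normally done through the GUI, so treat these as the underlying ZFS commands rather than a recommended procedure. The function name and the OLD:NEW argument format are invented for the example.

```shell
#!/bin/sh
# Print the commands for growing a pool by swapping in larger disks one at a time.
# Nothing is executed; the output is a plan to review.
# usage: grow_plan POOL OLD1:NEW1 [OLD2:NEW2 ...]
grow_plan() {
    pool=$1
    shift
    # autoexpand lets the pool use the extra space once every member is larger.
    echo "zpool set autoexpand=on $pool"
    for pair in "$@"; do
        old=${pair%%:*}   # text before the first colon
        new=${pair#*:}    # text after the first colon
        echo "zpool replace $pool $old $new"
        echo "# wait until 'zpool status $pool' shows the resilver has completed"
    done
}

# Example with hypothetical device names:
#   grow_plan tank gptid/OLD-1:gptid/NEW-1 gptid/OLD-2:gptid/NEW-2
```

The wait between replacements is the important part: starting the next replacement before the resilver finishes reduces the redundancy available during the rebuild.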

mjws00 · Apr 29, 2015

The real problem is how swap is handled. The disk can get locked up, as ZFS does not always release the GEOM provider correctly (https://bugs.pcbsd.org/issues/493). So when you try to online it, it can remain faulted. The GUI does try to handle swap gracefully when you offline via the button. I haven't read the code for the new online option. The good news is that a reboot will fix it.

In theory your actions are fine. In practice, you can see there is more to the story. The ultra-conservatives will always have you shut down to replace or offline/online a disk. Personally, I want to know that my server can hot-swap properly. An LSI HBA and a Supermicro chassis don't have any issues whatsoever. A Marvell controller in an unspecified chassis? Who can say.

I've hit your error on good gear, so I would be slow to blame the board. That said, I'd only trust the Intel controllers on that board. If you are just testing, you can temporarily disable swap with 'swapoff -a' and re-enable it afterwards with 'swapon -a'. In a perfect world swap should never be used ;).
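For a test cycle along those lines, the order of operations might look like the sketch below. It only prints the commands. The ordering (swap off, offline, online, swap on) is an assumption based on the suggestion above, not a verified FreeNAS procedure, and the function name is invented for the example.

```shell
#!/bin/sh
# Print an offline/online test cycle for one pool member with swap disabled.
# Nothing is executed; the output is a plan to review.
# usage: offline_online_plan POOL DEVICE
offline_online_plan() {
    pool=$1
    dev=$2
    echo "swapoff -a"                    # release swap so the disk is not held busy
    echo "zpool offline $pool $dev"
    echo "# identify or pull the disk here, then reinsert it"
    echo "zpool online $pool $dev"
    echo "swapon -a"                     # restore swap once the disk is back
    echo "zpool status $pool"            # confirm the device state and any resilver
}

# Example, using the device from this thread:
#   offline_online_plan tank gptid/562624fb-e3ec-11e4-b1f3-d0509951786c
```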

I'm hoping that one day, since we have larger boot devices and ZFS, we can get swap off our data pools. BSD and our controllers and hardware have had hot-swap right for a long time. Thankfully it looks like the thought has crossed the devs' minds as well. https://bugs.pcbsd.org/issues/9235#note-3

You are supposed to be doing these sorts of things during burn-in and the "get to know the intricate details of your hardware" phase. Not on live data. Good luck. Play safe.

Apollo · Apr 30, 2015

mjws00 said:
You are supposed to be doing these sorts of things during burn-in and the "get to know the intricate details of your hardware" phase. Not on live data. Good luck. Play safe.

The trouble is that you may have done it once during the burn-in process, but when you need it again it could be months or years later. What are the odds you will remember the procedure if you do not keep up with it?
