Disk replace has led to checksum errors, hot spare resilver, multiple sets of mirrors


fbtech

Cadet
Joined
May 29, 2017
Messages
3
Hi all,

Help! We currently have a spectacular zpool status which contains the following:

Code:
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 4.47T in 17h31m with 0 errors on Thu Nov 16 14:28:25 2017
config:

        NAME                                              STATE     READ WRITE CKSUM
        tank                                              DEGRADED     0     0     0
        ...
          mirror-3                                        DEGRADED     0     0     0
            replacing-0                                   DEGRADED     0     0   111
              spare-0                                     DEGRADED     0     0    95
                gptid/6295ae4e-2bbe-11e7-a503-002590c8d294  DEGRADED   0     0   157  too many errors
                gptid/81874caf-2bdf-11e7-a503-002590c8d294  ONLINE     0     0   266
              gptid/244f1f47-c9d8-11e7-ab2e-002590c8d294  ONLINE       0     0   311
            gptid/0e5ae824-2bda-11e7-a503-002590c8d294    ONLINE       0     0     0
            gptid/6402f3f5-2bbe-11e7-a503-002590c8d294    ONLINE       0     0     0
        spares
          5571117140374537679                             INUSE     was /dev/gptid/81874caf-2bdf-11e7-a503-002590c8d294
        ...
errors: No known data errors


We attempted to replace the disk gptid 6295ae4e-2bbe (/dev/da8) using the GUI "Replace" button.

We think this has happened because:
- we had turned on autoreplace for hot spares with zpool set autoreplace=on tank
- we didn't offline the disk before selecting "Replace" in the FreeNAS GUI.
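
For reference, the current setting can be checked with the standard zpool property command (a sketch, assuming the pool is named tank as in the status above):

Code:
zpool get autoreplace tank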

The replacing-0 vdev is the result of a FreeNAS GUI "Replace" of disk gptid 6295ae4e-2bbe (/dev/da8) with an available disk, gptid 244f1f47-c9d8 (da22).

Around two hours later, we got this in an alert email: "Device: /dev/da8 [SAT], 65527 Currently unreadable (pending) sectors". So it seems ZFS detected da8 as failing and pulled in the spare with gptid 81874caf-2bdf (da21) during the replacement resilver. da8 was marked "too many errors" in zpool status, a marking that had not appeared before this point.

The resilver has now finished, and da8 is showing as online. We cannot detach or offline da8 in the GUI, perhaps because the 'replacing' operation is still pending despite the resilver showing as finished?
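
One way to check is the full status output; the replacing-0 vdev should disappear on its own once the replace operation actually completes (standard command, shown as a sketch):

Code:
zpool status -v tank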

We are wondering what people think is the safest way to get this back to a clean 3-way mirror, perhaps by removing all of the disks with checksum errors and doing a zpool attach of a freshly formatted disk. What are the risks from all of these checksum errors, and from the fact that it tried to resilver from a bad disk?

Hardware: Supermicro server, FreeNAS-9.10.2-U3 (e1497f269), Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz, 98241MB of enterprise ECC RAM, with Hitachi 3TB drives and a WD Black 4TB in each mirror.

Thanks!
Michael
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Not trying to give you a ration, just so you know for next time. Since you have a 3-way mirror, you should have just offlined the bad drive and removed it, and if you want to manually replace a drive, you must turn autoreplace off first.
You will need to use the command line to fix this.
Are you familiar with how to access the terminal with SSH and the zfs/zpool commands?

Sent from my SAMSUNG-SGH-I537 using Tapatalk
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Code:
gptid/6295ae4e-2bbe-11e7-a503-002590c8d294  DEGRADED  0  0  157  too many errors
gptid/81874caf-2bdf-11e7-a503-002590c8d294  ONLINE    0  0  266
gptid/244f1f47-c9d8-11e7-ab2e-002590c8d294  ONLINE    0  0  311
To me, it looks like all three of those disks are potentially bad.
I would pull them all out, connect them to another system, and run extended diagnostics on them, and only use them again if they pass the diagnostics, which would include totally wiping the disks.

After you pull those three disks, if you want to keep the 3-way mirror, you will need to attach a replacement disk.

The command for removing a disk from a mirror is this:
zpool detach your-pool your-device
In your situation, I am not sure that ZFS will let you detach the devices. It looks like an operation may still be in progress.
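
Assuming ZFS allows it, the detaches would presumably look like this in your case, using the gptids from your status output (a sketch, not tested against your pool, so verify against zpool status first):

Code:
zpool detach tank gptid/6295ae4e-2bbe-11e7-a503-002590c8d294
zpool detach tank gptid/81874caf-2bdf-11e7-a503-002590c8d294
zpool detach tank gptid/244f1f47-c9d8-11e7-ab2e-002590c8d294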
You have a valid 2-way mirror with
Code:
  gptid/0e5ae824-2bda-11e7-a503-002590c8d294  ONLINE  0  0  0
  gptid/6402f3f5-2bbe-11e7-a503-002590c8d294  ONLINE  0  0  0
I don't think your data is at risk, but these drives are questionable, even the spare. You shouldn't get errors like that.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
What are the drives, and what is the controller? What is the memory, and have you tested it? That set of checksum errors is pretty phenomenal and indicates something is wrong in the system. da8 is probably in bad shape, or it could be a sign that you have bad power.
 

fbtech

Cadet
Joined
May 29, 2017
Messages
3
Hi Chris and rs,
Thanks very much! da8 is apparently in very bad shape: 65527 unreadable (pending) sectors. We have dual power supplies, and haven't seen any power issues on any other machines in our graphs, so we don't think it is a power problem. We checked the RAM with Memtest86, and it has been running for around a year without issues, usually 90% full for ARC; the sticks are Samsung 16GB 1333 MT/s DR x4 PC3L-10600 ECC Registered 240-pin RDIMMs. The HBA is an LSI SAS9211-8i in IT mode.
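
For anyone curious, the pending-sector count came from SMART and can be confirmed directly with smartctl (a sketch; the exact attribute table varies by drive vendor):

Code:
smartctl -a /dev/da8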

I am wondering whether to scrub the pool before pulling the disks? It has been around 30 days since the last scrub. We ran a sas2ircu command to blink a disk light not long ago and the machine hard-rebooted, so we don't use sas2ircu any more; maybe that is indicative of a controller issue.

The disks are WD Yellow Datacenter 3TB, and the new disks/spares are WD Gold Datacenter 4TB, pretty much only 5 months old, so they are probably not bad disks. I think it resilvered from da8, which has led to these checksum errors, but we will pull them for sure, thanks.

On another system with 3 mirrors I have offlined a disk from the command line before replacement using zpool commands, but I am not confident what order to pull the disks in this situation. Is there anything worrying with:
- turn off autoreplace
- offline and detach da8 through the command line
- if it complains, offline the hot spare it pulled in and try to offline da8 again

Process:
Code:
zpool set autoreplace=off tank

For each disk:
Code:
zpool offline tank /dev/gptid/6295ae4e-2bbe-11e7-a503-002590c8d294
zpool status
zpool detach tank (ID of disk)


If it doesn't let me offline, offline the hot spare it pulled in instead (gptid/81874caf-2bdf) and then try to offline da8 again.
3. Take out the disk, do the same to gptid/6295ae4e-2bbe-11e7-a503-002590c8d294 (the failed da8) and then gptid/244f1f47-c9d8-11e7-ab2e-002590c8d294, and burn them in / check them.
4. Force-add a new drive to the mirror by resilvering to the new disk with zpool attach. The first disk in the command below is the healthy disk to resilver from; the second disk is the NEW disk:
Code:
zpool attach tank gptid/0e5ae824-2bda.. gptid/new_id


All the best,
Michael
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
What is the firmware on the LSI? I remember reading that a certain version has problems.

If you have a Seagate, I would try that as a replacement. I wouldn't bother with offlining any drive unless the temporary lack of disk-light activity helps you identify da8.
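
If I remember right, the sas2flash utility ships with FreeNAS for the SAS2 HBAs, so something like this should print the firmware version (a sketch, assuming the utility is present on your system):

Code:
sas2flash -list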
 

fbtech

Cadet
Joined
May 29, 2017
Messages
3
So in this situation, all that was needed was a zpool detach of the faulty disk da8; this also removed the hot spare gptid 81874caf-2bdf from the pool.
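
For reference, the detach would have looked roughly like this, using da8's gptid from the earlier status output (a sketch, not a copy-paste of our exact session):

Code:
zpool detach tank gptid/6295ae4e-2bbe-11e7-a503-002590c8d294

Afterwards we were back to a 3-way mirror, with no checksum errors listed for any disk in the pool: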

Code:
  mirror-3                                      ONLINE  0  0  0
    gptid/244f1f47-c9d8-11e7-ab2e-002590c8d294  ONLINE  0  0  0
    gptid/0e5ae824-2bda-11e7-a503-002590c8d294  ONLINE  0  0  0
    gptid/6402f3f5-2bbe-11e7-a503-002590c8d294  ONLINE  0  0  0

A scrub later ran and just reported "Repaired: 384K", with no new checksum/read/write errors.
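
For anyone repeating this, a scrub can be started and then monitored with the standard commands:

Code:
zpool scrub tank
zpool status tank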

- Since we set autoreplace=on behind FreeNAS's back, we should only ever have offlined a bad disk and waited for the hot spare to be pulled in, never hit "Replace" on a bad disk while autoreplace=on (see the sketch below).
- Hitting "Replace" in the GUI without offlining the disk first appears to keep the old disk in the pool, which is scary.
- The checksum errors probably just came up during the resilver from a very bad disk, but there could be other causes.
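
A minimal sketch of that offline-and-wait workflow (placeholder gptid; assumes autoreplace=on and a configured hot spare):

Code:
# offline the failing disk; the hot spare is pulled in automatically
zpool offline tank gptid/<failing-disk>
# wait for the spare resilver to finish
zpool status tank
# then remove the failed disk from the pool
zpool detach tank gptid/<failing-disk>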

Thanks!
Michael
 