SOLVED Some issues while expanding my pool (RAIDZ2, 9.3)

SkyMonkey

Contributor
Joined
Mar 13, 2013
Messages
102
Hello,

Forgive the long post, but I'll give a complete synopsis of what is going on, with all the data I have, to hopefully provide enough info for some insight into things...

I've been running a 9.3.x system (FreeNAS-9.3-STABLE-201605170422) for years with no issues. The pool was getting quite full (~92%), and drives have about 50k hours on them, so I elected to replace them. I don't have a spare port in this system (see specs below), and thus, I have been following the process to replace one drive at a time, to eventually get an expanded pool when I finish the replacement.

I've had an issue that's making me a bit nervous. I have all vital data backed up, so losing the pool would be an inconvenience, not a disaster, but I'd really prefer to not have to restore everything from backup.

The pool is a 6-disk raidz2 made of 2TB WD Reds, which I'm replacing with 4TB WD Reds. Each replacement drive has been burned in with SMART short/conveyance/long tests, followed by a 2-pass (sequential then random) write/read surface scan using Hard Disk Sentinel on a Win7 box, then another SMART long test.

First disk replacement went fine (offlined in GUI, shutdown, remove/replace, boot, replace device, resilver).

Second disk replacement went fine, with the exception that somewhere near the end of the resilver, both new drives threw a checksum error (a single one each). These first two replacements are on a Highpoint controller, so I figured hmm, maybe something got screwy with some cabling, or the controller had an issue or something.

I scrubbed the pool, which had no issues, then cleared the errors ('zpool clear').

Post-scrub, pre-clear zpool status:
Code:
[root@freenas] /mnt/bluemesa/media# zpool status -v
  pool: bluemesa
state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 35h49m with 0 errors on Sat Jan  5 05:31:19 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        bluemesa                                        ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/5dce97b0-0d3b-11e9-b5ad-7054d21693cc  ONLINE       0     0     1
            gptid/541ad802-0ec0-11e9-b79b-7054d21693cc  ONLINE       0     0     1
            gptid/ba00b169-ae44-11e2-a357-7054d21693cc  ONLINE       0     0     0
            gptid/ba756e6a-ae44-11e2-a357-7054d21693cc  ONLINE       0     0     0
            gptid/bae4d348-ae44-11e2-a357-7054d21693cc  ONLINE       0     0     0
            gptid/55892d30-c89c-11e2-9452-7054d21693cc  ONLINE       0     0     0

errors: No known data errors
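To spot which devices accumulated checksum errors without eyeballing the whole table, a small awk filter over the `zpool status` output works. This is my own sketch, not a FreeNAS tool; the heredoc just replays part of the capture above (in practice, pipe the real `zpool status` straight into the same awk):

```shell
# Filter 'zpool status' device lines for a nonzero CKSUM column ($5).
# Real usage: zpool status bluemesa | awk '$1 ~ /^gptid\// && $5 > 0'
zstatus=$(cat <<'EOF'
            gptid/5dce97b0-0d3b-11e9-b5ad-7054d21693cc  ONLINE       0     0     1
            gptid/541ad802-0ec0-11e9-b79b-7054d21693cc  ONLINE       0     0     1
            gptid/ba00b169-ae44-11e2-a357-7054d21693cc  ONLINE       0     0     0
EOF
)
# Print only the devices whose checksum counter is nonzero.
echo "$zstatus" | awk '$1 ~ /^gptid\// && $5 > 0 { print $1, "CKSUM =", $5 }'
```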


With the 3rd disk (now replacing a disk on the motherboard ports, ada2), something went a bit wonkier. The resilver was fine, but at the end of it, the array showed a small number of checksum errors at the pool level, with one corrupted file. I didn't save the console output for this one, but below you'll see the current status of the pool, which includes the checksum errors and damaged file.

As the file was in a snapshot, I decided I didn't care about it and would delete it later... I think this was potentially a mistake. I also didn't re-scrub the pool; seeing as it had already decided the damage was done and no other checksum errors were showing at the device level, I figured why bother (maybe another mistake...?).

So, I pressed forward, and offlined another disk (ada3, 4th replacement). The disk went offline, and at the same time, the pool started resilvering again immediately (I had not removed the drive yet, or powered down the server). All of a sudden, my pool had no redundancy (gulp).....

This is where I stand now (you can see the resilver in progress, with one drive offline...it's resilvering to ada2, the 3rd disk I'd replaced, on a motherboard port):

Code:
[root@freenas] ~# zpool status -v
  pool: bluemesa
state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Jan  9 22:53:22 2019
        6.37T scanned out of 10.2T at 83.4M/s, 13h26m to go
        1.06T resilvered, 62.33% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        bluemesa                                        DEGRADED     0     0     2
          raidz2-0                                      DEGRADED     0     0     4
            gptid/5dce97b0-0d3b-11e9-b5ad-7054d21693cc  ONLINE       0     0     0
            gptid/541ad802-0ec0-11e9-b79b-7054d21693cc  ONLINE       0     0     0
            gptid/ce054d2f-1118-11e9-9c8a-7054d21693cc  ONLINE       0     0     0  (resilvering)
            6099859449148961522                         OFFLINE      0     0     0  was /dev/gptid/ba756e6a-ae44-11e2-a357-7054d21693cc
            gptid/bae4d348-ae44-11e2-a357-7054d21693cc  ONLINE       0     0     0
            gptid/55892d30-c89c-11e2-9452-7054d21693cc  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        bluemesa/media@auto-20190105.0000-2w:/Video/Movies/Finding Nemo (2003)/Finding Nemo (2003).mkv


So, not sure what happened here. It kind of feels like either the pool was unhappy due to the preexisting error in the snapshotted file and/or the checksum errors at the pool level, or, for some reason, offlining ada3 momentarily bounced ada2. Either way, it feels like I'm toying with pool loss at this point, as there is no redundancy until this resilver completes. Any ideas? Did I make some kind of error in not deleting the file or clearing the error status of the pool before offlining a disk, or are there some other gremlins at play here?

My current plan is to wait until the resilver completes, and then I'll attempt to delete that snapshot. Does this make sense? I've seen some evidence in older forum posts of pools acting strangely with corruption in a snapshot, but it could all be anecdotal.

Should I scrub the pool at that point (even with a device missing)? I thought the resilver should basically touch all the data, so unless I misunderstand that point, I don't see a reason to; I'm thinking not...

Assuming I don't scrub, my plan is to delete that snapshot, do a zpool clear, and then shut down, pull out the offlined disk (ada3), and replace it. Hopefully at that point, I'm on 5 disks resilvering to the replaced one with no issues...

Really not sure what is going on here, but it does make me a bit nervous. Any input you all have would be appreciated. I am planning to upgrade to 11.2 when I get through this pool expansion; 9.3 has been solid for years for me, but it's past due even for my uses (light media serving, storage, local backups).

Thanks!

Camcontrol output:
Code:

[root@freenas] ~# camcontrol devlist
<WDC WD40EFRX-68N32N0 82.00A82>    at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD40EFRX-68N32N0 82.00A82>    at scbus1 target 0 lun 0 (pass1,ada1)
<WDC WD40EFRX-68N32N0 82.00A82>    at scbus2 target 0 lun 0 (pass2,ada2)
<WDC WD20EFRX-68AX9N0 80.00A80>    at scbus3 target 0 lun 0 (pass3,ada3)
<WDC WD20EFRX-68AX9N0 80.00A80>    at scbus4 target 0 lun 0 (pass4,ada4)
<WDC WD20EFRX-68AX9N0 80.00A80>    at scbus5 target 0 lun 0 (pass5,ada5)
<MUSHKIN MKNUFDMH8GB PMAP>         at scbus7 target 0 lun 0 (pass6,da0)


glabel status output:
Code:
[root@freenas] ~# glabel status
                                      Name  Status  Components
gptid/5dce97b0-0d3b-11e9-b5ad-7054d21693cc     N/A  ada0p2
gptid/541ad802-0ec0-11e9-b79b-7054d21693cc     N/A  ada1p2
gptid/ba756e6a-ae44-11e2-a357-7054d21693cc     N/A  ada3p2
gptid/bae4d348-ae44-11e2-a357-7054d21693cc     N/A  ada4p2
gptid/55892d30-c89c-11e2-9452-7054d21693cc     N/A  ada5p2
gptid/abc9b96a-99cc-11e4-8b6f-7054d21693cc     N/A  da0p1
gptid/ce054d2f-1118-11e9-9c8a-7054d21693cc     N/A  ada2p2
gptid/ba660cbb-ae44-11e2-a357-7054d21693cc     N/A  ada3p1
[root@freenas] ~# gpart status
  Name  Status  Components
ada0p1      OK  ada0
ada0p2      OK  ada0
ada1p1      OK  ada1
ada1p2      OK  ada1
ada3p1      OK  ada3
ada3p2      OK  ada3
ada4p1      OK  ada4
ada4p2      OK  ada4
ada5p1      OK  ada5
ada5p2      OK  ada5
da0p1      OK  da0
da0p2      OK  da0
ada2p1      OK  ada2
ada2p2      OK  ada2
[root@freenas] ~#


Specs:
Code:
Case:   Lian Li PC-Q25B
PSU:    SeaSonic SS-300ET (OEM)
MB:     Intel DBS1200KPR
RAM:    16GB Kingston ECC ValueRAM (KVR13E9K2/16I - 2x8GB kit)
CPU:    Intel Celeron G555
Flash:  Mushkin Mullholland 8GB (USB FreeNAS boot)
HDDs:   6x Western Digital 2TB Red (WD20EFRX)
HBA:    Highpoint Rocket 620 (2 ports)
HSF:    Noctua NH-L9i
Fan #1: Noctua NF-P14 FLX (Intake 140mm)
Fan #2: Noctua NF-S12B FLX (Outflow 120mm)
 

SkyMonkey

Well, the resilver completed, with the error still showing in the file and the 2/4 checksum errors.

I deleted the snapshot, which caused the filename in the zpool status -v output to be replaced by some metadata reference in brackets.

I then cleared the pool error status (zpool clear bluemesa).

And...ada2 started resilvering AGAIN immediately. Something does NOT seem right about this...any ideas?

Current status:
Code:
[root@freenas] ~# zpool status -v
  pool: bluemesa
state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Jan 11 11:23:11 2019
        65.9G scanned out of 10.2T at 141M/s, 20h57m to go
        11.0G resilvered, 0.63% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        bluemesa                                        DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            gptid/5dce97b0-0d3b-11e9-b5ad-7054d21693cc  ONLINE       0     0     0
            gptid/541ad802-0ec0-11e9-b79b-7054d21693cc  ONLINE       0     0     0
            gptid/ce054d2f-1118-11e9-9c8a-7054d21693cc  ONLINE       0     0     0  (resilvering)
            6099859449148961522                         OFFLINE      0     0     0  was /dev/gptid/ba756e6a-ae44-11e2-a357-7054d21693cc
            gptid/bae4d348-ae44-11e2-a357-7054d21693cc  ONLINE       0     0     0
            gptid/55892d30-c89c-11e2-9452-7054d21693cc  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <0x126>:<0x2b358>
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Metadata corruption is a bad sign. You can try deleting the file, which might fix it, but I don't like the fact that it happened at all.

These first two replacements are on a highpoint controller
That might very well be the cause of your problem. Definitely get rid of it ASAP and replace it with an LSI SAS HBA.
 

SkyMonkey

Ok, so the scrub that occurred right after I did the zpool clear has now completed. You can see that it found the same file corrupted again, but in the next snapshot, which makes sense (given what I understand about snapshots, at least). The "<0x126>:<0x2b358>" reference disappeared when the scrub completed; for a while, both were shown together. I'm not sure if this was metadata, or just some kind of reference to the file that persisted after the deletion of the snapshot until the resilver completed...

Code:
[root@freenas] ~# zpool status -v
  pool: bluemesa
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 1.65T in 32h49m with 1 errors on Sat Jan 12 20:12:38 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        bluemesa                                        DEGRADED     0     0     1
          raidz2-0                                      DEGRADED     0     0     2
            gptid/5dce97b0-0d3b-11e9-b5ad-7054d21693cc  ONLINE       0     0     0
            gptid/541ad802-0ec0-11e9-b79b-7054d21693cc  ONLINE       0     0     0
            gptid/ce054d2f-1118-11e9-9c8a-7054d21693cc  ONLINE       0     0     0
            6099859449148961522                         OFFLINE      0     0     0  was /dev/gptid/ba756e6a-ae44-11e2-a357-7054d21693cc
            gptid/bae4d348-ae44-11e2-a357-7054d21693cc  ONLINE       0     0     0
            gptid/55892d30-c89c-11e2-9452-7054d21693cc  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        bluemesa/media@auto-20190106.0000-2w:/Video/Movies/Finding Nemo (2003)/Finding Nemo (2003).mkv


The system is running now, though I'm seeing some odd non-uniform disk behavior in the GUI that seems... not quite right either. ada2, the disk which finished resilvering, is missing many of the reads that are happening across the pool right now (which is playing some video files, among other things). I am guessing this is because the system has been serving the same video files for a while, and maybe it was already set up to read from the ARC/those disks before the resilver completed? ada0 and ada5 look the same as 1 and 4. ada3 is the currently offlined disk. Kind of guessing here; I'll see if the behavior persists. Is there any post-resilver zfs/freenas activity of any kind that would be expected to cause this read imbalance? I'm used to seeing writes and reads uniform across all disks pretty much at all times.

[attachment: per-disk activity graphs showing reduced reads on ada2]


My current plan is to remove some data from the pool (just to make resilvering faster, really), then blow away all the snapshots that have that bad file in them. Then, if everything is looking good tomorrow, I'll shut down and replace the currently offlined ada3 drive. *fingers crossed*

I think I'll run a short and long test of ada2 overnight as well.

Ericloewe, is that Highpoint Rocket 620 a part with known issues? I recall some Highpoint controllers being poorly regarded, but I did a ton of research back when I built this system, and I'd be surprised if I picked one that had known issues at the time.

Thanks!
 

SkyMonkey

Yeah, something still seems goofy with ada2. Started a big transfer to a desktop machine; ada2 is not participating in array reads at all. gstat output agrees, showing no reads at all.

[attachment: per-disk activity graphs showing no reads on ada2]
 

Ericloewe

All Highpoints are to be avoided. Not a single one of them has proven reliable in FreeNAS in all these years, even when they specifically tried.
 

SkyMonkey

Update

After testing ada2 overnight (short and long SMART tests, both passed), I decided to shut down the system and install a new disk into ada3, which has been offline since this saga started. I blew away all the snapshots with that corrupt file in them before doing that.
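For anyone doing the same cleanup, here's roughly how I'd script the snapshot removal as a dry run first. The snapshot names below beyond the two seen in zpool status are hypothetical; in practice you'd feed the loop the real listing from `zfs list`:

```shell
# Dry-run sketch: print the destroy commands instead of running them.
# Replace the heredoc with the real listing:
#   zfs list -H -r -t snapshot -o name bluemesa/media
snapshots=$(cat <<'EOF'
bluemesa/media@auto-20190105.0000-2w
bluemesa/media@auto-20190106.0000-2w
bluemesa/media@auto-20190107.0000-2w
EOF
)
for snap in $snapshots; do
    echo "zfs destroy $snap"    # drop the 'echo' to actually destroy
done
```

Printing the commands first lets you eyeball exactly which snapshots will go before committing, since zfs destroy on a snapshot is irreversible.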

After rebooting, ada2 started resilvering again immediately. I then replaced ada3 with the newly installed drive, and it also started resilvering. So, two disks resilvered for just over 33 hours. At this point, the pool appears healthy, but it still looks like the array isn't using ada2 or ada3 (the newly resilvered disks) for any major read activity. I can hear lots of activity on the array, but I have no idea if this is just the newer drives being louder and so I'm hearing what I couldn't before, or something else.

After the resilver, it showed that it found the same corrupted file, but this time it's the actual file, not the snapshotted version. So I deleted that file as well and will restore it from backup at a later point.

Current status:

Code:
[root@freenas] /mnt/bluemesa/media# zpool status -v
  pool: bluemesa
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 3.16T in 33h33m with 1 errors on Mon Jan 14 21:42:19 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        bluemesa                                        ONLINE       0     0     1
          raidz2-0                                      ONLINE       0     0     2
            gptid/5dce97b0-0d3b-11e9-b5ad-7054d21693cc  ONLINE       0     0     0
            gptid/541ad802-0ec0-11e9-b79b-7054d21693cc  ONLINE       0     0     0
            gptid/ce054d2f-1118-11e9-9c8a-7054d21693cc  ONLINE       0     0     0
            gptid/ef455b0d-176e-11e9-906e-7054d21693cc  ONLINE       0     0     0
            gptid/bae4d348-ae44-11e2-a357-7054d21693cc  ONLINE       0     0     0
            gptid/55892d30-c89c-11e2-9452-7054d21693cc  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        bluemesa/media:<0x2b358>


[attachment: per-disk activity graphs after the double resilver]


My current plan is to leave things for 24H or so to see how things perform, then run a scrub if it looks like ada2/3 are still not participating in array reads.

Any input anyone has would be welcome!
 

SkyMonkey

Ok, so after the above, I wrote some new data to the pool and read it back. ada2/3 did receive this data and read it back. I then read existing data off of the pool, and ada2/3 showed NO read activity.

I kicked off a scrub, and ada2/3 are not being read from at all...

Is this in any way expected behavior????

Current status:
Code:

[root@freenas] ~# zpool status -v
  pool: bluemesa
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Tue Jan 15 08:47:09 2019
        452G scanned out of 9.99T at 255M/s, 10h53m to go
        0 repaired, 4.42% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        bluemesa                                        ONLINE       0     0     1
          raidz2-0                                      ONLINE       0     0     2
            gptid/5dce97b0-0d3b-11e9-b5ad-7054d21693cc  ONLINE       0     0     0
            gptid/541ad802-0ec0-11e9-b79b-7054d21693cc  ONLINE       0     0     0
            gptid/ce054d2f-1118-11e9-9c8a-7054d21693cc  ONLINE       0     0     0
            gptid/ef455b0d-176e-11e9-906e-7054d21693cc  ONLINE       0     0     0
            gptid/bae4d348-ae44-11e2-a357-7054d21693cc  ONLINE       0     0     0
            gptid/55892d30-c89c-11e2-9452-7054d21693cc  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        bluemesa/media:<0x2b358>

[attachment: per-disk activity graphs during the scrub, ada2/3 idle]
 

Ericloewe

I kicked off a scrub, and ada2/3 are not being read from at all...

Is this in any way expected behavior????
No, not if they're part of the pool being scrubbed.

Well, technically, if they had been added to the pool later as a new vdev they would probably see less activity, at least at first.
 

SkyMonkey

Update for anyone following or having similar issues in the future.

ada2/3 never really did much through that scrub. However, after the scrub completed, they were fully participating in array reads and writes, even of old files. The zpool status at the end of the scrub still showed the old checksum errors at the pool/vdev level, but no known data errors (since I had deleted the corrupted file).

I did a zpool clear, and the array seemed to be fine, no new resilvering or anything. I scrubbed again, this time all disks showed read activity throughout the scrub, and it completed with no errors/repairs. I rebooted, and the system came back up just fine, no resilvering restarted or anything.
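As a quick sanity check after a clear/scrub cycle like this, `zpool status -x` is handy: it prints a single "all pools are healthy" line when nothing needs attention. A minimal sketch (the heredoc stands in for the real command output, so this runs anywhere):

```shell
# Health-check sketch: 'zpool status -x' prints one line when all is well.
# Real usage: status=$(zpool status -x)
status=$(cat <<'EOF'
all pools are healthy
EOF
)
if [ "$status" = "all pools are healthy" ]; then
    echo "pool OK"
else
    echo "ATTENTION:"
    echo "$status"
fi
```

Something like this is easy to drop into a cron job or daily report so a lingering DEGRADED state doesn't go unnoticed.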

I then offlined the next disk in my array replacement plan (ada4), and replaced it; it's now resilvering, and everything seems fine.

I am still not 100% sure what was going on, but something about the presence of that corrupted file and/or the checksum errors at the array level was preventing the resilver from fully completing until the file was removed and the array scrubbed. In my research, I did see some anecdotal statements that ZFS will not/cannot fully complete a resilver when there are pool- or vdev-level errors on the array, so that could have been the issue, and deleting that corrupted file and scrubbing made those new disks fully trusted.

In any case, things are looking pretty much back to normal. Assuming the last two disk replacements go well, I'll be upgrading to freenas 11.2 and getting a replacement hba for this highpoint controller. Looks like the LSI SAS2008 or SAS2308 based cards are still the go-to recommendation there, yes?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Looks like the LSI SAS2008 or SAS2308 based cards are still the go-to recommendation there, yes?
The SAS3008 cards are a safe choice too.
 

SkyMonkey

Final update (a few weeks late):

Last two drive replacements went fine, and the array is running normally at the expanded size.

Must have been some kind of persistent issue due to the checksum errors that occurred when resilvering ada2. Once I'd cleared those and the bad files, and allowed things to scrub, everything was back to normal.
 

Matt_G

Explorer
Joined
Jan 24, 2016
Messages
65
Thanks for going to the trouble of documenting all this.
 