Very slow disk replacement/resilver (RAIDZ2)

Status
Not open for further replies.

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
Hey all! I had a disk that was showing some offline uncorrectable sectors. I ran the long smartctl test and it failed, so I RMA'd it and have replaced it in my pool. I used the GUI, and followed the instructions for my version (9.2.1.3 - I want to upgrade, but I want a healthy pool first).

Things started off reasonably well - scanning at around 100-120 MB/s, looking at about 31 hours to finish the resilver. However, in the ensuing hours, it has slowed to a near standstill (scanning a couple MB/s, looking at 130+ hours to complete).

So where should I start looking to troubleshoot this?

Code:
[root@delta] ~# zpool status
  pool: tank
state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jul 15 16:21:21 2014
        474G scanned out of 12.6T at 26.9M/s, 130h57m to go
        78.8G resilvered, 3.68% done
config:

    NAME                                              STATE     READ WRITE CKSUM
    tank                                              DEGRADED     0     0     0
      raidz2-0                                        DEGRADED     0     0     0
        gptid/f645c671-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
        gptid/f6959517-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
        gptid/f6e96295-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
        replacing-3                                   OFFLINE      0     0     0
          14595285845210114515                        OFFLINE      0     0     0  was /dev/gptid/f7547512-dfe0-11e2-b96b-50465d6afb74
          gptid/95942e35-0c5d-11e4-934c-50465d6afb74  ONLINE       0     0     0  (resilvering)
        gptid/f7c2859e-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
        gptid/f8333b58-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
    cache
      gptid/7cddd7fe-bb49-11e3-b237-50465d6afb74      ONLINE       0     0     0

errors: No known data errors
[root@delta] ~#
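As a rough sanity check on the "to go" figure in that status output, the estimate can be reproduced from the three numbers on the scan line. This is only a sketch: the helper names are made up for illustration, and it assumes zpool's K/M/G/T suffixes are binary units and that the printed rate is an average since the resilver started, not the current speed.

```python
# Rough check of the "to go" estimate printed by zpool status.
# Assumption: zpool's size suffixes are binary units (1G = 1024**3 bytes).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(text):
    """Convert a zpool-style size such as '474G' or '12.6T' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def hours_remaining(scanned, total, rate_per_sec):
    """Hours left if the average scan rate so far holds for the rest of the pool."""
    remaining = to_bytes(total) - to_bytes(scanned)
    return remaining / to_bytes(rate_per_sec) / 3600


# Values copied from the scan line: 474G scanned out of 12.6T at 26.9M/s.
eta = hours_remaining("474G", "12.6T", "26.9M")
print(f"{eta:.0f} hours to go")
```

The result lands within an hour or so of the 130h57m that zpool reports; the gap is most likely rounding in the displayed sizes. If the rate really is an average, the estimate lags behind any change in the current scan speed.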


Here's the systat -vm output. You can see that each disk is reading <1MB/s.

Code:
    2 users    Load  0.03  0.07  0.07                  Jul 15 21:28

Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
        Tot   Share      Tot    Share    Free           in   out     in   out
Act  302592   29992  2808616    48016  943592  count
All 6187856   32516 1076965k    84876          pages
Proc:                                                            Interrupts
  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        cow    1549 total
          1  76      6202   80 1479 1549 1031             zfod        attimer0 0
                                                          ozfod    35 ata1 15
0.4%Sys   0.0%Intr  0.0%User  0.0%Nice 99.6%Idle        %ozfod     2 ohci0 ohci
|    |    |    |    |    |    |    |    |    |    |       daefr       ehci0 17
                                                          prcfr     4 atapci0+++
                                           dtbuf        2 totfr   432 em0 20
Namei     Name-cache   Dir-cache    202416 desvn          react    67 ahci1 22
   Calls    hits   %    hits   %     17586 numvn          pdwak  1009 hpet0:t0
    4431    4430 100                  7995 frevn          pdpgs
                                                          intrn
Disks   md0   md1   md2  ada0  ada1  ada2  ada3   6369864 wire
KB/t   0.00  0.00  0.00 75.61 35.57 35.24 35.70    298812 act
tps       0     0     3     4    19    19    19    228628 inact
MB/s   0.00  0.00  0.00  0.28  0.65  0.65  0.65           cache
%busy     0     0     0     0    13    13    13    943592 free
                                                   155396 buf


Any pointers?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If it's still going, I'd let it go on. But you might have another bad drive.
 

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
Yeah, it's going, so I'm just letting it ride. The scan speed seems to have evened out at about 30 MB/s. Still seems almost comically slow - 5 days to resilver a 3TB disk? - but I'm a patient man who still has redundancy in the pool (and a UPS), so I'll be okay.

smartctl reports no other errors on any disk, and the error counters and /var/log/messages are clean, so if there's another bad drive, it's hiding well enough that I can't break through its stealth.

Code:
[root@delta] ~# zpool status
  pool: tank
state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jul 15 16:21:21 2014
        1.86T scanned out of 12.6T at 30.4M/s, 102h41m to go
        316G resilvered, 14.77% done
config:

        NAME                                              STATE     READ WRITE CKSUM
        tank                                              DEGRADED     0     0     0
          raidz2-0                                        DEGRADED     0     0     0
            gptid/f645c671-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
            gptid/f6959517-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
            gptid/f6e96295-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
            replacing-3                                   OFFLINE      0     0     0
              14595285845210114515                        OFFLINE      0     0     0  was /dev/gptid/f7547512-dfe0-11e2-b96b-50465d6afb74
              gptid/95942e35-0c5d-11e4-934c-50465d6afb74  ONLINE       0     0     0  (resilvering)
            gptid/f7c2859e-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
            gptid/f8333b58-dfe0-11e2-b96b-50465d6afb74    ONLINE       0     0     0
        cache
          gptid/7cddd7fe-bb49-11e3-b237-50465d6afb74      ONLINE       0     0     0

errors: No known data errors
[root@delta] ~#
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Do you have SMART tests scheduled? If not (you should), you'll definitely want to run a long test on all the other drives once this is over.
 

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
Well, I was totally wrong. It didn't take 5 days. It only took 4 days, 4 hours, and 43 minutes.

Code:
[root@delta] ~# zpool status
  pool: tank
state: ONLINE
  scan: resilvered 2.09T in 100h43m with 0 errors on Sat Jul 19 21:05:18 2014
config:

    NAME                                            STATE     READ WRITE CKSUM
    tank                                            ONLINE       0     0     0
      raidz2-0                                      ONLINE       0     0     0
        gptid/f645c671-dfe0-11e2-b96b-50465d6afb74  ONLINE       0     0     0
        gptid/f6959517-dfe0-11e2-b96b-50465d6afb74  ONLINE       0     0     0
        gptid/f6e96295-dfe0-11e2-b96b-50465d6afb74  ONLINE       0     0     0
        gptid/95942e35-0c5d-11e4-934c-50465d6afb74  ONLINE       0     0     0
        gptid/f7c2859e-dfe0-11e2-b96b-50465d6afb74  ONLINE       0     0     0
        gptid/f8333b58-dfe0-11e2-b96b-50465d6afb74  ONLINE       0     0     0
    cache
      gptid/7cddd7fe-bb49-11e3-b237-50465d6afb74    ONLINE       0     0     0

errors: No known data errors
[root@delta] ~#


And now I'll run the SMART tests on all the other drives, and tomorrow upgrade to 9.2.1.6. Wahoo.
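For anyone scripting the same step, here is a minimal sketch that builds one `smartctl -t long` command per disk. The device names are hypothetical placeholders, not the actual disks in this pool, and the sketch only prints the commands; it does not start any tests.

```python
def long_test_commands(devices):
    """Build one 'smartctl -t long' invocation per device path."""
    return [["smartctl", "-t", "long", device] for device in devices]


# Hypothetical device names; substitute the pool's actual disks.
devices = [f"/dev/ada{n}" for n in range(4)]

for argv in long_test_commands(devices):
    print(" ".join(argv))
    # To actually start the test: subprocess.run(argv, check=True)
```

Long tests run in the background on the drive itself, so the results have to be read back later with `smartctl -a` on each device.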
 

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
Followup on this: Per the thread I posted in the Performance forum, I had two of my drives attached to a bus configured as legacy IDE instead of AHCI - a situation I have now fixed. As (bad) luck would have it, I had another drive failure today, so I get to replicate this resilver experiment. I'm now seeing pretty much the same behavior - started off super fast, but soon slowed to around 25-30MB/s scan speed.

gstat shows that the target drive is the problem - it simply can't maintain the same write speed as the rest of the pool can read:

Code:
dT: 10.001s  w: 10.000s  filter: gpt
L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0| gptid/4c6e0af2-1083-11e4-8b26-50465d6afb74
    0    142    137   6294    1.1      5     56    0.3    9.3| gptid/f645c671-dfe0-11e2-b96b-50465d6afb74
    0    140    135   6284    1.3      5     57    0.3    9.6| gptid/f6959517-dfe0-11e2-b96b-50465d6afb74
    0    135    130   6291    1.3      4     57    0.3    9.8| gptid/f6e96295-dfe0-11e2-b96b-50465d6afb74
    0    119    114   6299    2.5      5     56    3.1   17.0| gptid/95942e35-0c5d-11e4-934c-50465d6afb74
    0    107    101   6329    3.1      5     53    1.8   18.1| gptid/f8333b58-dfe0-11e2-b96b-50465d6afb74
    1     90      0      0    0.0     89   6004    8.9   93.9| gptid/3fddbb0d-23fe-11e4-b26d-50465d6afb74


89 write IOPS (@64KB) is about what I would expect based on benchmarks, but for some reason this particular PCB assembly doesn't support NCQ, which would definitely impact write performance. Not sure why the operation would start off at such breakneck speeds and then drop off a cliff, but maybe the data is more sequential in the early part of the filesystem? This seems like an area where traditional RAID would kick ZFS's ass, since its rebuild operations are purely sequential (on a quiescent array).
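To put numbers on the gstat sample above, here is a small sketch that adds up the read rate of the five source disks and works out the average write size on the replacement disk. The figures are copied from that sample; treating the all-zero first row as not part of the resilver is an assumption.

```python
# Figures copied from the gstat sample: (reads/s, read kBps, writes/s, write kBps).
source_disks = [
    (137, 6294, 5, 56),
    (135, 6284, 5, 57),
    (130, 6291, 4, 57),
    (114, 6299, 5, 56),
    (101, 6329, 5, 53),
]
target_disk = (0, 0, 89, 6004)

# Combined read rate of the five surviving disks, in MB/s.
total_read_mb = sum(disk[1] for disk in source_disks) / 1024

# Average size of each write landing on the replacement disk, in KB.
avg_write_kb = target_disk[3] / target_disk[2]

print(f"pool read rate: {total_read_mb:.1f} MB/s")
print(f"target write size: {avg_write_kb:.1f} KB per op")
```

The combined read rate comes out close to the 25-30 MB/s scan speed, and the write size is close to the 64KB figure, which fits the picture of the target drive being the limit at roughly 89 writes per second.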

My takeaway is this: consumer drives are not great performers, and can be pathological in the right circumstances. Also, just because a drive in a factory external enclosure has the same model number as the bare retail drive doesn't mean they're actually the same!
 

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
Final result - this time was indeed faster. Bottom line, the resilver in July ran at ~21.25 GB/hr (100.75 hours for 2140 GB), and this one ran at ~24.6 GB/hr (79.1 hours for 1945 GB). That's an incremental improvement, of which I approve, but it's still pretty durn slow on average. Oh well, that's what RAIDZ2 is for, right?

Code:
[root@delta] ~# zpool status
  pool: tank
state: ONLINE
  scan: resilvered 1.90T in 79h8m with 0 errors on Mon Aug 18 01:07:34 2014
config:

    NAME                                            STATE     READ WRITE CKSUM
    tank                                            ONLINE       0     0     0
      raidz2-0                                      ONLINE       0     0     0
        gptid/f645c671-dfe0-11e2-b96b-50465d6afb74  ONLINE       0     0     0
        gptid/f6959517-dfe0-11e2-b96b-50465d6afb74  ONLINE       0     0     0
        gptid/f6e96295-dfe0-11e2-b96b-50465d6afb74  ONLINE       0     0     0
        gptid/95942e35-0c5d-11e4-934c-50465d6afb74  ONLINE       0     0     0
        gptid/3fddbb0d-23fe-11e4-b26d-50465d6afb74  ONLINE       0     0     0
        gptid/f8333b58-dfe0-11e2-b96b-50465d6afb74  ONLINE       0     0     0

errors: No known data errors
[root@delta] ~#
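The GB/hr figures can be recomputed straight from the two "resilvered" scan lines. A minimal sketch, assuming sizes are reported in T with 1T = 1024G (the function name is made up for illustration):

```python
import re


def resilver_rate(scan_line):
    """Return (gigabytes, hours, GB per hour) from a 'resilvered X in HhMm' line."""
    # Assumption: the size is always given in T, as in both lines from this thread.
    match = re.search(r"resilvered ([\d.]+)T in (\d+)h(\d+)m", scan_line)
    if match is None:
        raise ValueError("unrecognized scan line")
    gigabytes = float(match.group(1)) * 1024
    hours = int(match.group(2)) + int(match.group(3)) / 60
    return gigabytes, hours, gigabytes / hours


july = resilver_rate(
    "scan: resilvered 2.09T in 100h43m with 0 errors on Sat Jul 19 21:05:18 2014"
)
august = resilver_rate(
    "scan: resilvered 1.90T in 79h8m with 0 errors on Mon Aug 18 01:07:34 2014"
)

for label, (gb, hours, rate) in (("July", july), ("August", august)):
    print(f"{label}: {gb:.0f} GB in {hours:.1f} h = {rate:.2f} GB/hr")
```

Both results agree with the ~21.25 and ~24.6 GB/hr quoted above, to within rounding.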
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
What are your hardware specs? That seems horribly slow.
 

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
It's all consumer stuff:

Motherboard: Asus M5A78L-M LX+
CPU: AMD Phenom II X6 1045T (6-core 2.7 GHz)
RAM: 16GB G.Skill DDR3-1600
HDD: RAIDZ2 with 6x Seagate ST3000DM001 (3 support NCQ, 3 do not, due to different PCB assemblies)
NIC: Intel EXPI9301CT (PCIe 1GbE)
OS: FreeNAS 9.2.1.7, bare metal

I get scrub speeds (without resilvering) of 300 MB/s, and I can sustain 60-65 MB/s over NFS and CIFS (limited by the client side, I believe - I get similar results from an SSD stripe - it's fast enough, so I haven't really dug into it).
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Ah. While I wouldn't expect your consumer stuff to cause your problems, I won't lie either: I rapidly lost interest because it's AMD, and AMD hardware can be quirky with FreeBSD and FreeNAS. In fact, I don't know if I could say with reasonable certainty that your issues aren't completely normal for your hardware. If you were on a Supermicro X9SCM like me, I'd think that something is wrong, though. Going with AMD turns a situation that is almost certainly abnormal into "possibly normal for the hardware".
 

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
Got any references for the AMD weirdness? I'd like to learn more about that.
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
Just do a Google search for "site:forums.freenas.org cyberjock amd tirade" (without the quotes).
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
LOL gpsguy. That's not too far from the truth. But usually those threads start with "So I got this AMD that won't boot" or similar so it's not like I'm making accusations that aren't backed up by 100+ threads in the forum.
 

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
Okay, that's cool, but it sounds like you've got some definitive information about AMD CPU performance and FreeBSD somewhere - can you point me to that? I did a couple quick Google searches and didn't come up with much. My stuff boots just fine, but I'm curious now about the potential resilver performance linkage to the CPU.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
I highly doubt it's a CPU architecture issue, honestly, unless the ZFS resilvering process is designed to lean heavily on Intel-specific instructions. And even then, that would probably only visibly come into play if you were CPU bound. Check your CPU usage; it's most likely barely breaking a sweat. It's probably something else in your setup, though I'm not sure exactly what.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
deafen said:
Okay, that's cool, but it sounds like you've got some definitive information about AMD CPU performance and FreeBSD somewhere - can you point me to that? I did a couple quick Google searches and didn't come up with much. My stuff boots just fine, but I'm curious now about the potential resilver performance linkage to the CPU.


I'm sorry, I won't rehash AMD today, but here are two posts from a while back that explain it in as much detail as I feel like explaining in the forums:

http://forums.freenas.org/index.php?threads/tp-link-tg-3269-any-advice.17311/#post-91707

http://forums.freenas.org/index.php...-support-on-msi-e350ia-e44.20445/#post-116945
 

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
Oh, okay. I can't argue with your thoughts on relative investment and developer community size between Intel and AMD, but you made a pretty definitive statement about AMD CPUs being "quirky" with FreeBSD. Since that was so definitive, I thought you might have some kind of a reference or cite (other than yourself) to describe what those quirks are.

Not asking you to rehash at all! I read those posts, and I get where you're coming from. Just wanted more detail on the "quirks" you mentioned, that's all, and how they might be germane to my issue. But that's cool, I'll see if I can find it on my own.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yeah, I wish I had more solid details. I know the FreeBSD forums have lots of the same problems. And it's a situation where everyone just says "not something we can fix", as AMD (and the other hardware on AMD boards) is generally purchased from companies that are less than responsive to requests for information from open source communities.
 

deafen

Explorer
Joined
Jan 11, 2014
Messages
71
Whattteva said:
I highly doubt it's a CPU architecture issue, honestly, unless the ZFS resilvering process is designed to lean heavily on Intel-specific instructions. And even then, that would probably only visibly come into play if you were CPU bound. Check your CPU usage; it's most likely barely breaking a sweat. It's probably something else in your setup, though I'm not sure exactly what.
That was my thought as well, but for a minute it sounded like there might be something specific. Oh well. In any case, it's done now, and performance is otherwise acceptable, so I'm just going to live with it. Bigger fish to fry. (The fact that I've got dual parity makes me a lot calmer about it.)
 

Starpulkka

Contributor
Joined
Apr 9, 2013
Messages
179
Looks like a hardware problem.
I replaced one faulted drive in a raidz2-0 vdev; zpool showed 5.91T at 510M/s with lz compression on, and the resilver only took a few hours to complete. Yes, I know it's slow as hell. But what can you expect from AMD, right? (I've currently bought Intel stuff and IBM RAID cards, but it looks like I put my hands in * again going with Intel. I'll probably ask for help in this forum later.)
 