ZFS scrub estimates it has 11873 hours left

Status
Not open for further replies.

MrBerns

Cadet
Joined
Jan 30, 2012
Messages
9
Code:
# zpool status
  pool: tank
 state: ONLINE
 scrub: scrub in progress for 1h37m, 0.01% done, 11873h0m to go
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
            ada1p2  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada3p2  ONLINE       0     0     0
            ada4p2  ONLINE       0     0     0
            ada5p2  ONLINE       0     0     0  9.19M repaired

# iostat -x
                        extended device statistics
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
da0        0.9   1.5    10.2     6.4    0  82.4   0
ada0       0.5   0.4    38.1     2.4    0  25.7   1
ada1       0.5   0.3    36.6     2.2    0  26.6   1
ada2       0.5   0.3    38.7     1.8    0  22.8   1
ada3       0.5   0.3    39.6     2.1    0  23.0   1
ada4       0.3   0.4    12.5     2.4    0  26.8   1
ada5       0.3   0.4    12.7     4.3    6 9780.7  99
md0        0.0   0.5     0.1     1.4    0   0.1   0
md1        0.0   0.0     0.1     0.0    0   0.0   0
md2        0.0   3.2     0.1    14.4    0   0.2   0
pass0      0.0   0.0     0.0     0.0    0   0.0   0
pass1      0.0   0.0     0.0     0.0    0   0.0   0
pass2      0.0   0.0     0.0     0.0    0   0.0   0
pass3      0.0   0.0     0.0     0.0    0   0.0   0
pass4      0.0   0.0     0.0     0.0    0   0.0   0
pass5      0.0   0.0     0.0     0.0    0   0.0   0
pass6      0.0   0.0     0.0     0.0    0   0.0   0

# vmstat
 procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 ad0   in   sy   cs us sy id
 0 0 0    786M    14G   231   0   0   1   259   0   0   0   34  928  787  1  0 99


Output from top:
Code:
last pid:  3854;  load averages:  0.00,  0.01,  0.00                                                                                         up 0+01:48:44  15:08:10
48 processes:  1 running, 47 sleeping
CPU:  0.1% user,  0.0% nice,  0.2% system,  0.0% interrupt, 99.7% idle
Mem: 133M Active, 37M Inact, 967M Wired, 468K Cache, 163M Buf, 14G Free
Swap: 12G Total, 12G Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1816 root          1  44    0  6772K  1248K select  0   0:09  0.00% powerd
 2184 root          7  44    0 66836K  9372K ucond   2   0:07  0.00% collectd
 2062 root          6  45    0   166M 98420K uwait   1   0:05  0.00% python
 2213 www           4  44    0 28272K 12016K nanslp  0   0:03  0.00% transmission-daemon
 2605 www           1  44    0 19328K  4036K kqread  1   0:01  0.00% lighttpd
 1808 root          1  44    0 11780K  2740K select  0   0:00  0.00% ntpd
 1601 root          1  44    0 39208K  6404K select  0   0:00  0.00% nmbd
 2674 root          1  44    0 67232K 25484K wait    0   0:00  0.00% python
 2326 root          1  50    0  7832K  1500K nanslp  0   0:00  0.00% cron
 1998 root          1  44    0 16088K  4540K select  2   0:00  0.00% proftpd
 1482 root          1  44    0 46788K  9396K select  0   0:00  0.00% smbd
 3564 mbernard      1  44    0 33304K  5104K select  1   0:00  0.00% sshd
 3840 root          1  44    0  9224K  2372K CPU2    2   0:00  0.00% top
 1262 root          1  44    0  6904K  1460K select  0   0:00  0.00% syslogd
 3608 root          1  44    0 10172K  3012K pause   2   0:00  0.00% csh
 3562 root          1  49    0 33304K  5068K sbwait  0   0:00  0.00% sshd
  944 root          1  44    0  3204K   716K select  2   0:00  0.00% devd

And top in I/O mode (top -m io):
Code:
3;  load averages:  0.01,  0.02,  0.00                                                                                         up 0+01:47:39  15:07:05
48 processes:  1 running, 47 sleeping
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 133M Active, 37M Inact, 967M Wired, 468K Cache, 163M Buf, 14G Free
Swap: 12G Total, 12G Free

  PID USERNAME     VCSW  IVCSW   READ  WRITE  FAULT  TOTAL PERCENT COMMAND
 1816 root           18      0      0      0      0      0   0.00% powerd
 2184 root            0      0      0      0      0      0   0.00% collectd
 2062 root            2      0      0      0      0      0   0.00% python
 2213 www            42      0      0      0      0      0   0.00% transmission-daemon
 2605 www             2      0      0      0      0      0   0.00% lighttpd
 1808 root            2      0      0      0      0      0   0.00% ntpd
 1601 root            0      0      0      0      0      0   0.00% nmbd
 2674 root            0      0      0      0      0      0   0.00% python
 2326 root            0      0      0      0      0      0   0.00% cron
 1998 root            0      0      0      0      0      0   0.00% proftpd
 1482 root            0      0      0      0      0      0   0.00% smbd
 3564 mbernard        1      0      0      0      0      0   0.00% sshd
 1262 root            0      0      0      0      0      0   0.00% syslogd
 3608 root            0      0      0      0      0      0   0.00% csh
 3562 root            0      0      0      0      0      0   0.00% sshd
  944 root            0      0      0      0      0      0   0.00% devd
 3565 mbernard        0      0      0      0      0      0   0.00% bash



I'm fairly certain that a ZFS scrub should not take until October 2013 to complete. I'm a little concerned by the iostat output for ada5, but I don't know if its stats look like that because of the scrub, or if they're an indication of some problem with the drive (and hence why the scrub is taking forever).
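For what it's worth, the "to go" figure is just a straight-line extrapolation of elapsed time against percent complete. Plugging in the (rounded) numbers from the zpool status output above gives a figure in the same ballpark, so the estimate itself isn't buggy, the scrub really is crawling:

```shell
# Sanity-check ZFS's ETA by straight-line extrapolation.
# Inputs are the rounded figures zpool status printed, so the result
# only roughly matches the 11873h ZFS reported internally.
elapsed_min=97   # scrub in progress for 1h37m
pct_done=0.01    # 0.01% done, as printed (rounded)
awk -v e="$elapsed_min" -v p="$pct_done" \
    'BEGIN { printf "%.0f hours to go\n", (e / 60) * (100 - p) / p }'
```

Either way it comes out to a five-digit number of hours, i.e. at the current rate the scrub will effectively never finish, which points at whatever is throttling the pool.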

For what it's worth, I definitely do not believe I am CPU bound, and I have 16 GB RAM that I don't think ever gets fully utilized. I think my disks are the weakest link in my system.

Any idea what might be wrong, or what I can do to get to the root of the issue?
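One way to dig into the root cause, assuming smartmontools is available on the box: pull the SMART data for the drive that iostat flags as 99% busy and look at the sector-health counters. The egrep pattern here is just a reading convenience, not anything official; attribute names vary by vendor.

```shell
# Check SMART health for the suspect drive (ada5 per the iostat output).
# Non-zero Reallocated_Sector_Ct or Current_Pending_Sector values are
# the classic signs of a dying disk.
smartctl -a /dev/ada5 | egrep -i 'overall-health|reallocated|pending|uncorrect'

# Optionally queue a short self-test, then read the result log a few
# minutes later:
smartctl -t short /dev/ada5
smartctl -l selftest /dev/ada5
```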
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
I'd say disk ada5 is bad, or at least has a serious problem; replace it and that should solve your problem. You'll need to abort the scrub first.

From the command line:
Code:
zpool scrub -s tank
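Rough sketch of the whole sequence, with device names taken from the thread; the replacement partition name (ada6p2 here) is a placeholder, use whatever your new disk's matching partition ends up being:

```shell
POOL=tank
BAD=ada5p2      # failing member shown in zpool status
NEW=ada6p2      # hypothetical partition on the replacement disk

zpool scrub -s "$POOL"          # stop the running scrub first
zpool offline "$POOL" "$BAD"    # take the suspect disk out of service
# ...physically swap the drive and partition it to match the others...
zpool replace "$POOL" "$BAD" "$NEW"
zpool status "$POOL"            # resilver progress shows up here
```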
 