Scrub Question

Status
Not open for further replies.

raidflex

Guru
Joined
Mar 14, 2012
Messages
531
I just set up a FreeNAS 8.2 server recently and initiated a disk scrub yesterday. The scrub has now been running for 16 hours and still says it has 28 hours left. I don't know if this is a normal time frame, but almost 40 hours seems quite excessive for a scrub; I figured it would be more like 20-24 hours. I'm just wondering what kind of scrub times other people are seeing. I know that hardware and array size will affect this, but I'm just looking for some ballpark figures.

FreeNAS Server:

FreeNAS 8.2 Beta 2
Intel Celeron G530 Sandy Bridge CPU
Intel H67 motherboard
16GB DDR3 memory
8x 2TB Seagate Green drives in RAIDZ2
Areca 1220 RAID card set up in JBOD mode
ZFS Volume is about 50% full


Also, I find that accessing the server while the scrub is active is extremely slow, to the point that even browsing folders takes forever.

Thanks for any input.
 

peterh

Patron
Joined
Oct 19, 2011
Messages
315
It seems slow.
Are you using deduplication?
What's the disk activity as shown with systat -io? (How busy are the disks?)

Note that resilver speed will be in the same ballpark, so if you want faster resilver times you need to shorten your scrub times.
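
For example, something along these lines (the pool name is just a placeholder, and the dedup property only exists on ZFS versions that actually support it):

zfs get dedup <poolname>
systat -iostat 2

The number after systat is the refresh interval in seconds; Ctrl-C gets you back out.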
 

raidflex

Guru
Joined
Mar 14, 2012
Messages
531
It seems slow.
Are you using deduplication?
What's the disk activity as shown with systat -io? (How busy are the disks?)

Note that resilver speed will be in the same ballpark, so if you want faster resilver times you need to shorten your scrub times.

Disk activity:

tps - 90-95%
MB/s - 10%

System Activity:

5%

I noticed that it's only showing da0-da4, and I have 8 HDDs. Is this correct?


Deduplication is OFF.
 

raidflex

Guru
Joined
Mar 14, 2012
Messages
531
It's now showing all 8 HDDs after I reissued the command, and all of them have the same activity as stated above.
 

raidflex

Guru
Joined
Mar 14, 2012
Messages
531
Any ideas?
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
It seems slow.
Are you using deduplication?

Peter, just curious why you asked this since the version of ZFS FreeNAS is using doesn't have Dedupe?

Raidflex, sorry, not trying to hijack your thread. I don't have any ideas, my NAS takes 12-15 hours (5.3TB/Atom CPU/4GB RAM).
 

raidflex

Guru
Joined
Mar 14, 2012
Messages
531
Peter, just curious why you asked this since the version of ZFS FreeNAS is using doesn't have Dedupe?

Raidflex, sorry, not trying to hijack your thread. I don't have any ideas, my NAS takes 12-15 hours (5.3TB/Atom CPU/4GB RAM).

No problem. I want to set up a weekly scrub, as recommended for consumer drives. But when it takes 2 days to do one scrub, it doesn't make sense to tie up the server that long, and it's basically unusable while the scrub is running.
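
What I have in mind is a cron entry roughly like this (the schedule is just an example, and the pool name is from my status output; on FreeNAS the cleaner way is to schedule it through the web GUI rather than editing files by hand):

# run a scrub every Sunday at 02:00
0  2  *  *  0  root  /sbin/zpool scrub Data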
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Did you do a "zpool status -v" from the command line or from the GUI? That might give a clue.
 

raidflex

Guru
Joined
Mar 14, 2012
Messages
531

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Sorry, I wasn't clear about that.

What does the output look like?
Has it found any errors?
Does it say disks are offline or anything?
 

raidflex

Guru
Joined
Mar 14, 2012
Messages
531
Sorry, I wasn't clear about that.

What does the output look like?
Has it found any errors?
Does it say disks are offline or anything?

It says all disks are online and no errors detected.
 

raidflex

Guru
Joined
Mar 14, 2012
Messages
531
Here is an example:


 scrub: scrub in progress for 22h58m, 51.07% done, 22h0m to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Data                                            ONLINE       0     0     0
          raidz2                                        ONLINE       0     0     0
            gptid/e902c40d-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e91172b0-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e91f0e06-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e92cba1e-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e93a03e7-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e94867a0-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e956b27d-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e964d6cf-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0

errors: No known data errors
 

peterh

Patron
Joined
Oct 19, 2011
Messages
315
Regarding the systat figures: tps stands for transactions per second, and 90-95 seems low; likewise, 10 MB/s is a little low (unless the disks are unusually slow).
Is the system loaded with other things? Does ZFS get all the memory it needs?
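
A quick way to check (sysctl names as on FreeBSD 8.x; this is only a rough sanity check):

sysctl vfs.zfs.arc_max kstat.zfs.misc.arcstats.size
top -o res

The first shows the ARC limit versus its current size, and the second shows what else is holding memory.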
 
Joined
Mar 14, 2012
Messages
14
I have a similar setup, just with a mix of Seagate and WD drives, 6x 2TB.
I just did a scrub.
The estimated time left was way off almost the whole time.

[root@freenas] ~# zpool status -v
  pool: data
 state: ONLINE
 scrub: scrub completed after 27h49m with 0 errors on Thu Mar 15 03:16:46 2012
config:

        NAME                                            STATE     READ WRITE CKSUM
        data                                            ONLINE       0     0     0
          raidz2                                        ONLINE       0     0     0
            gptid/c2315af9-dafd-11e0-aec7-e06995e518cf  ONLINE       0     0     0
            ada0p2                                      ONLINE       0     0     0
            gptid/c3217031-dafd-11e0-aec7-e06995e518cf  ONLINE       0     0     0
            ada5p2                                      ONLINE       0     0     0  224K repaired
            gptid/c474e126-dafd-11e0-aec7-e06995e518cf  ONLINE       0     0     0
            gptid/c5382338-dafd-11e0-aec7-e06995e518cf  ONLINE       0     0     0

errors: No known data errors
[root@freenas] ~#
 
Joined
Mar 14, 2012
Messages
14
Not to hijack the thread, but why doesn't FreeNAS show the speed of the scrub when doing zpool status -v?
I have seen output like this while searching for ways to speed up ZFS scrubs.

Like:
  scan: scrub in progress since Sat May 7 08:07:12 2011
        182G scanned out of 1.42T at 365M/s, 0h59m to go
        0 repaired, 12.44% done

Would be nice to have as an indicator.
 

peterh

Patron
Joined
Oct 19, 2011
Messages
315
I just started a scrub on a small (4x 1TB raidz) HP N36 server where the disks are about 50% full.

First, the speed systat shows is "normal":
Disks   da0  ada0  ada1  ada2  ada3   md0   md1    6786660 wire
KB/t   0.00 84.04 88.53 84.74 87.93  0.00  0.00     125152 act
tps       0   654   606   649   611     0     0      68352 inact
MB/s   0.00 53.71 52.41 53.68 52.49  0.00  0.00         52 cache
%busy     0    43    49    44    48     0     0

(about 600 tps and 50 MB/s per disk)

[root@fnas] ~# zpool status
  pool: fnas
 state: ONLINE
 scrub: scrub in progress for 0h3m, 2.08% done, 2h41m to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        fnas                                            ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            gptid/5a4597ca-07ca-11e1-aeb0-d485646aa47c  ONLINE       0     0     0
            gptid/5b0bd205-07ca-11e1-aeb0-d485646aa47c  ONLINE       0     0     0
            gptid/5bb8aa85-07ca-11e1-aeb0-d485646aa47c  ONLINE       0     0     0
            gptid/5c70272b-07ca-11e1-aeb0-d485646aa47c  ONLINE       0     0     0

errors: No known data errors
 

raidflex

Guru
Joined
Mar 14, 2012
Messages
531
Scrub is STILL going. Time estimates seem to be right.


[root@fileserver] ~# zpool status -v
  pool: Data
 state: ONLINE
 scrub: scrub in progress for 33h39m, 74.89% done, 11h17m to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Data                                            ONLINE       0     0     0
          raidz2                                        ONLINE       0     0     0
            gptid/e902c40d-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e91172b0-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e91f0e06-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e92cba1e-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e93a03e7-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e94867a0-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e956b27d-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0
            gptid/e964d6cf-3326-11e1-acd0-e06995ebe0de  ONLINE       0     0     0

errors: No known data errors


[root@fileserver] ~# systat -io

                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average   ||

           /0%  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
cpu  user|XX
     nice|
   system|***X
interrupt|
     idle|******************************************X

           /0%  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
da0   MB/sXX
       tps|*********************************************X
da1   MB/sXX
       tps|*********************************************X
da2   MB/sXX
       tps|*********************************************X
da3   MB/sXX
       tps|*********************************************X
da4   MB/sXX
       tps|*********************************************X
da5   MB/sXX
       tps|*********************************************X
da6   MB/sXX
       tps|*********************************************
da7   MB/sXX
       tps|*********************************************X
 

peterh

Patron
Joined
Oct 19, 2011
Messages
315
What is systat saying if you go with numbers (or plain systat -vm)?

Anything less than 300-500 tps suggests hardware problems.

Also try 'zpool iostat 3' and show us the figures from a minute or so.
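
That is, something like (the pool name is taken from your status output; the exact column headings differ a little between ZFS versions):

zpool iostat Data 3
zpool iostat -v Data 3

The first prints pool-wide read/write operations and bandwidth every 3 seconds, the second breaks it down per vdev and per disk; Ctrl-C stops it.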
 
Joined
Mar 14, 2012
Messages
14
Just for comparison

[root@freenas] ~# systat -io



                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average   |

            /0%  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
cpu   user|
      nice|
    system|XX
 interrupt|
      idle|*********************************************XX

            /0%  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
da0    MB/s
        tps|
ada0   MB/s***
        tps|******************************************XX185.96
ada1   MB/s***
        tps|******************************************XX189.96
ada2   MB/s***
        tps|******************************************XX187.96
ada3   MB/s***
        tps|******************************************XX190.76
ada4   MB/s***
        tps|******************************************XX188.96
md0    MB/s
        tps|
md1    MB/s
        tps|
md2    MB/s
        tps|
ada5   MB/s***
        tps|******************************************XX180.16
pass0  MB/s
        tps|
pass1  MB/s


#systat -vm
 1 users    Load  0.23  0.23  0.25                  Mar 15 20:16

 9.2%Sys  0.6%Intr  0.1%User  0.0%Nice  90.2%Idle

 Mem: 6743220K wire  271860K act  309748K inact  52864K cache  639432K free  230096K buf
 Interrupts: 10073 total

 Disks    da0  ada0  ada1  ada2  ada3  ada4   md0
 KB/t    0.00   100 66.06 70.89   117   104  0.00
 tps        0   287   442   409   245   278     0
 MB/s    0.00 28.10 28.53 28.29 28.01 28.15  0.00
 %busy      0   100    86    88   100    98     0
 

peterh

Patron
Joined
Oct 19, 2011
Messages
315
This system is waiting for disk I/O (compare my N36, which makes > 600 tps with the disks still < 50% busy).
You might rearrange your disks into more vdevs, each with a smaller number of disks (most likely raidz).
Not a happy thing to do, but you don't have any choice if you want more speed.
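
Roughly like this from the command line (device names are only placeholders, and on FreeNAS you would normally rebuild the volume through the GUI; this destroys the pool, so everything has to be backed up and restored afterwards):

zpool destroy Data
zpool create Data raidz da0 da1 da2 da3 raidz da4 da5 da6 da7

Two 4-disk raidz vdevs give roughly twice the random I/O of a single 8-disk raidz2 vdev, at the cost of having one disk of redundancy per vdev instead of two for the whole pool.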
 