
Hard Drive Burn-In Testing - Discussion Thread

phier

Patron
Joined
Dec 4, 2012
Messages
398
@Redcoat
Yes, I executed it, but it doesn't show much detail or progress.

1654463659780.png
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,924
@Redcoat
Yes, I executed it, but it doesn't show much detail or progress.
So how does your question to @jgreco:

@jgreco
could you please advise how one can burn in an 18TB drive?

reflect the issue of "lack of details/progress" that seems to be your real question? For instance, how long had it been running when you took your screenshot?

You seem to be raising some issues that will be of interest to many, but your questions often lack the specificity to encourage people to respond to them IMHO.
 

phier

Patron
Joined
Dec 4, 2012
Messages
398
I don't know... I thought someone could advise on a good how-to for burning in 18+TB drives.

Well, the tool is still in this state:
Awaiting completion: initial parallel array read

It would be nice to see progress, or at least to know exactly what is happening, so I can check in ps axu whether the process is running or stuck. It has already been in that state for ~14 hours.


Sometimes it's hard to be specific... I am trying my best.


So I looked at the ps auwww output and I can see these 2 processes:
root 2073 1.0 0.0 15424 3684 3 DLC+ 11:56 18:58.24 dd if=/dev/ada1 of=/dev/null bs=1048576
root 2071 0.7 0.0 15424 3684 3 DLC+ 11:56 18:45.66 dd if=/dev/ada0 of=/dev/null bs=1048576


Maybe it would be a good idea to add progress reporting to each dd command in the script?
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,924
Thanks for understanding the issue.

Note that Joe Greco let us know last evening that he is to undergo heart surgery today. So we might expect some delay in getting comments from him here in the forums...
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,946
I wondered why he had been a bit quiet. I wish him well.
 

phier

Patron
Joined
Dec 4, 2012
Messages
398
Plus, maybe there is one more small bug:

Performing initial parallel seek-stress array read
Mon Jun 6 10:46:41 PDT 2022
The disk ada0 appears to be 17166336 MB.
Disk is reading at about 226 MB/sec
This suggests that this pass may take around 1269 minutes


                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
ada0    17166336MB    261    230     88
ada1    17166336MB    271    220     81

Awaiting completion: initial parallel seek-stress array read


It reports details (the bold part) for only one drive, even though the test is running on 2 drives.

Maybe there are more bugs, I don't know, as currently the same operation seems to be executed on ada0 only:
root 11 198.2 0.0 0 32 - RNL Sun11 4531:05.68 [idle]
root 17287 0.9 0.0 15424 3684 3 DLC+ 10:48 13:23.34 dd if=/dev/ada0 of=/dev/null bs=1048576
root 17269 0.8 0.0 15424 3684 3 DLC+ 10:47 13:21.20 dd if=/dev/ada0 of=/dev/null bs=1048576


root 1737 0.0 0.0 15132 5148 3 Is Sun11 0:00.05 -/usr/local/bin/zsh
root 1949 0.0 0.0 13680 3436 3 I+ Sun11 0:00.02 sh solnet-array-test-v2.sh
root 17253 0.0 0.0 13680 3428 3 I+ 10:46 0:00.00 sh solnet-array-test-v2.sh
root 17254 0.0 0.0 13680 3428 3 I+ 10:46 0:00.00 sh solnet-array-test-v2.sh
root 17256 0.0 0.0 15424 3684 3 DLC+ 10:46 1:24.39 dd if=/dev/ada0 of=/dev/null bs=1048576
root 17257 0.0 0.0 15424 3684 3 DLC+ 10:46 3:09.14 dd if=/dev/ada1 of=/dev/null bs=1048576
root 17266 0.0 0.0 13680 3428 3 I+ 10:47 0:00.00 sh solnet-array-test-v2.sh
root 17267 0.0 0.0 13680 3428 3 I+ 10:47 0:00.00 sh solnet-array-test-v2.sh
root 17270 0.0 0.0 15424 3684 3 DLC+ 10:47 3:09.20 dd if=/dev/ada1 of=/dev/null bs=1048576
root 17284 0.0 0.0 13680 3428 3 I+ 10:48 0:00.00 sh solnet-array-test-v2.sh
root 17285 0.0 0.0 13680 3428 3 I+ 10:48 0:00.00 sh solnet-array-test-v2.sh
root 17288 0.0 0.0 15424 3684 3 DLC+ 10:48 2:48.97 dd if=/dev/ada1 of=/dev/null bs=1048576
root 17297 0.0 0.0 13680 3428 3 I+ 10:49 0:00.00 sh solnet-array-test-v2.sh
root 17298 0.0 0.0 13680 3428 3 I+ 10:49 0:00.00 sh solnet-array-test-v2.sh
root 17300 0.0 0.0 15424 3684 3 DLC+ 10:49 1:24.01 dd if=/dev/ada0 of=/dev/null bs=1048576
root 17301 0.0 0.0 15424 3684 3 DLC+ 10:49 2:47.37 dd if=/dev/ada1 of=/dev/null bs=1048576
root 17314 0.0 0.0 13680 3432 3 I+ 10:50 0:00.00 sh solnet-array-test-v2.sh
root 17315 0.0 0.0 13680 3432 3 I+ 10:50 0:00.00 sh solnet-array-test-v2.sh
root 17317 0.0 0.0 15424 3684 3 DLC+ 10:50 1:22.96 dd if=/dev/ada0 of=/dev/null bs=1048576
root 17318 0.0 0.0 15424 3684 3 DLC+ 10:50 2:41.96 dd if=/dev/ada1 of=/dev/null bs=1048576
root 17328 0.0 0.0 13680 3432 3 I+ 10:51 0:00.00 sh solnet-array-test-v2.sh
root 17329 0.0 0.0 13680 3432 3 I+ 10:51 0:00.00 sh solnet-array-test-v2.sh
root 17330 0.0 0.0 15424 3684 3 DLC+ 10:51 1:19.41 dd if=/dev/ada0 of=/dev/null bs=1048576
root 17332 0.0 0.0 15424 3684 3 DLC+ 10:51 2:37.11 dd if=/dev/ada1 of=/dev/null bs=1048576
root 1740 0.0 0.0 15132 5056 4 Is+ Sun11 0:00.02 -/usr/local/bin/zsh
root 1743 0.0 0.0 14888 4804 5 Rs Sun11 0:00.03 -/usr/local/bin/zsh
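One way to check whether both drives are really being read is to count the dd readers per device instead of looking only at the first lines of the listing. Below is a minimal sketch that does this for pasted ps output; the function name and the regular expression are illustrative assumptions, not part of solnet-array-test. Applied to the full listing above, it should find six dd readers on ada0 and six on ada1, which suggests both drives are being read and that the two ada0 lines at the top are probably just the entries ps sorted first by CPU usage.

```python
import re
from collections import Counter

# Matches the command column of a ps line such as
# "... dd if=/dev/ada0 of=/dev/null bs=1048576"
DD_RE = re.compile(r"\bdd if=/dev/(\S+)")


def count_dd_per_device(ps_output: str) -> Counter:
    """Count how many dd readers are attached to each device."""
    counts = Counter()
    for line in ps_output.splitlines():
        match = DD_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Four lines copied from the listing above (a subset, for brevity)
sample = """\
root 17287 0.9 0.0 15424 3684 3 DLC+ 10:48 13:23.34 dd if=/dev/ada0 of=/dev/null bs=1048576
root 17269 0.8 0.0 15424 3684 3 DLC+ 10:47 13:21.20 dd if=/dev/ada0 of=/dev/null bs=1048576
root 17253 0.0 0.0 13680 3428 3 I+ 10:46 0:00.00 sh solnet-array-test-v2.sh
root 17257 0.0 0.0 15424 3684 3 DLC+ 10:46 3:09.14 dd if=/dev/ada1 of=/dev/null bs=1048576
"""

for device, readers in sorted(count_dd_per_device(sample).items()):
    print(device, readers)
```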
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,450
All the best to Joe Greco.

When I performed my 18TB burn-in test, I monitored the progress using the Netdata plugin.
Once Netdata is installed and you can access it, you will be able to see read and write throughput graphs for each disk. It will not tell you how long the test will take to complete, but if you can see the past history (you will have to let the plugin collect the data), you will be able to extrapolate.

I commented some weeks ago when I ran the test using badblocks, and my results were somewhere close to 24hr per read or write pass.
With a 4-pass badblocks run, it would take around 8 days to complete.
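For the arithmetic behind that 8-day figure, here is a minimal sketch. It assumes the 4-pass run means badblocks in destructive write mode, where each of four patterns is written across the whole disk and then read back, and that each write or read takes the roughly 24 hours mentioned above. The function name is made up for illustration.

```python
def badblocks_days(hours_per_pass: float, patterns: int = 4) -> float:
    """Estimate days for a destructive run: each pattern is written, then read back."""
    passes = patterns * 2  # one write pass plus one read-back pass per pattern
    return passes * hours_per_pass / 24


print(badblocks_days(24))  # 8.0
```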
 

phier

Patron
Joined
Dec 4, 2012
Messages
398
Not sure how you tested 18TB drives with badblocks, as there was a reported issue that badblocks can't handle 18TB.

Netdata... well, doesn't TrueNAS have its own graphs/stats?

thanks
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,450
I used Spearfoot's script and, if you're interested, I converted my disks to 4Kn.

https://www.truenas.com/community/t...n-testing-discussion-thread.21451/post-689875
https://www.truenas.com/community/t...n-testing-discussion-thread.21451/post-695447

About Netdata and TrueNAS reporting, there are differences but Netdata is more suitable in my opinion.
 

phier

Patron
Joined
Dec 4, 2012
Messages
398
@Apollo
Not sure about performance, stability, or consequences when 4Kn is set on drives. I saw these posts, but no one sums up the pros/cons/dangers.
Also see the comment:
badblocks is not designed for testing drives.


Regarding Netdata, I know it, but I'm not sure how safe it is to install on TrueNAS. I assume TrueNAS doesn't have a UI plugin/click install for it, so it has to be done at my own risk via the terminal.

thanks
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,450
For 4Kn/512e drives, I don't think there are penalties or issues.
I tried my Exos drives in mixed 4Kn/4Kn and 4Kn/512e combinations in the same vdev and didn't notice anything peculiar. Performance and compatibility didn't seem to suffer.
The HDD controller on the disk handles 4Kn/512e translation at a much faster rate than the physical medium can support.
If you were to use 512n drives only (not 4K capable), then you would get performance issues. Assuming the disk size is the same, there would be a difference in occupied space on the medium, so compatibility and performance differences could show up there.

You can install Netdata from the TrueNAS plugin page as long as you select the "Community" section in the drop-down list.

1654795709409.png
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,974
he is to undergo heart surgery today
@jgreco Take care of yourself my friend. 3 years ago I had a quadruple bypass. No fun, but I'm alive and my life was back to normal after a few months; I was back at work in 1 month (I pushed myself to get better as fast as I safely could, you can only watch so much TV). So I hope your hospital stay is short and sweet, you have no complications, and you recover fast.
 

phier

Patron
Joined
Dec 4, 2012
Messages
398
Hello,
I'm running the solnet-array-test script on 2 drives and already have these processes running.
Is it okay to send a SIGINFO signal (via kill) to the dd commands to get progress/status?


root 23758 0.5 0.0 15424 3684 3 DLC+ Mon03 21:50.33 dd if=/dev/ada0 of=/dev/null bs=1048576
root 23745 0.4 0.0 15424 3684 3 DLC+ Mon03 21:50.75 dd if=/dev/ada0 of=/dev/null bs=1048576
root 23709 0.0 0.0 15424 3684 3 DLC+ Mon03 24:46.01 dd if=/dev/ada0 of=/dev/null bs=1048576
root 23710 0.0 0.0 15424 3684 3 DLC+ Mon03 11:03.92 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23729 0.0 0.0 15424 3684 3 DLC+ Mon03 24:55.92 dd if=/dev/ada0 of=/dev/null bs=1048576
root 23730 0.0 0.0 15424 3684 3 DLC+ Mon03 10:53.30 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23746 0.0 0.0 15424 3684 3 DLC+ Mon03 10:48.69 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23759 0.0 0.0 15424 3684 3 DLC+ Mon03 10:30.54 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23790 0.0 0.0 15424 3684 3 DLC+ Mon03 6:31.96 dd if=/dev/ada0 of=/dev/null bs=1048576
root 23791 0.0 0.0 15424 3684 3 DLC+ Mon03 13:12.84 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23803 0.0 0.0 15424 3684 3 DLC+ Mon03 4:47.67 dd if=/dev/ada0 of=/dev/null bs=1048576
root 23805 0.0 0.0 15424 3684 3 DLC+ Mon03 13:11.61 dd if=/dev/ada1 of=/dev/null bs=1048576


Also, the estimate said such a test would take 23 hours... it has already been 3 days and it's still not complete :(

Performing initial parallel seek-stress array read
Mon Jun 13 03:29:39 PDT 2022
The disk ada0 appears to be 17166336 MB.
Disk is reading at about 224 MB/sec
This suggests that this pass may take around 1276 minutes

                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
ada0    17166336MB    261    233     89
ada1    17166336MB    271    221     82

Awaiting completion: initial parallel seek-stress array read
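On FreeBSD, dd(1) documents that a SIGINFO signal (for example kill -INFO followed by the PID) makes dd write its current record and byte counts to standard error without stopping the copy. Whether that output is visible depends on where the script sends dd's standard error, which is not shown in this thread. If a status line can be captured, a rough sketch like the one below turns it into a completion fraction and a remaining-time estimate. The status line here is hypothetical, the regular expression assumes the "N bytes transferred in S secs" wording, and the device size is the 17166336 MB from the output above multiplied by the 1048576-byte block size.

```python
import re

# Assumed wording of the dd status line: "<bytes> bytes transferred in <secs> secs (...)"
STATUS_RE = re.compile(r"(\d+) bytes transferred in ([\d.]+) secs")


def dd_progress(status_line: str, device_bytes: int):
    """Return (fraction_done, seconds_remaining) from one dd status line."""
    match = STATUS_RE.search(status_line)
    if match is None:
        raise ValueError("not a dd status line")
    done = int(match.group(1))
    elapsed = float(match.group(2))
    rate = done / elapsed  # average bytes per second so far
    return done / device_bytes, (device_bytes - done) / rate


device_bytes = 17166336 * 1048576  # size reported by the script, in bytes

# Hypothetical line, not taken from a real run
line = "9000103968768 bytes transferred in 40000.000000 secs (225002599 bytes/sec)"

fraction, remaining = dd_progress(line, device_bytes)
print(f"{fraction:.0%} done, about {remaining / 3600:.1f} hours left")
```

The estimate uses the average rate so far, so it will be optimistic if the disk slows down toward the end of the read.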
 

phier

Patron
Joined
Dec 4, 2012
Messages
398
Also, one more observation: ada0 is a
Western Digital Ultrastar DC HC550
and ada1 is a
Seagate Exos
Device Model: ST18000NM000J-2TV103

Just wondering how it is possible that the Ultrastar is almost 50% slower in read speed.


1655496620527.png
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
170
just wondering how is it possible that Ultrastar is almost 1/2 slower in the read speed.

Is there some kind of difference in connection, because it does not look like the top chart is disk-limited, more like it is bus-limited?
 

phier

Patron
Joined
Dec 4, 2012
Messages
398
It's strange... right now dd on ada0 is done, but the speed on ada1 is now halved.
1655592449605.png


Right now 6 processes are running in parallel, and it's an I/O-intensive workload?

root 23710 0.0 0.0 15424 3684 3 DLC+ Mon03 24:25.50 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23730 0.0 0.0 15424 3684 3 DLC+ Mon03 23:53.23 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23746 0.0 0.0 15424 3684 3 DLC+ Mon03 25:30.00 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23759 0.0 0.0 15424 3684 3 DLC+ Mon03 25:29.25 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23791 0.0 0.0 15424 3684 3 DLC+ Mon03 29:34.13 dd if=/dev/ada1 of=/dev/null bs=1048576
root 23805 0.0 0.0 15424 3684 3 DLC+ Mon03 24:22.17 dd if=/dev/ada1 of=/dev/null bs=1048576

The test currently running is:

Awaiting completion: initial parallel seek-stress array read
 

phier

Patron
Joined
Dec 4, 2012
Messages
398
So it finished after 7 days?

Awaiting completion: initial parallel seek-stress array read
Sun Jun 19 22:25:48 PDT 2022
Completed: initial parallel seek-stress array read

Disk's average time is 394791 seconds per disk

Disk    Bytes Transferred Seconds %ofAvg
------- ----------------- ------- ------
ada0       18000207937536  242230     61 ++FAST++
ada1       18000207937536  547352    139 --SLOW
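The figures in that table can be cross-checked with a few lines of arithmetic. This is a sketch using only the numbers shown above; the throughput is simply the table's byte count divided by its seconds, so it may understate what the drive actually delivered if several dd readers were sharing the disk. The helper name is made up for illustration.

```python
# Seconds per disk and bytes per disk, copied from the results table above
seconds = {"ada0": 242230, "ada1": 547352}
bytes_per_disk = 18000207937536

average = sum(seconds.values()) / len(seconds)


def summarize(name: str):
    """Return (% of average time, implied MB/s, elapsed days) for one disk."""
    elapsed = seconds[name]
    return round(100 * elapsed / average), bytes_per_disk / elapsed / 1e6, elapsed / 86400


print(f"average: {average:.0f} seconds")
for name in seconds:
    pct, mb_per_s, days = summarize(name)
    print(f"{name}: {pct}% of average, {mb_per_s:.1f} MB/s, {days:.1f} days")
```

The average and the two percentages match the script's own output, and the slower disk's 547352 seconds works out to a little over six days.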
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,450
@phier,
You are not looking at the full picture with a one hour span graph.
The ST18000NM000J-2TV103 ranges from around 260MB/s to about 115MB/s throughput during read/write test procedure.

You need to look at the graph over at least a 24hr span to really know what is happening. (This is where Netdata has the edge, as it can record the last 7 days by default.)
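To see what that throughput range means for a full sequential read, here is a rough sketch. It assumes the speed falls linearly from about 260 MB/s at the start of the disk to about 115 MB/s at the end, which is a simplification and not a measured profile; the function name is made up for illustration. Under that assumption the effective average speed is the logarithmic mean of the two figures rather than their simple average.

```python
import math


def full_read_hours(size_bytes: int, outer_mb_s: float, inner_mb_s: float) -> float:
    """Hours to read the whole disk if speed falls linearly with position.

    Requires outer_mb_s > inner_mb_s > 0.
    """
    # Integrating dx / v(x) over the disk gives size * ln(v0 / v1) / (v0 - v1),
    # so the effective speed is (v0 - v1) / ln(v0 / v1).
    effective = (outer_mb_s - inner_mb_s) / math.log(outer_mb_s / inner_mb_s)
    return size_bytes / (effective * 1e6) / 3600


# 18000207937536 bytes is the per-disk size reported earlier in the thread
print(round(full_read_hours(18000207937536, 260, 115), 1))
```

With these inputs the result is roughly 28 hours for one uninterrupted pass, the same order of magnitude as the per-pass times reported earlier in the thread.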
 

NinthWave

Contributor
Joined
Jan 9, 2021
Messages
129
Well, it was my first disk replacement and I forgot how I tested all 5 drives before creating the pool a few years ago.

So I put the "da4" drive OFFLINE (out of 5 drives)
Put the new drive in the same bay (I forgot I had to test it)
Ran short test
Ran long test
Now running badblocks test

Now I realize I have a degraded zpool for as long as the 5th drive is not ready.

What should I do?
Should I just stop the tests, run the "replace" command, resilver and hope for the best?
 