1st storage server - FreeNAS!


Sir SSV

Explorer
Joined
Jan 24, 2015
Messages
96
Are you sure the smart short tests took an hour :/

Yes, here's a screengrab whilst the last long test is being completed

[screengrab: SMART self-test status and estimated durations while the last long test completes]
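
For anyone curious, those estimated durations come from the drive itself; something like this prints them (the device name is just an example):

Code:
# Prints the drive's "General SMART Values", which include the recommended
# polling times for the short and extended (long) self-tests.
smartctl -c /dev/da0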
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
That says "2 minutes"
 

Sir SSV

Explorer
Joined
Jan 24, 2015
Messages
96
That says "2 minutes"

So it does! I was looking further down where it lists both short and long tests :confused:

Thanks for picking this up! I'll amend my prior post about short/long/badblock times
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
To perform raw disk I/O, enable the kernel geometry debug flags
That is not quite correct. The debug flags disable a safety that prevents raw I/O to devices that are in use by the GEOM system.

Please do not use that routinely, or recommend it without a warning that it defeats a safety that protects the system.
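
For reference, the sysctl in question is kern.geom.debugflags. If someone genuinely needs it for a one-off, it should look something like this, with the default restored immediately afterwards:

Code:
# DANGEROUS: 0x10 turns off the GEOM safety and allows raw writes to disks
# that are in use. Only set it for the one-off operation, then put it back.
sysctl kern.geom.debugflags=0x10
# ... perform the one-off raw I/O ...
sysctl kern.geom.debugflags=0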
 

Sir SSV

Explorer
Joined
Jan 24, 2015
Messages
96
That is not quite correct. The debug flags disable a safety that prevents raw I/O to devices that are in use by the GEOM system.

Please do not use that routinely, or recommend it without a warning that it defeats a safety that protects the system.

Thanks for this. I have edited my previous post :)
 

Sir SSV

Explorer
Joined
Jan 24, 2015
Messages
96
Badblocks has finally completed, as well as the SMART long test. The first 12 drives finished testing with zero errors, so I'm very happy :)

I have just loaded the next 12 drives into their bays and completed a SMART short test. The long test is currently underway and should be completed by 6 pm tomorrow; then it's on to badblock testing and a final SMART long test.
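
For anyone following along, the per-drive burn-in sequence boils down to something like the following (the device name is just an example; one instance per drive can be run in parallel, e.g. each in its own tmux or screen session):

Code:
smartctl -t short /dev/da0      # quick electrical/mechanical self-test (~2 minutes)
smartctl -t long /dev/da0       # full surface read scan (the better part of a day on an 8 TB drive)
badblocks -ws -b 4096 /dev/da0  # DESTRUCTIVE 4-pass write/read test; -b 4096 is needed on large drives
smartctl -t long /dev/da0       # final long test now that every sector has been written
smartctl -A /dev/da0            # check for reallocated/pending sectors afterwards

The self-tests run in the background on the drive itself, so wait for each one to show as completed (smartctl -l selftest /dev/da0) before starting the next step.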

For reference, my temperatures are as follows. I have noticed a slight increase in CPU and RAM temps, but nothing to be alarmed about; it's simply down to filling the last 12 bays with hard drives.

CPU - 47 °C
RAM - 39-40 °C
HDD - 32 to 38 °C

I've noticed on another forum that someone is saying all 8 TB drives are bad. They have not provided any hard data to back up their claims, but it made for an interesting read. What is the general consensus here regarding 8 TB (and even 10 TB) drives? Not that it would make much difference to me, as I've just gone and purchased 24 of them :D
 
Sir SSV

Explorer
Joined
Jan 24, 2015
Messages
96
The FreeNAS system really does surprise me, even though I am still learning it and all of its functions. Late last night I received an email that one of my hard drives (da18) had 8 uncorrectable sectors. This happened whilst SMART long tests were being carried out on the remaining 12 drives.

[screengrab: FreeNAS alert email reporting 8 uncorrectable sectors on da18]


All I can say is thank you for this forum and its great articles! The burn-in guide I linked to on a previous page certainly came in handy and, I'd say, prevented me from placing data on a drive that was not in the best of health.

Moving forward, I have already ordered a replacement drive this morning and am hoping to see it either tomorrow or Wednesday. For now, I have shut down the server and will wait for the replacement drive; then I'll SMART short & long test and badblock test all 12 drives simultaneously.
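
For anyone who receives a similar alert, the raw counters behind it can be read straight off the drive (da18 is just the device name from my email):

Code:
# Attribute 5 = Reallocated_Sector_Ct, 197 = Current_Pending_Sector, 198 = Offline_Uncorrectable
smartctl -A /dev/da18 | grep -E "Reallocated_Sector|Current_Pending_Sector|Offline_Uncorrectable"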
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Not a good sign for that HD that it's already lost 8 sectors. But this is precisely the reason that you run the badblock tests followed by the long tests. Now that data has been written to every sector, the long test will check whether all the sectors can actually be read.

In this case, 8 sectors were unable to return their data. The drive is waiting (pending) for you to decide to re-write the data.

RMA :)
 

Sir SSV

Explorer
Joined
Jan 24, 2015
Messages
96
Not a good sign for that HD that it's already lost 8 sectors. But this is precisely the reason that you run the badblock tests followed by the long tests. Now that data has been written to every sector, the long test will check whether all the sectors can actually be read.

In this case, 8 sectors were unable to return their data. The drive is waiting (pending) for you to decide to re-write the data.

RMA :)

That is exactly right. What I found unusual is that I had only carried out the SMART short and long tests; I still hadn't started badblock testing :eek: The errors occurred during the long test.
I guess the drive was bad from the beginning
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
It sounds reasonable. He did not say that the 8TB drives were "bad", just that he had seen some with early failures. That would not surprise me with the first generation of higher-density drives. It often goes that way. Vendor warranties should help tell the story. If the same warranty is offered, then the vendors have no reason to think those drives will have shorter lives. If they turn out to be wrong, the customer gets replacement drives that probably have engineering improvements.
 

Sir SSV

Explorer
Joined
Jan 24, 2015
Messages
96
It sounds reasonable. He did not say that the 8TB drives were "bad", just that he had seen some with early failures. That would not surprise me with the first generation of higher-density drives. It often goes that way. Vendor warranties should help tell the story. If the same warranty is offered, then the vendors have no reason to think those drives will have shorter lives. If they turn out to be wrong, the customer gets replacement drives that probably have engineering improvements.

It would be interesting to see what the warranty reports said caused the failures... I guess we'll never know.

Do you happen to know what generation the Red drives are on now?
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
A big point of his is that the 8TB drives are not being tested to detect failures... use scrubs and long tests and double redundancy and a backup (and an offsite backup!), and you should be good ;)
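
Scrubs are normally scheduled from the FreeNAS GUI, but you can also kick one off and keep an eye on it by hand (pool name is just an example):

Code:
zpool scrub tank        # read every block in the pool and verify it against its checksum
zpool status -v tank    # shows scrub progress and any checksum/read errors found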
 

Sir SSV

Explorer
Joined
Jan 24, 2015
Messages
96
The new WD Red 8 TB hard drive showed up today, so into the server it went (replacing the old drive with the failures), and it has already completed a SMART short test.
Long test has commenced and should be finished by tomorrow afternoon.
All going well, badblock testing will commence tomorrow after the smart long test and hopefully there are no errors :)
 

Sir SSV

Explorer
Joined
Jan 24, 2015
Messages
96
Smart long test has been completed on the replacement drive and no errors reported.
Have commenced badblock testing on the remaining 12 drives. Should be completed by Tuesday next week :eek:
I will update this thread with results once the testing has been completed.

In the meantime, I am still undecided on which RAID layout to implement. I have been looking at the following:

* 4 vdevs of 6 drives - RAID-Z2
* 3 vdevs of 8 drives - RAID-Z2
* 2 vdevs of 12 drives - RAID-Z3

What are people's recommendations? The server will only have music and Blu-ray backups stored on it, serving 2 desktop computers, 2 laptops and 2 HTPCs.
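
For what it's worth, the raw space/parity trade-off between those three layouts (8 TB drives, before ZFS overhead) works out roughly like this:

Code:
Layout                 Parity drives   Data drives   Raw data capacity    Failures tolerated
4 x 6-wide RAID-Z2     8               16            16 x 8 TB = 128 TB   2 per vdev
3 x 8-wide RAID-Z2     6               18            18 x 8 TB = 144 TB   2 per vdev
2 x 12-wide RAID-Z3    6               18            18 x 8 TB = 144 TB   3 per vdev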
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
In the meantime, I am still undecided on which RAID layout to implement. I have been looking at the following:

* 4 vdevs of 6 drives - RAID-Z2
* 3 vdevs of 8 drives - RAID-Z2
* 2 vdevs of 12 drives - RAID-Z3

What are people's recommendations? The server will only have music and Blu-ray backups stored on it, serving 2 desktop computers, 2 laptops and 2 HTPCs.
Note that the following two layouts use the same number of parity disks, 6. But the RAID-Z2 option will get slightly better read and write performance, as it's 3 vDevs, each with one less parity disk.
  • 3 vDevs of 8 drives - RAID-Z2
  • 2 vDevs of 12 drives - RAID-Z3
However, since writes would be less important for a mostly-read media server, you might keep that in mind. I also like free slots and/or warm spares. So, something like this:
  • 1 vDev of 11 drives - RAID-Z3
  • 1 vDev of 12 drives - RAID-Z3
  • Warm spare, or free slot and cold spare
You may even be able to get away with RAID-Z2, though 11/12 disks is pushing the limit for RAID-Z2. A warm spare ready to go helps reduce the risk.

Please note that (slightly) mis-matched vDevs are both allowed and won't impact performance much. Especially for the home or non-business use case.

Note: what I define as a warm spare is a disk installed in the server, ready to replace a failed disk, but requiring operator intervention. Hot spares don't require operator intervention. And while hot spares are supported by ZFS, that feature is generally not used, partly because you can't tell ZFS which vDev should get the hot spare when you have failed disks in multiple vDevs.
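
Swapping a warm spare in for a failed disk is then just a manual replace. FreeNAS normally drives this from the web UI, but at the command line it amounts to something like this (pool and device names made up):

Code:
zpool status -x                # identify the failed/faulted disk
zpool replace tank da5 da23    # resilver onto the warm spare already sitting in the chassis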
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
That is exactly right. What I found unusual is that I had only carried out the SMART short and long tests; I still hadn't started badblock testing :eek: The errors occurred during the long test.
I guess the drive was bad from the beginning
Yes, the long test can find defects even before data is ever written, but some defects can't be found until data is first written and an attempt is made to read it back.

The thing I use to test my drives is a tool called DBAN (Darik's Boot and Nuke). It is actually a utility for erasing drives, but one of the settings lets it verify each write pass. I configure it to do what is called a DoD short erase with a verify on each pass. This writes random data to every sector on the drive and attempts to read it back, as a test that the drive is working properly. If there are enough errors, DBAN will report that the drive failed, but a drive can pass DBAN's standards and still have problems, so after that I check the drive's status with a SMART long test and look at the results. I don't put my data on a drive until it has passed this test with no bad or reallocated sectors.

I have been burned by bad drives too many times to take a chance on one that is questionable. Just last month I replaced seven drives in one of my systems because they were getting old and I didn't trust them. I had already had to replace 5 drives in that system in the past year, so I replaced the rest so they are all within a year of the same age.

The first six months to a year of running time is the 'prime time' for drive failure, though. I have a server that I manage at work that has 60 of the 6 TB drives from Western Digital in it, and I have had to replace 3 of them already: two in the first month it was running and another after it had only been online for six months.

The point of all that: never trust spinning rust.
 