Extensive Scrub Duration

Status
Not open for further replies.

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
NTFS isn't recommended for important data. There's a ticket at bugs.freenas.org to make importing read-only in the future because of how many NTFS partitions it has eaten. It's not a FreeNAS problem, it's a FreeBSD problem. :/
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Yes, but I've given up on trying to recommend ECC anymore. If people don't want to read the well-written documentation we already provide I'm not going to waste my time further by forcing them to drink. :/
 

joelmusicman

Patron
Joined
Feb 20, 2014
Messages
249
I think there's a ticket in for this already but it'd be great if FreeNAS would check for ECC functionality and warn if it isn't active. That way it isn't just a bunch of guys telling people on forums about it. :)

Another thought: Take it one step further and add memtest functionality in the GUI that can be scheduled just like SMART tests & scrubs!
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
You can't check for ECC functionality. It's been discussed many, many times. There are ways you can test *some* Intel-based systems with *some* chipsets, but each chipset has its own way of testing for it, and it's not even shared across motherboard manufacturers.
 

amitkhas

Dabbler
Joined
Oct 28, 2011
Messages
49
I did not know about ECC support. When I purchased the hardware 2 years back, I was not aware of the recommendation for ECC.

I don't think I ever looked into it either. I didn't even know the i5-2500 did not support ECC. I thought that as long as you had ECC-capable memory, that was sufficient. I suppose I figured that other hardware is typically ECC-compliant. Apparently not :(
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It's an artificial limitation, so don't feel stupid. Intel limits ECC to server chipsets (even though the chipset has no interaction with RAM these days) with Xeon (or some i3 and Celeron models) CPUs. AMD is more liberal, but most motherboards don't support ECC.
 

amitkhas

Dabbler
Joined
Oct 28, 2011
Messages
49
So I've managed to back up most of the critical data to another drive. However, I'm unable to grab everything since some files won't transfer fully. The copy starts, but after some time it fails. Most likely, it's because of the Samba/Windows issue that cyberjock mentioned.

As cyberjock previously mentioned, I should connect another drive directly to the system and copy the files that way. Since I don't have any more SATA ports, I am forced to use a USB drive. First, how do I get a USB drive to be detected by FreeNAS? Does the drive have to be formatted a certain way? Once FreeNAS sees the drive, how do I copy the files over to that drive? There is probably a command I need to issue via Shell.

Also, this is a really noob question: With the hard drives failing (i.e. having lots of bad sectors), could the ZFS scrub potentially correct this? It would take 200 hours to run, probably because of the number of bad sectors, but I wouldn't mind leaving it on if it means it may repair the disks.

Also, another very noob question: Based on the printouts of the SMART values, you mentioned 4 disks are "failing." What do you mean by failing? Is there a high number of bad sectors? I suppose I am still surprised that 4 of 6 drives are failing when they are only 2 years old. They were not running 24/7, only occasionally, perhaps 30 hrs a week.
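
For reference, "bad sectors" in SMART output usually refers to a handful of attributes: reallocated, pending, and offline-uncorrectable sectors. The sketch below pulls just those rows out of `smartctl -A` output. It assumes the usual ATA attribute table layout (attribute name in column 2, raw value in column 10); the helper name `bad_sector_attrs` is made up for illustration, and not every drive reports all three attributes.

```shell
# Hypothetical helper: reads `smartctl -A <device>` output on stdin and
# prints the name and raw value of the attributes most often associated
# with bad sectors.
# Assumption: standard ATA attribute table, name in column 2, raw value
# in column 10.
bad_sector_attrs() {
  awk '$2 == "Reallocated_Sector_Ct" ||
       $2 == "Current_Pending_Sector" ||
       $2 == "Offline_Uncorrectable" { print $2, $10 }'
}

# Example (not run here):
#   smartctl -A /dev/ada0 | bad_sector_attrs
```

Non-zero and growing raw values in these rows are the usual reason a disk gets described as failing, though how to interpret them can vary by vendor.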
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
I have a suggestion for a completely different approach. The idea is to try to rescue the contents of each failing drive independently, using GNU ddrescue.
  1. Buy four replacements for your failing drives. Each replacement must be at least as large as the drive it will replace.
  2. Download and burn a bootable CD of Ubuntu Rescue Remix, or install GNU ddrescue on an existing Ubuntu system.
  3. Use ddrescue one disk at a time to clone each failing drive to a new drive.
  4. Boot your system with the two old drives and the four clones and run a ZFS scrub.
  5. Back up your data.
  6. Create a new RAIDZ2 pool.
  7. Restore your data.
I'm suggesting this because I've had great results rescuing data from failing drives using ddrescue.
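
Step 3 could look roughly like the sketch below. It only prints the commands so they can be reviewed first; the helper name `ddrescue_plan`, the device names, and the mapfile name are placeholders, not values from this system. It assumes the common two-pass pattern from the GNU ddrescue manual: a fast first pass that skips scraping (`-n`), then a second pass with direct disc access (`-d`) and a few retries (`-r3`). `-f` is needed to overwrite an output device.

```shell
# Hypothetical helper: print the two ddrescue passes for cloning one
# failing disk (SRC) onto its replacement (DST), using MAP as the mapfile.
# Nothing is executed; copy the lines and run them by hand after
# double-checking which device is which.
ddrescue_plan() {
  src=$1
  dst=$2
  map=$3
  # Pass 1: grab everything that reads easily, skip the scraping phase.
  echo "ddrescue -f -n $src $dst $map"
  # Pass 2: go back over the bad areas, direct access, up to 3 retries.
  echo "ddrescue -f -d -r3 $src $dst $map"
}

# Example with placeholder device names:
#   ddrescue_plan /dev/sdb /dev/sdc sdb.map
```

Keeping the same mapfile for both passes is what lets the second pass resume where the first left off instead of starting over.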
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
That could really screw ZFS up because multiple disks will appear to be the exact same disk to ZFS.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
How so?

Did you think I was proposing making multiple clones of the good disks? That's not what I suggested (in step 3).
 

amitkhas

Dabbler
Joined
Oct 28, 2011
Messages
49
Hi All,

I figured I would keep everyone updated with the status.

Using FTP, I managed to back up the critical data. Of the 2.6 TB, 600 GB was critical, and the other 2 TB was important but not critical. It took a VERY long time (~100 hrs to back up 600 GB). The transfer rates were terribly slow, but it did transfer. Backing up the other 2 TB (less important) would take an exorbitant amount of time, and that data isn't worth the time and effort required to rescue it.

I purchased 4x 2 TB WD Red disks. Since I backed up the critical stuff, I figured I would try to rebuild the RAIDZ1. I replaced the disk with the highest number of unreadable sectors. The resilver rate is 4.5 MB/sec. That means it will take approximately 150 hours (yikes!).

Is this a typical resilver rate for a 6 disk RAID-Z1?
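
As a rough sanity check on those numbers, here is the arithmetic as a small sketch. It assumes decimal units (1 GB = 1000 MB, 1 TB = 1,000,000 MB) and a constant rate; the function names are made up for illustration.

```shell
# Average throughput in MB/s for moving SIZE_GB gigabytes in HOURS hours.
rate_mb_s() {
  awk -v gb="$1" -v h="$2" 'BEGIN { printf "%.2f\n", gb * 1000 / (h * 3600) }'
}

# Amount of data in TB covered at RATE MB/s over HOURS hours.
data_tb() {
  awk -v r="$1" -v h="$2" 'BEGIN { printf "%.2f\n", r * h * 3600 / 1000000 }'
}

rate_mb_s 600 100   # FTP backup: prints 1.67 (MB/s)
data_tb 4.5 150     # resilver estimate: prints 2.43 (TB)
```

So the 150 hour estimate corresponds to roughly 2.4 TB at 4.5 MB/s, which is in line with the 2.6 TB on the pool, and the FTP backup averaged under 2 MB/s.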
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Well, 6 disks is a suboptimal configuration for RAIDZ1 which does lead to slowdowns, but it still sounds absurdly low, so it's probably because of the failing disks.
 

amitkhas

Dabbler
Joined
Oct 28, 2011
Messages
49
I started the "replace disk" last night. It's about 11% complete.

This is a bit of a noobish question:

When I use the shell, I have the view settings to the maximum size. However, some of the text scrolls through after a command. There are no scroll bars. How do I view the full text (i.e. scroll up)?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I'm not sure you can, but you should be able to do so with an SSH client like PuTTY.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
You can also stop the output every page by adding | more to the end. Like this:

Code:
# smartctl -a /dev/ada0 | more


Or you can save the output to a file like this:

Code:
# smartctl -a /dev/ada0 > somefile.txt
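
The two can also be combined with tee, which writes the output to a file while still passing it along to the pager. A minimal sketch; the helper name `save_and_page` is made up for illustration, and it falls back to `more` when `PAGER` isn't set:

```shell
# Hypothetical helper: run a command, save its output (stdout and stderr)
# to the file given as the first argument, and page through it at the
# same time.
save_and_page() {
  out=$1
  shift
  "$@" 2>&1 | tee "$out" | ${PAGER:-more}
}

# Example (not run here):
#   save_and_page somefile.txt smartctl -a /dev/ada0
```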
 

amitkhas

Dabbler
Joined
Oct 28, 2011
Messages
49
The resilvering is taking an immense amount of time. At this point, I want to stop the resilvering and simply rebuild the pool. All data has been backed up.

How do I stop the resilvering?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Then stop the scrub (zpool scrub -s <poolname>) and destroy the pool. :P
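
Since `zpool scrub -s` applies to scrubs, it can help to check what kind of scan is actually running first. The sketch below reads `zpool status` output and reports it. It assumes the status output has a `scan:` line containing "resilver in progress" or "scrub in progress" (older ZFS versions label this line differently); the helper name `scan_state` and the pool name `tank` are placeholders.

```shell
# Hypothetical helper: reads `zpool status <poolname>` output on stdin
# and prints "resilver", "scrub", or "none".
# Assumption: the status output has a "scan:" line describing the
# current operation.
scan_state() {
  awk '/^[ \t]*scan:/ {
         if ($0 ~ /resilver in progress/)   s = "resilver"
         else if ($0 ~ /scrub in progress/) s = "scrub"
       }
       END { print (s == "" ? "none" : s) }'
}

# Example (not run here), with "tank" as a placeholder pool name:
#   zpool status tank | scan_state
```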
 

amitkhas

Dabbler
Joined
Oct 28, 2011
Messages
49
It wouldn't let me stop the scrub because it said that the resilvering was in progress.

I ended up shutting down the system from the GUI. Now it won't boot up because of ada4 (one of the failing disks).

Can I simply replace the failing disks and create a pool? Or will it give errors because 2 of the drives will be from the previous RAIDZ1 pool?
 