Issue with Badblocks running on FreeNAS?

Status
Not open for further replies.

SubnetMask

Contributor
Joined
Jul 27, 2017
Messages
129
I got eight used 2TB HGST SAS drives at the end of December and, per recommendations here, started running badblocks on them. Since FreeNAS is BSD based, I ran the command needed to let it run properly, and it's been running on all eight drives since 12/22 - just over three weeks now. Does that seem normal, or is something not right?

The other thing is that I ran badblocks from the physical console so I wouldn't have to mess with tmux and such over SSH. The problem is that at this point the console seems totally frozen - I can't switch between sessions, and the one it's on isn't updating. FreeNAS itself is still responsive, it's still serving up the volume to VMware, there is activity showing on all of the drives, and badblocks does seem to be running: one drive had some bad blocks, and the log file for that drive was updated on 1/5 with more bad blocks found. It also started logging more bad blocks when I pulled that drive to send back, so I had to kill the PID for that process.

Edit - I left out that I'm running badblocks using 'badblocks -b 512 -wsv -o "da11.bb" /dev/da11'. I'm not using a larger block size because, from what I've read, specifying a size other than the drive's native block size can result in bad blocks being missed.

Should I be concerned that it's taking so long, and that the console is frozen?
 
Last edited:

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Should I be concerned that it's taking so long, and that the console is frozen?
I'd say yes--neither of those sounds at all normal. When I've run badblocks on 6 TB disks, it completed within a week.
 

millst

Contributor
Joined
Feb 2, 2015
Messages
141
Yes. My 8TB drives took ~4 days.

Did you follow the guide closely? It mentions a kernel flag and how it shouldn't be run on a production system. Detaching from the terminal would have let you log in remotely and see what's going on (even with the frozen term).
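For example (just a sketch - the session name, window name, and device node are placeholders, and this assumes the tmux that ships with FreeNAS), something like this would let each run survive a dropped SSH session or a wedged console:

tmux new-session -d -s burnin
tmux new-window -t burnin -n da11 'badblocks -wsv -o /root/da11.bb /dev/da11'
tmux attach -t burnin

Detach with Ctrl+B then D, and the tests keep running in the background.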

-tm
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
The sysctl setting is a "please let me blow away my system drive" option and should not be used.

As far as why it is taking so long, 2TB drives probably use 4K blocks natively, so using a block size of 512 is going to cause some serious write amplification. If there are actually any bad blocks on the drives, the retries can add a lot of time.
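For reference (device name just carried over from the command quoted earlier), 'diskinfo -v /dev/da11' will show the sector size and stripe size FreeBSD reports for the drive, and if it really were a 4K drive, something along these lines would at least match the native block size:

diskinfo -v /dev/da11
badblocks -b 4096 -wsv -o da11.bb /dev/da11

That's a rough sketch only, not a recommendation for these particular drives.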

The "console not responding" part is not clear. If it's one of the consoles running a program, pressing Ctrl+T should show something.
 

SubnetMask

Contributor
Joined
Jul 27, 2017
Messages
129
I've since rebooted that FreeNAS box and moved four of the drives over to another machine that's running Ubuntu and is essentially dedicated to running badblocks. The one they were running on wasn't 'production', but it did have a VMware datastore on the other four drives in it.

As far as the guide, there isn't much to it (the smartctl tests don't seem to work right on these SAS drives), and yes, I did run 'sysctl kern.geom.debugflags=0x10' beforehand. If that's to allow raw disk I/O, how will badblocks run if you don't run that command?

As far as the console being frozen: you know how you can use Alt+F1, F2, F3, etc. to get to different console sessions? The screen was frozen and I was not able to switch between sessions. It was totally unresponsive.

As far as the drives and sector size, they are HUS723020ALS640 (7K3000 series) drives, which, according to the datasheet, come in sector sizes of 512, 520 or 528 bytes. These were formatted at 520, best I can tell because they came out of an EMC CLARiiON, but they have since been reformatted to 512 bytes. These drives don't have the 'AF' logo on them. I believe 4K sectors didn't show up until the 7K4000 drives, and even then they were optional - not every drive/model had 4K sectors.
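(For anyone finding this later: the 520-to-512 low-level reformat on drives like these is usually done with sg_format from sg3_utils on a Linux box, something like 'sg_format --format --size=512 /dev/sgX', where the sg device node is just a placeholder, and it takes several hours per drive.)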
 
Last edited:

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
I did run 'sysctl kern.geom.debugflags=0x10' beforehand. If that's to allow raw disk I/O, how will badblocks run if you don't run that command?

It really has nothing to do with raw I/O. Instead, it is a safety feature that prevents overwriting drives with active read/write mounts. In other words, drives that are actually being used. So setting that sysctl means "yes, make it possible to overwrite my actual, in-use data."
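If someone really does need that flag (say GEOM has the disk in use for some reason), a safer pattern is to set it only for the duration of the run and put it back afterwards - roughly, with the device name only as an example:

sysctl kern.geom.debugflags=0x10
badblocks -wsv -o da11.bb /dev/da11
sysctl kern.geom.debugflags=0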
 

SubnetMask

Contributor
Joined
Jul 27, 2017
Messages
129
It really has nothing to do with raw I/O. Instead, it is a safety feature that prevents overwriting drives with active read/write mounts. In other words, drives that are actually being used. So setting that sysctl means "yes, make it possible to overwrite my actual, in-use data."

Ah. So basically, if the drives are freshly inserted and not assigned to anything other than being a device listed in /dev/, badblocks won't have any issues doing what it needs to do? If so, that would be good info to add to the 'howto'.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
The drive does not even have to be fresh. It really just needs to not be in use, which essentially means "does not have a filesystem mounted", although there are some GEOM things that could also do it. When that happens, the GEOM system blocks writing to it, so it gives an error message.
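A quick sanity check before pointing badblocks at a disk is just to confirm nothing has it open - for example (device name is only an example):

mount | grep da11
gpart show da11
zpool status | grep da11

If all of those come back empty (or gpart reports no such geom), nothing should be holding the disk and that sysctl shouldn't be needed.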

I sent a PM to the author of that document asking about fixing the description of that sysctl, but have had no response. I should probably go add some text to it.
 