Bad blocks not being remapped

Status
Not open for further replies.

Revilo

Dabbler
Joined
Oct 15, 2013
Messages
25
Over the past few months I have noticed that after a volume scrub I would have a load of corrupt files that have to be deleted. After the third time I got a bit worried and investigated further. After looking through the logs and testing the drive on my Windows machine with HD Tune, I found a number of bad blocks (16*22MB). This was a read test, not a write test.

Normally I would have expected these blocks to be retired/remapped, with others taking their place, but this wasn't happening. (I also thought it was the SSD hardware's job to do this, but that doesn't seem to be the case. I'm not familiar with how it works, sorry!)

NOTE ** My build consists of single disk pools (I know, no redundancy, pointless even having ZFS, but I just don't have the money or the absolute necessity to have more than a few off-site backups of critical data). This makes bad sectors even worse for me. **

My question is, why were these blocks not being flagged by FreeNAS? Will it happen to another drive, or is it only this drive? Is the FreeNAS system causing this? I ask because after I replaced the drive and completed a full format with the Windows formatting utility, all the errors cleared up fine.

The drive in question is an SSD (CSSD-V60GB2) which held the ".system" dataset, syslog and multiple jails. (I did this so a mechanical drive didn't have to put up with the high volume of small writes from logs etc.) I also had snapshots taken every few days (these became corrupt too).

Specs:
System: Dell Optiplex 755 (standard Dell motherboard) using Intel ICH9 AHCI controller
CPU make/model: Intel Core 2 Duo CPU E8400
RAM make/model: 4 x 2GB Samsung DDR2 800MHz
Hard drives (all SATA, all single drive/separate data pools, all ZFS stripe):
- 2TB (ST2000DM001-1CH164)
- 320GB (WD3200BJKT)
- 500GB (ST3500312CS)
- 60GB (CSSD-V60GB2)
 
Joined
Oct 2, 2014
Messages
925
Can we get some system specs please, as per the forum rules? Motherboard make/model, CPU make/model, RAM make/model/ECC or non-ECC, hard drives make/model/quantity/RAIDZ level.
 
Joined
Oct 2, 2014
Messages
925
I see you updated your OP with system specs, so thanks for that. What does the SMART output tell you? Have you run any short or long SMART tests? Are the scrubs you're doing manual or scheduled?

Run SMART on each drive and post the results inside [ code ] data here [ /code ] tags (remove the spaces for them to work).

smartctl -a /dev/adaX (where X is a number, for example ada0 for the first drive). It may be daX instead of adaX.
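
With four drives in the box, it may be easier to generate the command for each one than to type them by hand. The following is only a small sketch: the helper name `smartctl_commands` and the device names `ada0` to `ada3` are assumptions for illustration, so substitute whatever names the system actually reports. It only prints the commands and does not run them.

```python
import shlex


def smartctl_commands(devices, prefix="/dev/"):
    """Build one `smartctl -a` command (as an argument list) per device name."""
    return [["smartctl", "-a", prefix + name] for name in devices]


# Hypothetical device names: they may be adaX or daX depending on the controller.
for cmd in smartctl_commands(["ada0", "ada1", "ada2", "ada3"]):
    # Print the command line so it can be pasted into a root shell.
    print(shlex.join(cmd))
```

Each printed line can then be run as root and its output pasted into the code tags.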
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
"after a volume scrub I would have a load of corrupt files that have to be deleted"
"why were these blocks not being flagged by FreeNAS?"
The scrubs found them for you. What were you expecting?
"My build consists of single disk pools (I know, no redundancy, pointless even having ZFS, but I just don't have the money or the absolute necessity to have more than a few off-site backups of critical data). This makes bad sectors even worse for me."
This seems paradoxical to me.
"CSSD-V60GB2"
This model seems to have a pretty bad reputation. Probably best to replace it.

Have you done thorough tests on your RAM? Your system doesn't support ECC, so bad RAM could be causing problems.
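
To illustrate the point about scrubs on a single-disk pool: a scrub compares each block against its stored checksum, so it can detect damage, but it can only repair a block if a second good copy exists somewhere. The toy model below is a sketch of that idea only. The function names are made up, SHA-256 is just a stand-in for whatever checksum the pool uses, and real ZFS is far more involved.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in checksum; not necessarily what a real pool uses."""
    return hashlib.sha256(data).hexdigest()


def scrub(copies, expected):
    """Check every stored copy of one block against its expected checksum.

    Returns (status, data) where status is "ok", "repaired" or "unrecoverable".
    """
    good = [c for c in copies if checksum(c) == expected]
    if len(good) == len(copies):
        return "ok", copies[0]
    if good:
        # At least one copy still matches, so the bad copies can be rewritten from it.
        return "repaired", good[0]
    # Damage is detected, but there is nothing to repair from.
    return "unrecoverable", None


original = b"jail config"
expected = checksum(original)
damaged = b"jail c0nfig"

print(scrub([damaged], expected))            # single-disk pool: one copy only
print(scrub([damaged, original], expected))  # mirrored pool: two copies
```

With one copy the scrub can only report the file as corrupt, which matches what was seen here. With a second copy it has something to repair from.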
 

Revilo

Dabbler
Joined
Oct 15, 2013
Messages
25
I decided to replicate the jails to another drive and pull the faulty drive from the system. I have done a full format and the 'faulty' drive's errors were cleared up.
@robert, good shout. I will do a memory check to make sure the memory isn't at fault. However, I would have expected to find these errors on other drives if that were the case, but it is only this single drive.
@Darren, the drive is no longer in the FreeNAS system, but I had periodic SMART tests on all drives and I also ran separate tests on the faulty drive. The SMART data told me it passed both long and short tests without any read errors. However, some manufacturer-specific SMART attributes look very high, and I can't find what those attributes mean anywhere. Anyway, this is what I get (I hope this image format is okay; also, ignore the other drives in the Corsair SMART data image, as it was taken on another PC):

smartData.png
HDTuneBadSectors.png
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Drives reassign LBAs as needed internally. The OS is not involved.
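
As a rough picture of what "internally" means, here is a toy model of a drive that keeps its own LBA-to-physical map and a pool of spare locations. Everything in it (the `ToyDrive` class, its methods, the sizes) is hypothetical and for illustration. It also assumes the remap happens when a bad location is next written rather than when it is read, which is common but firmware-specific. Under that assumption, a read-only test would keep reporting errors while a full format would appear to clear them.

```python
class ToyDrive:
    """Toy drive: the host sees LBAs, the drive decides where they really live."""

    def __init__(self, size, spares):
        # Initially every LBA maps to the physical location with the same number.
        self.lba_to_phys = {lba: lba for lba in range(size)}
        # Spare physical locations the host never addresses directly.
        self.spares = list(range(size, size + spares))
        self.bad_phys = set()
        self.data = {}

    def fail(self, lba):
        """Simulate the physical location behind an LBA going bad."""
        self.bad_phys.add(self.lba_to_phys[lba])

    def read(self, lba):
        phys = self.lba_to_phys[lba]
        if phys in self.bad_phys:
            # The old contents are gone; the drive can only report an error.
            raise IOError(f"unreadable LBA {lba}")
        return self.data.get(phys, b"")

    def write(self, lba, payload):
        phys = self.lba_to_phys[lba]
        if phys in self.bad_phys:
            if not self.spares:
                raise IOError("no spare locations left")
            # Remap silently: the host keeps using the same LBA.
            phys = self.spares.pop(0)
            self.lba_to_phys[lba] = phys
        self.data[phys] = payload


drive = ToyDrive(size=8, spares=2)
drive.write(3, b"old")
drive.fail(3)
try:
    drive.read(3)
except IOError as err:
    print("read test:", err)
drive.write(3, b"new")          # e.g. a full format rewriting the LBA
print("after rewrite:", drive.read(3), "now at physical", drive.lba_to_phys[3])
```

The host only ever sees LBA 3. Where it actually lives, and when it moved, is known only to the drive.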

Also, I highly doubt that any utility will be of any use in locating bad sectors (as opposed to determining if there are any bad sectors), especially on an SSD, where there is no externally-known mapping of LBAs to cells.

Instead of trying to patch the rusting, sinking ship, get a proper ship. That means proper redundancy on proper hardware.
 