Sector error every 30 minutes

harsh

Dabbler
Joined
Feb 6, 2024
Messages
32
I have a 2017-vintage Dell T330 server with four SSDs and four SAS HDDs.

Every 30 minutes (down to the second), I receive a sector error on one of the HDDs (/dev/sdg). It seems like that sector should have been swapped out if there were a real problem.

How do I address this problem?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
What does the error say, exactly? What's the output of smartctl -x /dev/sdg?
 

harsh

The error reported in the shell is:

2024 Feb 8 19:49:14 truenas Device: /dev/sdg [SAT], 1 Offline uncorrectable sectors
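For anyone who wants to check how regularly this message recurs, here is a minimal Python sketch that parses lines shaped like the one above and reports the gap between consecutive reports. The timestamp layout is assumed from that single sample, and `parse_line` and `intervals_seconds` are hypothetical helper names. A gap of exactly 1800 seconds would be consistent with smartd's default polling interval, which suggests the drive is simply being re-checked on a timer rather than failing anew each time.

```python
import re
from datetime import datetime

# Pattern modelled on the single log line quoted in this thread (an assumption,
# other smartd/syslog setups may format the timestamp differently).
LINE_RE = re.compile(
    r"^(?P<stamp>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) \S+ "
    r"Device: (?P<device>\S+) \[[^\]]+\], "
    r"(?P<count>\d+) Offline uncorrectable sectors"
)


def parse_line(line):
    """Return (timestamp, device, sector_count), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Collapse runs of spaces so single-digit days parse cleanly.
    stamp_text = " ".join(m.group("stamp").split())
    stamp = datetime.strptime(stamp_text, "%Y %b %d %H:%M:%S")
    return stamp, m.group("device"), int(m.group("count"))


def intervals_seconds(lines):
    """Seconds elapsed between consecutive matching log lines."""
    stamps = [parsed[0] for parsed in map(parse_line, lines) if parsed]
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
```

Feeding it the saved log lines and looking for a constant interval is enough to tell a polling artifact from a growing problem; a rising sector count between lines is the part to worry about.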

The output of smartctl is extensive so I've attached a text file to avoid it being indexed.
 

Attachments

  • sdg_error.txt
    18.1 KB · Views: 55

Ericloewe

So, the disk is definitely on its way out, with 20+ bad sectors. You should replace it without delay.
harsh said:
Every 30 minutes (down to the second), I receive a sector error on one of the HDDs (/dev/sdg). It seems like that sector should have been swapped out if there were a real problem.
The sector is uncorrectable, i.e. unreadable (after multiple attempts, probably), so it can't be remapped until it is written to again.

Also, the SMART test log is pretty barren. Make sure you set up regular SMART tests (especially long tests, at least once or twice a month) after you replace the disk.
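As a rough illustration of what a scheduled test boils down to, the sketch below builds the `smartctl` invocations for starting a self-test and for reading back the self-test log. The helper names are hypothetical and `/dev/sdg` is just the device from this thread. On TrueNAS the usual route is to schedule periodic tests from the web UI rather than scripting them, so treat this as an explanation of the underlying commands, not a recommended setup.

```python
import subprocess


def selftest_command(device, kind="long"):
    """Build the smartctl command that starts a short or long self-test."""
    if kind not in ("short", "long"):
        raise ValueError("kind must be 'short' or 'long'")
    return ["smartctl", "-t", kind, device]


def selftest_log_command(device):
    """Build the smartctl command that prints the drive's self-test log."""
    return ["smartctl", "-l", "selftest", device]


def run(command):
    """Run a command built above and return its output (needs root and smartctl installed)."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout
```

For example, `run(selftest_command("/dev/sdg"))` starts a long test in the background, and `run(selftest_log_command("/dev/sdg"))` shows the results once it has finished.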
 

harsh

I'm a little surprised that neither the hard drive nor the file system is trying to move the sector.

Thanks for your insight.
 

Ericloewe

From the sound of it, ZFS hasn't tried to read the block yet and thus has not yet encountered any issues.
 

harsh

Ericloewe said:
From the sound of it, ZFS hasn't tried to read the block yet and thus has not yet encountered any issues.
So the drive electronics aren't going to try to move the sector?

I can pretty easily cause any active sector to be read so I guess I'll do that to make the errors go away for now.
 

Ericloewe

There's nothing to move, since it can't read the sector. What would it write?
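To make that concrete, here is a toy Python model of the behaviour described in this thread. It is a deliberate simplification and not real firmware logic: `ToyDisk` and all of its fields are hypothetical, and a real drive may simply rewrite the sector in place if the write succeeds, whereas this model always uses a spare. The point it illustrates is that reading a damaged sector only keeps it flagged, while writing gives the drive fresh data to place somewhere else.

```python
class ToyDisk:
    """Toy model of pending-sector handling (illustrative only)."""

    def __init__(self, sectors, spares):
        self.data = {lba: b"ok" for lba in range(sectors)}
        self.unreadable = set()  # sectors whose contents can no longer be read
        self.pending = set()     # sectors flagged as pending / uncorrectable
        self.spares = spares     # spare sectors still available
        self.reallocated = 0     # how many sectors have been remapped

    def damage(self, lba):
        """Simulate media damage on one sector."""
        self.unreadable.add(lba)

    def read(self, lba):
        """Reads of a damaged sector fail and leave it flagged; nothing is moved."""
        if lba in self.unreadable:
            self.pending.add(lba)
            raise IOError(f"sector {lba} is unreadable")
        return self.data[lba]

    def write(self, lba, payload):
        """A write supplies new data, so a damaged sector can finally be remapped."""
        if lba in self.unreadable:
            if self.spares == 0:
                raise IOError("no spare sectors left")
            self.spares -= 1
            self.reallocated += 1
            self.unreadable.discard(lba)
            self.pending.discard(lba)
        self.data[lba] = payload
```

In this model, calling `read` on the damaged sector any number of times leaves `pending` unchanged and `reallocated` at zero; only `write` clears the flag.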
 