Error with drive in pool

Mitch2004

Dabbler
Joined
Dec 17, 2022
Messages
22
Hi All

I know this has probably been mentioned beforehand, and I have seen this issue a couple of times, but each scenario was slightly different or didn't quite match my specs, so I am posting to be sure before I damage anything more.

TrueNAS specs
OS: TrueNAS SCALE, Bluefin, running on two mirrored USB drives
Hardware
Dell 720XD, with two E5-2650 V2, 4 x 16GB Samsung PC3-14900R ECC RAM,
12 x 6TB SAS Dell 7.2K in a RaidZ3
2 x 1TB SSD Samsung 870 EVO in a Mirror

History and Issue
Last night I updated from Angelfish to Bluefin. While doing this, I checked all my SAS drives via the shell console using
Code:
# fdisk -l /dev/sd(Drive Letter)

All drives reported back as 512 bytes, as needed by Bluefin (output below):

Disk /dev/sdn: 5.46 TiB, 6001175126016 bytes, 11721045168 sectors
Disk model: MG04SCA60EE
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D9012A69-2086-4F9A-80B7-38EBB47551C5

Device Start End Sectors Size Type
/dev/sdn1 128 4194304 4194177 2G Linux swap
/dev/sdn2 4194432 11721045134 11716850703 5.5T Solaris /usr & Apple ZFS
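
For anyone wanting to repeat this check across every drive at once, here is a minimal sketch. The helper name sector_sizes is made up for illustration, and it assumes fdisk prints the "Sector size (logical/physical)" line in the same form as the output above.

```shell
# Hypothetical helper: reads `fdisk -l` output on stdin and prints
# "<logical> <physical>" sector sizes in bytes.
sector_sizes() {
    awk 'index($0, "Sector size (logical/physical):") == 1 { print $4, $7 }'
}

# Example use (needs root; adjust the glob to match your drives):
# for d in /dev/sd?; do
#     printf '%s: %s\n' "$d" "$(fdisk -l "$d" 2>/dev/null | sector_sizes)"
# done
```

A drive that cannot be opened at all will simply print an empty result in the loop, which is itself worth noting.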

This morning I powered TrueNAS back up (I had powered it down once it had finished because I was not able to leave it running, as I needed to change the UPS out) and I noticed that my RaidZ3 was degraded, with one drive showing up as name: 3869088571791395513, status UNAVAIL. When I use the Replace Disk feature I can see the original disk that was there, "sdb", but when I try to replace or force replace the drive I get the error below:

Error: [EFAULT] Unable to GPT format the disk "sdb":
Warning! Read error 5; strange behavior now likely!
Warning: Partition table header claims that the size of partition table entries is 0 bytes, but this program supports only 128-byte entries. Adjusting accordingly, but partition table may be garbage.
Warning! Read error 5; strange behavior now likely!
Warning: Partition table header claims that the size of partition table entries is 0 bytes, but this program supports only 128-byte entries. Adjusting accordingly, but partition table may be garbage.
Unable to save backup partition table! Perhaps the 'e' option on the experts' menu will resolve this problem.
Warning! An error was reported when writing the partition table! This error MIGHT be harmless, or the disk might be damaged! Checking it is advisable.

Please also note that drive sdb (the one showing the error above) has been working fine for the last 2 months and only threw that error after the Bluefin upgrade.

Is anyone able to point me in the right direction for getting this fixed?
This is my first time having a drive fail in TrueNAS.
 

Mitch2004

Dabbler
Joined
Dec 17, 2022
Messages
22
Sorry about the spelling, I just realized how bad it is.
 
Joined
Oct 22, 2019
Messages
3,641
Can you run even a short SMART selftest on the drive in question?

EDIT: You know what. Bluefin is looking like an alpha quality release.
 

Mitch2004

Dabbler
Joined
Dec 17, 2022
Messages
22
Can you run even a short SMART selftest on the drive in question?
Just tried to, got the following error from within TrueNAS

Manual S.M.A.R.T. Test

sdb

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.79+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

Short offline self test failed [medium or hardware error (serious)]

Output when doing # fdisk -l /dev/sdb in shell

fdisk: cannot open /dev/sdb: Input/output error
 
Joined
Oct 22, 2019
Messages
3,641
Short offline self test failed [medium or hardware error (serious)]

fdisk: cannot open /dev/sdb: Input/output error

That does not look good. :oops:

This happened right after you upgraded to Bluefin and rebooted?

What do the error and selftest logs reveal?
Code:
smartctl -l error /dev/sdb

smartctl -l selftest /dev/sdb
 

Mitch2004

Dabbler
Joined
Dec 17, 2022
Messages
22
What do the error and selftest logs reveal?

Outputs are below, in the same order as you listed above.

root@truenas[~]# smartctl -l error /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.79+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0    4759263.049           0
write:         0       23        23        23         43     160112.333           0
verify:        0        0         0         0          0    1332293.750           0

Non-medium error count: 5507

root@truenas[~]# smartctl -l selftest /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.79+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   80     43957                 - [-   -    -]
# 2  Background short  Completed                   80     43954                 - [-   -    -]
# 3  Background short  Completed                   80     43952                 - [-   -    -]
# 4  Background short  Completed                   80     43936                 - [-   -    -]
# 5  Background short  Completed                   80     43895                 - [-   -    -]
# 6  Background short  Completed                   80     43866                 - [-   -    -]
# 7  Background short  Completed                   80     43858                 - [-   -    -]
# 8  Background short  Completed                   80     43842                 - [-   -    -]
# 9  Background short  Completed                   80         5                 - [-   -    -]
#10  Reserved(7)       Completed                   64         5                 - [-   -    -]

Long (extended) Self-test duration: 37873 seconds [631.2 minutes]
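
If it helps to compare drives, the non-medium error count can be pulled out of the error log with a small helper. This is only a sketch: non_medium_errors is a hypothetical name, and it assumes smartctl prints the "Non-medium error count:" line in the form shown above.

```shell
# Hypothetical helper: reads `smartctl -l error` output on stdin and
# prints the number that follows "Non-medium error count:".
non_medium_errors() {
    awk -F': *' '/^Non-medium error count:/ { print $2 }'
}

# Example use (needs root; adjust the glob to match your drives):
# for d in /dev/sd?; do
#     printf '%s: %s\n' "$d" "$(smartctl -l error "$d" | non_medium_errors)"
# done
```

A count that is much higher on one drive than on the others may point at that drive's cabling or slot rather than its media, though the number on its own is not conclusive.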
 
Joined
Oct 22, 2019
Messages
3,641
Okay, what is going on with Bluefin and these drives?

It's really making me nervous, and I'm glad I'm sticking with Core.

Might it be interrelated with these issues / threads?



Basically, installing or upgrading to Bluefin opens up new issues with certain types of drives that might not have been present with Angelfish.

---

Non-medium error count: 5507

I take it you already double-checked the cables and port connections?
 

Mitch2004

Dabbler
Joined
Dec 17, 2022
Messages
22
I take it you already double-checked the cables and port connections?
Yes, I have checked all cables and connectors, and they seem to be working correctly.

I tried replacing the drive again today but still had no luck. I rebooted TrueNAS again and now it is throwing a DIF error (see below for the full error) on all mechanical drives.

I have also noticed that the label of the drive I am having issues with has changed from sdb to sda. From what I can see it is the same physical drive and only the label has changed.

I checked the drives using # fdisk -l /dev/sd(drive letter) and, if I am not mistaken, it is reporting them as having 512-byte sectors. I am now wondering if Bluefin is just really bugged.


DIF error after rebooting

Disk(s): sdb, sdi, sdm, sdl, sdn, sdc, sdd, sde, sdf, sdg, sdj are formatted with Data Integrity Feature (DIF) which is unsupported.

2022-12-19 16:45:08 (Australia/Adelaide)

-----------------------------------------------
Output for # fdisk -l /dev/sd(drive letter)

root@truenas[~]# fdisk -l /dev/sdi
Disk /dev/sdi: 5.46 TiB, 6001175126016 bytes, 11721045168 sectors
Disk model: MG04SCA60EE
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 217350DA-CB37-4E41-A998-486235AE1D67

Device Start End Sectors Size Type
/dev/sdi1 128 4194304 4194177 2G Linux swap
/dev/sdi2 4194432 11721045134 11716850703 5.5T Solaris /usr & Apple ZFS
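
As far as I know, the 512-byte sector size reported by fdisk does not by itself show whether a drive was formatted with protection information (DIF). One way to check, sketched below, is to look at the prot_en field printed by sg_readcap --long from the sg3_utils package. Assumptions: sg3_utils is installed, its output contains a "Protection: prot_en=..." line, and prot_en_value is a hypothetical helper name.

```shell
# Hypothetical helper: reads `sg_readcap --long` output on stdin and
# prints the prot_en value (1 means protection information is enabled).
prot_en_value() {
    sed -n 's/.*prot_en=\([01]\).*/\1/p'
}

# Example use (needs root; adjust the glob to match your drives):
# for d in /dev/sd?; do
#     printf '%s: prot_en=%s\n' "$d" "$(sg_readcap --long "$d" | prot_en_value)"
# done
```

This only reads the drive's reported capacity data and does not change anything on the disk.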
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
Okay, what is going on with Bluefin and these drives? It's really making me nervous, and I'm glad I'm sticking with Core.
I explained the cause of these errors in my troubleshooting thread. Using Core does not address the root issue of your disks being badly formatted; once the kernel is updated, you will get the same warnings in Core as well. Your solution is a band-aid for a real problem.
Basically, installing or upgrading to Bluefin opens up new issues with certain types of drives that might not have been present with Angelfish.
Which is a very good thing. Most people never check drives purchased on eBay etc. prior to installing them in a Linux system, and then wonder why things go haywire.
 