Alert for a non-existent physical drive?

DanSputnikk

Cadet
Joined
Mar 21, 2022
Messages
5
In a bit of a panic while I wait for the last VM to move off ZFS onto local storage before I do anything else.

Single ZFS pool serving VMware:
root@truenas[~]# zpool list -v
NAME                                             SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG    CAP  DEDUP    HEALTH  ALTROOT
Vmware Pool 1                                   2.97T   699G  2.29T        -         -    1%    22%  1.00x  DEGRADED  /mnt
  raidz1                                        2.97T   699G  2.29T        -         -    1%  23.0%      -  DEGRADED
    gptid/83b1983c-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
    gptid/83c465a7-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
    gptid/8488ea3c-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
    gptid/84c43588-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
    gptid/84f6b8f8-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
    gptid/84fd53e5-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
    gptid/850ae122-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
    gptid/851cea4a-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
    gptid/8521cc16-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -   FAULTED
    gptid/853294e8-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
    gptid/85a1b538-a394-11ec-ac7a-000af7015b24      -      -      -        -         -     -      -      -    ONLINE
cache                                               -      -      -        -         -     -      -      -         -
  gptid/85890e42-a394-11ec-ac7a-000af7015b24     279G   270G  9.87G        -         -    0%  96.5%      -    ONLINE
boot-pool                                        262G  1.20G   261G        -         -    0%     0%  1.00x    ONLINE  -
  mirror                                         262G  1.20G   261G        -         -    0%  0.45%      -    ONLINE
    da0p2                                           -      -      -        -         -     -      -      -    ONLINE
    da1p2                                           -      -      -        -         -     -      -      -    ONLINE
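
For anyone following along, here is a rough sketch of how the non-ONLINE member can be picked out of that listing and tied back to a device name. The helper names `bad_members` and `dev_for_label` are made up for illustration. The first helper assumes the `zpool list -v` layout above, where the state is the last column; the second assumes the usual three-column `glabel status` layout (Name, Status, Components) on a FreeBSD-based TrueNAS.

```shell
#!/bin/sh
# Hypothetical helpers, not part of TrueNAS.

# Print every gptid-labelled member whose last column (HEALTH) is not ONLINE.
# Reads `zpool list -v` output on stdin.
bad_members() {
    awk '$1 ~ /^gptid\// && $NF != "ONLINE" { print $1, $NF }'
}

# Print the provider (a partition name of the form daXpY) behind a label.
# Reads `glabel status` output on stdin; assumes columns Name, Status, Components.
dev_for_label() {
    awk -v label="$1" '$1 == label { print $3 }'
}

# Typical use on the NAS itself (not run here):
#   zpool list -v | bad_members
#   glabel status | dev_for_label gptid/8521cc16-a394-11ec-ac7a-000af7015b24
```

Both helpers only read text from stdin, so they can be tried against saved output without touching the pool.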

I got an alert that da4 is going bad. Terminal agrees with this:

1647905126887.png


I started panicking and moving everything off ZFS. Suddenly, the UI stopped working. nginx was fine; I restarted middlewared, but it hung.

iSCSI kept working; I can see VMware chipping away and the new datastore steadily filling up.

Then, 30 minutes later (with the VMware vMotion still in progress), the UI came back and threw another alert, this time for a serial number that simply doesn't exist:

1647905467751.png


This is my serial number list (produced by the script found here: https://www.truenas.com/community/t...y-disk-drives-device-name-serial-gptid.60497/ ):

+========+==========================+======================+============================================+
| Device | DISK DESCRIPTION         | SERIAL NUMBER        | GPTID                                      |
+========+==========================+======================+============================================+
| da0    | HITACHI HUC109030CSS600  | KLG866KF             | gptid/e7b0a6d3-9a24-11ec-a169-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da1    | HITACHI HUC109030CSS600  | KLGAH0GF             | gptid/e7ec358e-9a24-11ec-a169-000af7015b24 |
|        |                          |                      | gptid/83c465a7-a394-11ec-ac7a-000af7015b24 |
|        |                          |                      | gptid/850ae122-a394-11ec-ac7a-000af7015b24 |
|        |                          |                      | gptid/851cea4a-a394-11ec-ac7a-000af7015b24 |
|        |                          |                      | gptid/85890e42-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da3    | SEAGATE ST3300657SS      | 6SJ4AN5G             | gptid/83b1983c-a394-11ec-ac7a-000af7015b24 |
|        |                          |                      | gptid/838f8ac1-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da12   | SEAGATE ST3300657SS      | 6SJ83WZJ0000N41226HE | gptid/83c465a7-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da5    | SEAGATE ST300MP0026      | WAE29VEN             | gptid/8488ea3c-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da6    | SEAGATE ST300MP0026      | WAE29SGE             | gptid/84c43588-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da8    | HITACHI HUS156030VLS600  | JXVH6YXJ             | gptid/84f6b8f8-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da7    | SEAGATE ST3300657SS      | 6SJ5S4ST             | gptid/84fd53e5-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da10   | SEAGATE ST3300657SS      | 6SJ5SZ5S             | gptid/850ae122-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da11   | SEAGATE ST3300657SS      | 6SJ4G2BZ             | gptid/851cea4a-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da4    | SEAGATE ST3300657SS      | 6SJ5SZL3             | gptid/8521cc16-a394-11ec-ac7a-000af7015b24 |
|        |                          |                      | gptid/84bfd652-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da9    | SEAGATE ST3300657SS      | 6SJ8WNQJ             | gptid/853294e8-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da13   | SEAGATE ST3300555SS      | 3LM4AEJC             | gptid/85890e42-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+
| da2    | SEAGATE ST3300657SS      | 6SJ4JVLK             | gptid/85a1b538-a394-11ec-ac7a-000af7015b24 |
|        |                          |                      | gptid/857bc56c-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+----------------------+--------------------------------------------+

3SL13H0L is nowhere to be found. Granted, these drives were pillaged from an old server rack and a few disks have gone bad along the way, but in almost all cases the zpool was wiped and recreated.

So where is TrueNAS getting this serial number from?

And any ideas why middlewared would falter just because a single disk failed?

(This is a Dell R515 with a PERC ... H200, I think [the one that doesn't support JBOD or pass-through], flashed to IT mode.)
 

DanSputnikk

Cadet
Joined
Mar 21, 2022
Messages
5
Another alert showing the same thing:

1647909886725.png


And since I have a GUI again, here it is for reference: 3SL13H0L is not a real drive.

1647910094120.png
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Since dmesg is showing da4 as the disk with problems, I would expect you can trust that.

Run smartctl -a /dev/da4 and see for yourself (and also confirm the serial number there).

You may find the "serial number" referenced is actually another identifier.
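
If it helps, here is a minimal sketch for doing that check across every disk at once, so the string from the alert can be compared against both the serial number and the Logical Unit id. `smart_ids` and `list_disk_ids` are names made up for this example; the parsing assumes the SAS-style field labels that `smartctl -i` prints, and `sysctl -n kern.disks` is the FreeBSD way of listing disk device names.

```shell
#!/bin/sh
# Hypothetical helpers, not part of TrueNAS or smartmontools.

# Pull "Serial number" and "Logical Unit id" out of `smartctl -i` output
# read on stdin. Prints "<serial> <logical-unit-id>", with "-" for a missing field.
smart_ids() {
    awk -F': *' '
        /^Serial [Nn]umber:/ { serial = $2 }
        /^Logical Unit id:/  { luid = $2 }
        END { printf "%s %s\n", (serial == "" ? "-" : serial), (luid == "" ? "-" : luid) }
    '
}

# Print "<device> <serial> <logical-unit-id>" for every disk the kernel knows about.
# Only meaningful on FreeBSD, where kern.disks exists.
list_disk_ids() {
    for d in $(sysctl -n kern.disks); do
        printf '%s ' "$d"
        smartctl -i "/dev/$d" | smart_ids
    done
}

# Typical use on the NAS itself (not run here):
#   list_disk_ids | grep -i 3SL13H0L
```

If the grep comes back empty, the string in the alert is not a serial number or Logical Unit id that any attached disk is currently reporting.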
 

DanSputnikk

Cadet
Joined
Mar 21, 2022
Messages
5
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p12 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST3300657SS
Revision:             ES66
Compliance:           SPC-3
User Capacity:        300,000,000,000 bytes [300 GB]
Logical block size:   512 bytes
Rotation Rate:        15000 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c500595fdcb7
Serial number:        6SJ5SZL3
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Tue Mar 22 08:22:45 2022 GMT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

Yeah, I trust the console. It's just weird that TrueNAS is reporting some other serial number despite all the evidence against it.
 