DanSputnikk
In a bit of a panic while I wait for the last VM to move off ZFS onto local storage before I do anything else.
Single ZFS pool serving VMware:
root@truenas[~]# zpool list -v
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
Vmware Pool 1 2.97T 699G 2.29T - - 1% 22% 1.00x DEGRADED /mnt
raidz1 2.97T 699G 2.29T - - 1% 23.0% - DEGRADED
gptid/83b1983c-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
gptid/83c465a7-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
gptid/8488ea3c-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
gptid/84c43588-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
gptid/84f6b8f8-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
gptid/84fd53e5-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
gptid/850ae122-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
gptid/851cea4a-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
gptid/8521cc16-a394-11ec-ac7a-000af7015b24 - - - - - - - - FAULTED
gptid/853294e8-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
gptid/85a1b538-a394-11ec-ac7a-000af7015b24 - - - - - - - - ONLINE
cache - - - - - - - - -
gptid/85890e42-a394-11ec-ac7a-000af7015b24 279G 270G 9.87G - - 0% 96.5% - ONLINE
boot-pool 262G 1.20G 261G - - 0% 0% 1.00x ONLINE -
mirror 262G 1.20G 261G - - 0% 0.45% - ONLINE
da0p2 - - - - - - - - ONLINE
da1p2 - - - - - - - - ONLINE
I got an alert that da4 is going bad. The terminal agrees:
I started panicking and moving everything off ZFS. Suddenly, the UI stopped working. nginx was fine; I restarted middlewared, but it hung.
iSCSI kept working: I can see VMware chipping away, and the new datastore is definitely being filled.
Then, 30 minutes later (VMware vMotion still in progress), the UI came back and threw another alert, this time for a serial number that simply doesn't exist:
This is my SN list (produced by script found here: https://www.truenas.com/community/t...y-disk-drives-device-name-serial-gptid.60497/ ):
+========+==========================+==================+============================================+
| Device | DISK DESCRIPTION | SERIAL NUMBER | GPTID |
+========+==========================+==================+============================================+
| da0 | HITACHI HUC109030CSS600 | KLG866KF | gptid/e7b0a6d3-9a24-11ec-a169-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da1 | HITACHI HUC109030CSS600 | KLGAH0GF | gptid/e7ec358e-9a24-11ec-a169-000af7015b24
gptid/83c465a7-a394-11ec-ac7a-000af7015b24
gptid/850ae122-a394-11ec-ac7a-000af7015b24
gptid/851cea4a-a394-11ec-ac7a-000af7015b24
gptid/85890e42-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da3 | SEAGATE ST3300657SS | 6SJ4AN5G | gptid/83b1983c-a394-11ec-ac7a-000af7015b24
gptid/838f8ac1-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da12 | SEAGATE ST3300657SS | 6SJ83WZJ0000N41226HE | gptid/83c465a7-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da5 | SEAGATE ST300MP0026 | WAE29VEN | gptid/8488ea3c-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da6 | SEAGATE ST300MP0026 | WAE29SGE | gptid/84c43588-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da8 | HITACHI HUS156030VLS600 | JXVH6YXJ | gptid/84f6b8f8-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da7 | SEAGATE ST3300657SS | 6SJ5S4ST | gptid/84fd53e5-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da10 | SEAGATE ST3300657SS | 6SJ5SZ5S | gptid/850ae122-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da11 | SEAGATE ST3300657SS | 6SJ4G2BZ | gptid/851cea4a-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da4 | SEAGATE ST3300657SS | 6SJ5SZL3 | gptid/8521cc16-a394-11ec-ac7a-000af7015b24
gptid/84bfd652-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da9 | SEAGATE ST3300657SS | 6SJ8WNQJ | gptid/853294e8-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da13 | SEAGATE ST3300555SS | 3LM4AEJC | gptid/85890e42-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da2 | SEAGATE ST3300657SS | 6SJ4JVLK | gptid/85a1b538-a394-11ec-ac7a-000af7015b24
gptid/857bc56c-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da4 | SEAGATE ST3300657SS | 6SJ5SZL3 | gptid/8521cc16-a394-11ec-ac7a-000af7015b24
gptid/84bfd652-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da3 | SEAGATE ST3300657SS | 6SJ4AN5G | gptid/83b1983c-a394-11ec-ac7a-000af7015b24
gptid/838f8ac1-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
| da2 | SEAGATE ST3300657SS | 6SJ4JVLK | gptid/85a1b538-a394-11ec-ac7a-000af7015b24
gptid/857bc56c-a394-11ec-ac7a-000af7015b24 |
+--------+--------------------------+------------------+--------------------------------------------+
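One oddity in the list above: the da1 row shows five gptids, and four of them are the ones listed for da10, da11, da12 and da13, so the script is probably matching device names by prefix rather than exactly. Below is a minimal sketch of an exact-match lookup. It assumes input in the three-column layout that `glabel status` prints (label, status, component such as `da1p2`); the function name `gptids_for` is made up for illustration and is not part of the linked script.

```shell
# gptids_for DEV
# Reads `glabel status`-style lines on stdin (label, status, component)
# and prints the gptid labels whose component is a partition of exactly
# DEV, so that "da1" does not also match da10, da11, da12 or da13.
gptids_for() {
    awk -v dev="$1" '
        # keep gptid labels only, and anchor the device name on both sides
        $1 ~ /^gptid\// && $3 ~ ("^" dev "p[0-9]+$") { print $1 }
    '
}

# Assumed usage on the TrueNAS box itself:
#   glabel status | gptids_for da1
```

Anchoring the pattern with `^` and `p[0-9]+$` is the whole point here: a plain `grep da1` also hits every device whose name merely starts with `da1`.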
3SL13H0L is nowhere to be found. Granted, these are pillaged drives from an old server rack, and a few disks have gone bad along the way, but in almost all cases the zpool was wiped and recreated.
Where is TrueNAS getting this serial number from?
And any ideas why middlewared would falter just because a single disk failed?
(This is a Dell R515 with a PERC ... H200, I think? [the one that doesn't support JBOD or pass-through] flashed to IT mode)