TrueNAS reporting a bad drive that doesn't exist

Joined
Mar 5, 2022
Messages
224
I got this email:
* Pool pool state is DEGRADED: One or more devices are faulted in response to
persistent errors. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
The following devices are not healthy:

* Disk ATA WDC WD20EFZX-68A WD-WX*****47Y17 is FAULTED
When I check my disks there is no *47Y17:
Name  Serial                Disk Size   Pool       Model
da10  NF9351W1*****0S30B    1.86 TiB    plex       ATA Lexar SSD NS100
da11  NF9351W1*****0S30B    1.86 TiB    plex       ATA Lexar SSD NS100
da13  NHH737W1*****0S30B    1.86 TiB    plex       ATA Lexar SSD NS100
da12  NHH737W1*****0S30B    1.86 TiB    plex       ATA Lexar SSD NS100
da4   PNY2144211*****02587  111.79 GiB  boot-pool  ATA PNY CS900 120GB
da5   PNY2144211*****02589  111.79 GiB  boot-pool  ATA PNY CS900 120GB
da9   WD-WX*****P9SZ6       1.82 TiB    pool       ATA WDC WD20EFZX-68A
da1   WD-WX*****P9VC7       1.82 TiB    pool       ATA WDC WD20EFZX-68A
da3   WD-WX*****2TL6K       1.82 TiB    pool       ATA WDC WD20EFZX-68A
da8   WD-WX*****361E7       1.82 TiB    pool       ATA WDC WD20EFZX-68A
da0   WD-WX*****7A0YV       1.82 TiB    pool       ATA WDC WD20EFZX-68A
da2   WD-WX*****32NL9       1.82 TiB    pool       ATA WDC WD20EFZX-68A
da7   WD-WX*****L9CVT       1.82 TiB    pool       ATA WDC WD20EFZX-68A
da6   WD-WX*****LJUKF       1.82 TiB    pool       ATA WDC WD20EFZX-68A

What am I missing?
 
Joined
Mar 5, 2022
Messages
224
When I checked the pool status I see this:
Name Read Write Checksum Status
/mnt/pool   0     0      0         ONLINE
  RAIDZ2    0     0      0         ONLINE
    da8     0     0      0         ONLINE
    da2     0     0      0         ONLINE
    da0     0     0      0         ONLINE
    da1     0     0      0         ONLINE
    da3     0     0      1         ONLINE
    da6     0     0      0         ONLINE
    da9     0     0      0         ONLINE
    da7     0     0      0         ONLINE

So da3 has a checksum error. BUT da3 is WD-WX*****2TL6K, *NOT* WD-WX*****47Y17.
 
Joined
Mar 5, 2022
Messages
224
I could scrub the pool, mark the drive as being OK, or replace WD-WX*****2TL6K and hope for the best...
 
Joined
Mar 5, 2022
Messages
224
I'm afraid to scrub until I get some confirmation from someone who has an inkling about this issue. Any suggestions would be most appreciated.
 
Joined
Mar 5, 2022
Messages
224
I'm still hoping that someone can help me with this.
 

jlpellet

Patron
Joined
Mar 21, 2012
Messages
287
I'm not sure this will help, but first I'd go to GUI>Storage>Disks to see if any disks show as faulted. I'm not sure a faulted disk shows up in the pool status shell command. Also, you could try physically removing the power/data cables from 47Y17 (with the system powered down) and see if the reporting changes on boot. Your system is more complicated than anything I have experience with, so I can only hope this helps. Good luck.
John
 
Joined
Oct 22, 2019
Messages
3,641
When I checked the pool status I see this:
Sadly, the GUI does not present the actual device paths that ZFS is using under the hood. (Not sure why iXsystems made this decision. It only causes confusion and presents inaccurate information.)

You'll need to invoke "zpool status" in an SSH session:
Code:
zpool status -v pool
 
Joined
Mar 5, 2022
Messages
224
@jlpellet thank you very much for getting back to me. I have never been able to find the status of drives at GUI>Storage>Disks. I usually run smart_report.sh from the FreeNAS-scripts-master library.

@winnielinnie thank you for the suggestion to run sudo zpool status -v pool, which produced the following (again showing the drive with the checksum error):
Code:
        NAME                                            STATE     READ WRITE CKSUM
        pool                                            ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/b84ae8a3-0e68-11ed-959c-14dae9124c74  ONLINE       0     0     0
            gptid/0b6978e4-3245-11ee-aa37-14dae9124c74  ONLINE       0     0     0
            gptid/fd3b142d-b916-11ec-85a6-14dae9124c74  ONLINE       0     0     0
            gptid/619b459e-a107-11ee-a9a5-14dae9124c74  ONLINE       0     0     0
            gptid/02329014-a5a6-11ee-a1ae-14dae9124c74  ONLINE       0     0     1
            gptid/3e962c04-0461-11ed-9bd2-14dae9124c74  ONLINE       0     0     0
            gptid/e4935977-a7f7-11ee-a1ae-14dae9124c74  ONLINE       0     0     0
            gptid/24a59418-0616-11ed-af63-14dae9124c74  ONLINE       0     0     0


Then, if I run glabel status:
Code:
                                      Name  Status  Components
gptid/4e05a24c-cb1b-11ec-978e-14dae9124c74     N/A  da5p1
gptid/30ffa193-cb1c-11ec-978e-14dae9124c74     N/A  da4p1
gptid/fd3b142d-b916-11ec-85a6-14dae9124c74     N/A  da0p2
gptid/b84ae8a3-0e68-11ed-959c-14dae9124c74     N/A  da8p2
gptid/3e962c04-0461-11ed-9bd2-14dae9124c74     N/A  da6p2
gptid/e4935977-a7f7-11ee-a1ae-14dae9124c74     N/A  da9p2
gptid/24a59418-0616-11ed-af63-14dae9124c74     N/A  da7p2
gptid/0b6978e4-3245-11ee-aa37-14dae9124c74     N/A  da2p2
gptid/02329014-a5a6-11ee-a1ae-14dae9124c74     N/A  da3p2
gptid/619b459e-a107-11ee-a9a5-14dae9124c74     N/A  da1p2
gptid/eefe8dd3-83bd-11ee-808c-14dae9124c74     N/A  da10p2
gptid/ffd605ca-835e-11ee-808c-14dae9124c74     N/A  da11p2
gptid/1edc272a-80da-11ee-808c-14dae9124c74     N/A  da12p2
gptid/bff1f9fb-80ea-11ee-808c-14dae9124c74     N/A  da13p2
gptid/3108ae9b-cb1c-11ec-978e-14dae9124c74     N/A  da4p3
gptid/4e0dfb80-cb1b-11ec-978e-14dae9124c74     N/A  da5p3


This means that da3 is the offending drive. BUT(!) da3 is WD-WX*****2TL6K (NOT WD-WX*****47Y17).
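
For anyone repeating this lookup, here is a rough sketch of how the two listings could be joined automatically instead of matching gptids by eye. It assumes the output of zpool status and glabel status has been saved to two text files; the function name map_pool_disks and the file paths in the usage note are made up for illustration, and the column positions are taken from the output shown in this thread.

```shell
#!/bin/sh
# map_pool_disks ZPOOL_STATUS_FILE GLABEL_STATUS_FILE   (hypothetical helper)
# Prints "device cksum_errors gptid" for every gptid vdev in the pool listing.
map_pool_disks() {
    awk '
        # First file read (glabel status): remember which partition backs each label.
        NR == FNR { if ($1 ~ /^gptid\//) part[$1] = $3; next }
        # Second file read (zpool status): vdev rows are NAME STATE READ WRITE CKSUM.
        $1 ~ /^gptid\// {
            dev = ($1 in part) ? part[$1] : "unknown"
            sub(/p[0-9]+$/, "", dev)    # strip the partition suffix, e.g. da3p2 -> da3
            print dev, $5, $1
        }
    ' "$2" "$1"
}
```

On the NAS itself this might be used as follows (the /tmp paths are just examples): save the listings with zpool status -v pool > /tmp/zpool.txt and glabel status > /tmp/glabel.txt, then run map_pool_disks /tmp/zpool.txt /tmp/glabel.txt. The serial of whichever device shows a non-zero count can then be read directly with smartctl -i /dev/da3.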

I guess I should just replace WD-WX*****2TL6K and assume that there was a hiccup in the original email.
 
Joined
Oct 22, 2019
Messages
3,641
I guess I should just replace WX*****2TL6K and assume that there was a hiccup in the original email
It is odd that it produced an alert for a serial number that does not even exist in your system.

I would at least run a short SMART self-test (and perhaps later an extended SMART self-test) on the drive in question.

A scrub won't hurt either. It may in fact pass and then you can safely clear the single error from the pool's status log.
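
A minimal sketch of that sequence, assuming the suspect disk is da3 and the pool is named pool as in this thread. The helper names run, check_drive and clear_errors are invented here; by default the script only prints the commands so they can be reviewed first, and setting RUN=1 (as root on the NAS) would actually execute them.

```shell
#!/bin/sh
# Print each command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# check_drive DEVICE POOL   (hypothetical helper)
check_drive() {
    dev="$1"; pool="$2"
    run smartctl -t short "/dev/$dev"       # start a short SMART self-test
    run smartctl -l selftest "/dev/$dev"    # self-test log; read it after the test finishes
    run zpool scrub "$pool"                 # re-read and verify every block in the pool
    run zpool status -v "$pool"             # scrub progress and per-device error counters
}

# clear_errors POOL   (hypothetical helper), only after a clean scrub
clear_errors() {
    run zpool clear "$1"                    # reset the READ/WRITE/CKSUM counters
}
```

For example, check_drive da3 pool lists the four commands in order, and clear_errors pool shows the final step. An extended test would use smartctl -t long in place of -t short.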
 