'Disk removed by administrator' 2 minutes later alert is cleared

bmf614z

Dabbler
Joined
Aug 23, 2019
Messages
10
Hello,

I noticed that since I installed the latest stable update (12.0) I have been getting emails like this:

New alerts:
* Pool Pool state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
The following devices are not healthy:
Disk ATA ST4000NM0033-9ZM Z1Z6EDSM is REMOVED

Current alerts:
* Pool Pool state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
The following devices are not healthy:
Disk ATA ST4000NM0033-9ZM Z1Z6EDSM is REMOVED

then 2 minutes later it says this:

The following alert has been cleared:

* Pool Pool state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
The following devices are not healthy:
Disk ATA ST4000NM0033-9ZM Z1Z6EDSM is REMOVED

It concerns me because it says it was removed by the administrator, and not that it failed or whatever. Is this a security thing where someone is offlining my disks remotely?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
That's the same disk each time... time to run smartctl -a /dev/daX on it and/or check dmesg to see what the CAM messages are telling you.
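
A rough sketch of those checks from a shell session; daX is a placeholder for whichever device the alert named, not a real device node:

Code:
# List pool members and the devices the kernel currently sees
zpool status
camcontrol devlist

# Full SMART report for the suspect disk (swap daX for your device)
smartctl -a /dev/daX

# Recent kernel/CAM messages mentioning that device
dmesg | grep daX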

I suspect something is on its way to death.

It concerns me because it says it was removed by the administrator, and not that it failed or whatever. Is this a security thing where someone is offlining my disks remotely?
Not at all... the system itself (the middleware) is doing that as root, so it seems like the "administrator" did it.
 

Pabs

Explorer
Joined
Jan 18, 2017
Messages
52
I have been experiencing the same issue. Were you able to solve it? If so, how?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
How are your disks connected?

What hardware are you using?
 

Pabs

Explorer
Joined
Jan 18, 2017
Messages
52
How are your disks connected?

What hardware are you using?
Connected via an LSI 9211-8i 6G SAS HBA, FW P20, IT mode.

WD 16TB enterprise drives and a SuperMicro motherboard; would you need specifics for those?

Thx
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
would you need specifics for those?
Forum rules say yes, but not really.
What would be interesting to know is airflow and temperature... that sounds like an overheating HBA to me.
 

Pabs

Explorer
Joined
Jan 18, 2017
Messages
52
Forum rules say yes, but not really.
What would be interesting to know is airflow and temperature... that sounds like an overheating HBA to me.
But wouldn't the alerts within TrueNAS keep a record of this issue?
Thx
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
But wouldn't the alerts within TrueNAS keep a record of this issue?
Alerts aren't a record-keeping system.

You may find some clues in /var/log/messages (or one of its bzipped archives).
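
Something along these lines should pull the relevant entries out of both the live log and the rotated archives (the search pattern and the .0.bz2 archive suffix are just examples):

Code:
# Disk detach / CAM error messages in the current log
grep -E 'CAM|detached|REMOVED' /var/log/messages

# Rotated copies are bzip2-compressed; bzcat feeds them to grep
bzcat /var/log/messages.0.bz2 | grep -E 'CAM|detached|REMOVED'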
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
Yesterday, I experienced the same symptom (for hardware details see signature). The corresponding part of /var/log/messages shows the following:

Code:
Sep 19 21:46:54 nas3 isci: 1663616814:544693 ISCI isci: bus=0 target=3 lun=0 cdb[0]=359fc1f0 terminated
Sep 19 21:46:54 nas3 (da3:isci0:0:3:0): READ(16). CDB: 88 00 00 00 00 01 19 0d 8f 10 00 00 00 10 00 00
Sep 19 21:46:54 nas3 (da3:isci0:0:3:0): CAM status: CCB request terminated by the host
Sep 19 21:46:54 nas3 (da3:isci0:0:3:0): Retrying command, 3 more tries remain
Sep 19 21:46:54 nas3 (da3:isci0:0:3:0): Invalidating pack
Sep 19 21:46:54 nas3 GEOM_ELI: g_eli_read_done() failed (error=6) gptid/???.eli[READ(offset=2412079030272, length=8192)]
Sep 19 21:46:54 nas3 da3 at isci0 bus 0 scbus0 target 3 lun 0
Sep 19 21:46:54 nas3 da3: <ATA ST16000NM001G-2K SN03>  s/n ??? detached
Sep 19 21:46:54 nas3 GEOM_ELI: g_eli_read_done() failed (error=6) gptid/???.eli[READ(offset=270336, length=8192)]
Sep 19 21:46:54 nas3 GEOM_ELI: g_eli_read_done() failed (error=6) gptid/???.eli[READ(offset=15998752399360, length=8192)]
Sep 19 21:46:54 nas3 GEOM_ELI: g_eli_read_done() failed (error=6) gptid/0b195f5a-1ad0-11eb-b5e9-0cc47a052e3c.eli[READ(offset=15998752661504, length=8192)]
Sep 19 21:46:54 nas3 GEOM_MIRROR: Device swap0: provider da3p1 disconnected.
Sep 19 21:46:55 nas3 GEOM_ELI: Device gptid/???.eli destroyed.
Sep 19 21:46:55 nas3 GEOM_ELI: Detached gptid/???.eli on last close.
Sep 19 21:46:55 nas3 (da3:isci0:0:3:0): Periph destroyed
Sep 19 21:49:58 nas3 da3 at isci0 bus 0 scbus0 target 3 lun 0
Sep 19 21:49:58 nas3 da3: <ATA ST16000NM001G-2K SN03> Fixed Direct Access SPC-3 SCSI device
Sep 19 21:49:58 nas3 da3: Serial Number ????
Sep 19 21:49:58 nas3 da3: 300.000MB/s transfers
Sep 19 21:49:58 nas3 da3: Command Queueing enabled
Sep 19 21:49:58 nas3 da3: 15259648MB (31251759104 512 byte sectors)
Sep 19 21:49:58 nas3 GEOM_MIRROR: Device mirror/swap2 launched (3/3).
Sep 19 21:49:59 nas3 GEOM_ELI: Device mirror/swap2.eli created.
Sep 19 21:49:59 nas3 GEOM_ELI: Encryption: AES-XTS 128
Sep 19 21:49:59 nas3 GEOM_ELI:     Crypto: hardware

I interpreted this as a "glitch" in the communication between the motherboard (where the drive is connected) and the drive, and after a reboot this morning things appear(!) to be OK again.

SMART reports no errors. I have initiated an extra scrub and await the results. Will report back.
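
For anyone wanting to do the same from the shell, a minimal sketch (assuming the pool is named tank):

Code:
# Start a manual scrub (replace tank with the actual pool name)
zpool scrub tank

# Check progress and any read/write/checksum errors it finds
zpool status -v tank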
 