Fail event had been detected on md device /dev/md127

stephan44

Cadet
Joined
Dec 29, 2021
Messages
5
Bug in SCALE 22.12.0?
My TrueNAS SCALE is running under Proxmox VE (7.3-3) with VirtIO SCSI drives. I upgraded from CORE to SCALE 22.02.4 some months ago without any problems. Now I have moved to 22.12.0 and get the following message on every reboot:

A Fail event had been detected on md device /dev/md127.

It could be related to component device /dev/sdc1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid1 sda1[1] sdc1[0](F)
2095040 blocks super 1.2 [2/1] [_U]
[=>...................] resync = 9.6% (201600/2095040) finish=0.1min speed=201600K/sec

Every reboot gives me this message again.
Running cat /proc/mdstat after startup does not show the problem:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>

zpool reports no errors and all pools are up to date.
Switching back to 22.02.4 does not produce this message during startup.
Any hints on what to do (other than running 22.02.4 again)? Anyone with similar problems?
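
A quick way to confirm what md127 actually is (a sketch only; the device names are taken from the mdstat output above and may differ on your system):

Code:
# Show the member devices and state of the array. On SCALE, small md mirrors
# like this are typically the mirrored swap, not a pool member.
mdadm --detail /dev/md127

# Map partitions to disks and mountpoints to double-check that no ZFS pool
# member is involved.
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT

# Overall software-RAID status.
cat /proc/mdstat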



Update, December 26:
The problem seems to be solved; thanks a lot to the community.
In summary, this never showed up in Angelfish but became visible in Bluefin due to kernel updates.
In a future version this message will no longer be shown:
# mdmonitor.service in particular causes mdadm to send emails to the MAILADDR line
# in /etc/mdadm/mdadm.conf. By default, that's the root account, so end-users are
# getting unnecessary emails about these devices. Since middlewared service is
# solely responsible for managing md devices there is no reason to run this monitor
# service. This prevents unnecessary emails from being sent out.
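
For reference, here is a rough sketch of what that change amounts to on the command line; it is not an official TrueNAS procedure, and manual changes may be reverted by updates. Masking the monitor service only silences the emails; it does not change the state of the md swap mirror itself.

Code:
# Check whether the mdadm monitor service is active, then mask it so it is
# not started again on the next boot.
systemctl status mdmonitor.service
systemctl mask mdmonitor.service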
 
Last edited:

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Bug or not, you're set up for failure with those virtual disks.

Please see here:

You're also using the lesser of the 2 options for virtualizing, so be aware that your experience may not be great even with an HBA in passthrough and direct disk access.
 

Paddy0293

Dabbler
Joined
Sep 28, 2022
Messages
35
Hello, I hope someone can help me. I tried to upgrade to Bluefin and got this error; when I switched back to Angelfish there were no errors. Does anyone know about this bug? Is it critical if I stay on Bluefin?

I don't understand this error with Bluefin ;(
Code:
This is an automatically generated mail message from mdadm
running on truenas

A Fail event had been detected on md device /dev/md127.

It could be related to component device /dev/sdc1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid1 sdc1[1](F) sda1[0]
2095040 blocks super 1.2 [2/1] [U_]
[>....................] resync = 2.8% (59136/2095040) finish=0.5min speed=59136K/sec

unused devices: <none>
 

snave

Dabbler
Joined
Dec 31, 2020
Messages
13
I'm running TrueNAS-SCALE-22.12.0 directly on AMD B450 hardware with an LSI HBA in IT mode. I get very similar messages at startup, but only since the Bluefin upgrade. Interestingly, I have another similar server (different AMD chipset/CPU, also with a mirrored ZFS boot volume) where I don't get these messages.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
You didn't follow the forum rules; please read those first.

But I can't help wondering why it looks like you are using Linux software RAID (mdadm).
TrueNAS uses ZFS; this error is not from ZFS, which is very curious.
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
A newer kernel version is used in Bluefin, which produces useful system-related warnings. See this thread, for example, for warnings related to badly formatted disks that surfaced in Bluefin.
Appears to be a known bug
That's not a bug. What iX Systems is doing is masking the warning message, but the underlying problem remains. It is your job to make sure you use proper hardware, as @sretalla mentioned.
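
A quick way to check whether your disks are affected (a sketch using lsblk from util-linux; 520-byte logical sectors are the case discussed in the linked thread):

Code:
# Print logical and physical sector sizes for whole disks only (-d skips
# partitions); 512 or 4096 is normal, 520 needs reformatting.
lsblk -d -o NAME,LOG-SEC,PHY-SEC,MODEL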
 
Last edited:

Paddy0293

Dabbler
Joined
Sep 28, 2022
Messages
35
@Daisuke, what do you mean by "badly formatted disk"?

I installed TrueNAS SCALE directly on my Supermicro hardware (let me know if you need the specs) with 2x WD Red Pro drives in a mirror.
I don't understand what is wrong or how to fix my problem.

Merry Christmas :)
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
Huh. I managed to get this alert on my test SCALE system. Very interesting. (One of the s&^t drives went AWOL.)
 

Paddy0293

Dabbler
Joined
Sep 28, 2022
Messages
35
Okay, I understand the problem now.
What happens if I don't fix it?
Is it possible that Linux will support 520-byte sectors at some point?

Thanks for the procedure, @Daisuke, I will try it after Xmas :)
 

Paddy0293

Dabbler
Joined
Sep 28, 2022
Messages
35
@Daisuke, I hope you can help me again; I forgot about my issue :(
I found this in the workarounds:
This is causing confusing (and unnecessarily alarming) email messages to users. Middlewared process is solely responsible for managing md devices, so disable these services so unnecessary emails aren't sent.
Link

Can you explain to me why it is unnecessarily alarming?
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
Can you explain to me why it is unnecessarily alarming?
Because it's just swap space being removed. The mdadm alerts on their own indicate nothing important; if there were a disk problem, you would also be getting ZFS/SMART alerts about it.
The mdadm alerts appear to be generated by events like startup and shutdown, when the alert subsystem notices that the swap partitions are not online yet or have been torn down just before the system shuts down, which suggests the detection is simply too sensitive.
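
If you want to double-check this yourself after a boot, here is a small sketch (the smartctl device name is only an example; repeat it for each disk):

Code:
# Is the swap mirror assembled and in use?
cat /proc/mdstat
swapon --show

# Any real disk problem would show up here rather than in the mdadm mails.
zpool status -x
smartctl -H /dev/sda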
 
Last edited: