Fail event on /dev/md127

ezelbanaan

Dabbler
Joined
Feb 2, 2020
Messages
29
I got an email from my TrueNAS server containing the following:
A Fail event had been detected on md device /dev/md127.

It could be related to component device /dev/sdd1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid1 sde1[1] sdd1[0](F)
2097152 blocks super non-persistent [2/1] [_U]

unused devices: <none>
When I went to check, I saw that one of my drives was missing. What do I need to do? Is there any way to revive it?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703

ezelbanaan

Dabbler
Joined
Feb 2, 2020
Messages
29

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
I didn't use a RAID controller; I didn't even use an HBA. The drive is plugged directly into the motherboard SATA port.
Your system doesn't agree with you...
md127 : active raid1 sde1[1] sdd1[0](F)

Perhaps your system has some kind of onboard SATA RAID?

Since you gave no hardware details in your post, there's no further advice I can give at this point other than "do your research".
 

ezelbanaan

Dabbler
Joined
Feb 2, 2020
Messages
29
Your system doesn't agree with you...
md127 : active raid1 sde1[1] sdd1[0](F)

Perhaps your system has some kind of onboard SATA RAID?

Since you gave no hardware details in your post, there's no further advice I can give at this point other than "do your research".
I think the active raid1 it's referencing is another pool that's healthy. I have TrueNAS running in a VM inside Proxmox. I have a Chinese motherboard with dual Xeon E5-2660 v2 CPUs and 64GB of RAM, of which TrueNAS has access to 16GB. Could you maybe "translate" the mail for me? I don't really understand what any of it means, and it would help a lot with troubleshooting.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Please post your full spec; see my or @sretalla's post for examples. Be very specific about how the disks are configured and handled by Proxmox.
Also post the output (in code blocks) of:
glabel status
zpool status -v
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Could you maybe "translate" the mail for me? I don't really understand what any of it means, and it would help a lot with troubleshooting.
Still don't know much about your hardware and with Proxmox now in the way (and little idea about how you configured that), I'm not going to be much help, but I can try to break down what I think is the pertinent line:

md127 : active raid1 sde1[1] sdd1[0](F)

md127 : - Seems to be a RAID controller driver of some sort.

active raid1 - Seems to indicate you have a RAID 1 volume configured on that RAID controller

sde1[1] - Seems to be saying the first member disk/partition of the RAID1 is sde1 (assigned position [1])

sdd1[0](F) - Seems to be saying the second member disk/partition of the RAID1 is sdd1 (assigned position [0])

(F) - Seems to indicate the disk is Failed. Since it's butted up against sdd1, I assume that's the disk which failed.
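
The same breakdown can be sketched in code. /proc/mdstat is the status file exposed by the Linux md (software RAID) driver, so the two array lines from the mail can be picked apart mechanically. This is only a rough sketch: `parse_array` is a made-up helper name, it assumes the simple "active raid1" layout shown in the mail, and the reading of `[2/1]` and `[_U]` is my understanding of the format rather than anything confirmed for this particular system.

```python
import re

# The two array lines from the mail in the first post.
MDSTAT = """\
md127 : active raid1 sde1[1] sdd1[0](F)
2097152 blocks super non-persistent [2/1] [_U]
"""

# Member name, bracketed device number, then zero or more one-letter flags.
MEMBER_RE = re.compile(r"(\w+)\[(\d+)\]((?:\(\w\))*)")


def parse_array(text):
    """Parse one md array stanza into a plain dict (hypothetical helper)."""
    header, detail = [ln.strip() for ln in text.strip().splitlines()[:2]]
    name, rest = [part.strip() for part in header.split(":", 1)]
    # Assumes the state is a single word, as in "active raid1 ...".
    state, level, *tokens = rest.split()

    members = []
    for token in tokens:
        match = MEMBER_RE.fullmatch(token)
        if match is None:
            continue  # skip anything that is not a member entry
        device, number, flags = match.groups()
        members.append(
            {"device": device, "number": int(number), "failed": "(F)" in flags}
        )

    # "[2/1]": devices the array should have / devices currently working.
    expected, working = map(int, re.search(r"\[(\d+)/(\d+)\]", detail).groups())
    # "[_U]": one character per slot, U is up and _ is missing or failed.
    slots = re.search(r"\[([U_]+)\]", detail).group(1)

    return {
        "name": name,
        "state": state,
        "level": level,
        "members": members,
        "expected": expected,
        "working": working,
        "slots": slots,
        "degraded": working < expected,
    }


if __name__ == "__main__":
    info = parse_array(MDSTAT)
    failed = [m["device"] for m in info["members"] if m["failed"]]
    print(info["name"], info["level"], "degraded:", info["degraded"])
    print("failed members:", failed)
```

On the mail's text this flags sdd1 as the failed member of a two-device mirror with one device still working.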
 

ezelbanaan

Dabbler
Joined
Feb 2, 2020
Messages
29
Ok, so my specs are as follows:
CASE: Sharkoon NIGHT SHARK
MB: HUANANZHI x79 Dual-8D
CPU: 2xE5-2660
RAM: 8*8GB DDR3 Samsung ECC Reg memory
Pool 1: 4x1TB VelociRaptor in a RAID-Z1
Pool 2: 1x1TB Seasonic disk + 1x3TB Seasonic disk in RAID-0 (only games are stored on here so data loss does not matter)
OS: Latest TrueNAS SCALE
PSU: Gigabyte GP-P850GM
Boot: 16GB virtual disk from Proxmox

TrueNAS is running in a VM inside Proxmox; it has access to 16GB of memory and 12 threads. All disks are connected to an HBA which is passed through to TrueNAS. I think this is the HBA: https://www.amazon.com/Rivo-Controller-Expansion-Profile-Non-Raid/dp/B092TCNKDG/

The drive that failed is the 3TB one; it's also not showing up in the disks menu of TrueNAS. My question is not how I would recover the data, since I know I can't because of the RAID-0. My question is: is there anything I can try to revive the disk? Or should I just replace it with a 3x2TB RAID-Z1 to prevent data loss in the future?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
That's not an HBA...




MB: HUANANZHI x79 Dual-8D
That board has a Realtek NIC... not a great choice for good performance with TrueNAS.

It's also possible that the Intel X79 chipset behind the SATA ports might be providing RAID functionality.
 

ezelbanaan

Dabbler
Joined
Feb 2, 2020
Messages
29
That's not an HBA...





That board has a Realtek NIC... not a great choice for good performance with TrueNAS

It's also possible that the Intel x79 chipset which is behind the SATA might be providing RAID functionality.
I'm not sure if it is the exact same one. Are there any commands I can use to check whether I have an HBA or whether I need to replace it? ;)
I also just rebooted my system (for the third time) and now my drive is showing up again.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Are you running TrueNAS SCALE or CORE?

If you're running CORE, that message in your first post is most likely from your Proxmox server, not TrueNAS. Maybe that would help to untangle some of the mess that we have here.

are there any commands I can use to check if I have an HBA or if I need to replace it
There's nothing needed... it's not an HBA (I don't know of any with a PCIe x1 connector... and certainly wouldn't buy one if I found one... see the linked post).
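
That said, if you want to see how the card presents itself to the OS, a rough sketch like the one below could filter `lspci` output down to storage controllers. The class names are the ones I believe `lspci` prints, `storage_controllers` is a made-up helper name, and the sample text is invented for illustration (it is not from your machine). The class only tells you how the device identifies itself, not whether it is a good fit for ZFS.

```python
import re

# PCI class names as (I believe) lspci prints them for storage devices.
STORAGE_CLASSES = {
    "SATA controller",
    "RAID bus controller",
    "Serial Attached SCSI controller",
    "SCSI storage controller",
}

# "<address> <class>: <description>"; the address itself contains colons,
# so it is matched as one whitespace-free token first.
LINE_RE = re.compile(r"^(\S+)\s+([^:]+):\s+(.+)$")

# Invented sample output, only here so the sketch runs without lspci.
SAMPLE = """\
00:1f.2 SATA controller: Example Vendor AHCI Controller (rev 05)
02:00.0 Ethernet controller: Example Vendor Gigabit NIC (rev 06)
03:00.0 Serial Attached SCSI controller: Example Vendor SAS2008 (rev 03)
"""


def storage_controllers(lspci_text):
    """Return (address, class, description) for each storage controller line."""
    found = []
    for line in lspci_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(2) in STORAGE_CLASSES:
            found.append(match.groups())
    return found


if __name__ == "__main__":
    for address, pci_class, description in storage_controllers(SAMPLE):
        print(f"{address}  {pci_class}: {description}")
```

To try it on real data, pass in the text from running `lspci` inside the TrueNAS VM instead of `SAMPLE`.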
 

ezelbanaan

Dabbler
Joined
Feb 2, 2020
Messages
29
Are you running TrueNAS SCALE or CORE?

If you're running CORE, that message in your first post is most likely from your Proxmox server, not TrueNAS. Maybe that would help to untangle some of the mess that we have here.


There's nothing needed... it's not an HBA (don't know of any with a PCIe x 1 connector... and certainly wouldn't buy one if I found one... see the linked post).
I just upgraded to TrueNAS SCALE a few weeks ago, so I'm certain it's SCALE. Well, then at least I know I will have to replace my "HBA". I'll look into some LSI ones.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Get LSI cards from a system dismantler rather than from China - less likely to be fakes
 

ezelbanaan

Dabbler
Joined
Feb 2, 2020
Messages
29
Get LSI cards from a system dismantler rather than from China - less likely to be fakes
Yeah, in the EU they aren't as cheap as in the US. But I did find a Dell H200 SAS HBA | LSI 9211-8i for €55. Do you think that would be a good option?
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Tell me about it.
But yes - with IT mode firmware it should be good. It's good for HDDs, not so hot for SSDs.
 

ezelbanaan

Dabbler
Joined
Feb 2, 2020
Messages
29
Ok, thanks. So the drive seems to be working fine now. I ran a few SMART tests and they all passed with no errors. Should I replace the drive, or was a faulty cable the issue?
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Well, you are using a RAID controller - which is bad - and that needs replacing.
Other than that, we don't actually know, as you haven't posted anything that might tell us.
 

ezelbanaan

Dabbler
Joined
Feb 2, 2020
Messages
29
Well, you are using a RAID controller - which is bad - and that needs replacing.
Other than that, we don't actually know, as you haven't posted anything that might tell us.
Yeah, I'm definitely ordering that LSI card (as it comes with the right firmware for TrueNAS). Is there anything I can post to help you diagnose the drive?
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
In code blocks, please:
1. zpool status -v
2. glabel status
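
If it helps, here is a small sketch that runs those two commands and wraps each result in code tags ready to paste. The `[CODE]` tag syntax and the `run_for_forum` helper name are assumptions on my part. The sketch reports a missing tool instead of crashing, which matters because glabel is a FreeBSD utility and may well be absent on SCALE.

```python
import shutil
import subprocess

# The two commands requested above.
COMMANDS = [
    ["zpool", "status", "-v"],
    ["glabel", "status"],
]


def run_for_forum(argv):
    """Run one command and return its output wrapped in [CODE] tags."""
    if shutil.which(argv[0]) is None:
        # Tool is not installed (or not on PATH); say so rather than fail.
        body = f"{argv[0]}: command not found on this system"
    else:
        proc = subprocess.run(argv, capture_output=True, text=True)
        body = (proc.stdout + proc.stderr).strip()
    return f"[CODE]\n$ {' '.join(argv)}\n{body}\n[/CODE]"


if __name__ == "__main__":
    for argv in COMMANDS:
        print(run_for_forum(argv))
        print()
```

Run it from a shell on the TrueNAS system and paste the printed text into your reply.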
 