RAIDZ2 after disk replacement: all disks degraded (need help!)

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
Hi all

I seldom write or ask for help, but this time it may be a big issue for me. I had ONE faulty disk and, for reasons I don't know, I shut down the system, pulled it out and replaced it with a new one. After I put in the new one, I started the NAS and replaced the faulty disk. After 8 hours I checked and my NAS was no longer accessible over the web. So I checked with my I/O port and saw that it had crashed. I can still access the command line, but zpool status shows ALL disks are degraded now. So I restarted the NAS, and after about 15 minutes it was in the same state, where the web interface isn't accessible anymore. So I thought: put the disk you pulled out back where it was and plug in the extra one on a different port. But I got the same problem after about 10 minutes.

So I have no clue what to do now and just hope that with help I can save my data, as I have no backup. I use the onboard RAID controller and an extra SATA controller, which might also have caused the problem, but that's not what I want to discuss. I would really like it to run again and to have my data back. I have to say that this setup has been running for about 5 years with no problems at all.

My setup is:
Supermicro X10SLM-F (LGA 1150, Intel C224, mATX)
Kingston ValueRAM ECC-Reg (1 x 16GB, DDR3L-1600 (PC3-12800), DIMM 240 pin)
Actual HD setup: 8x3TB; right now 8x3TB plus one 4TB, which was the replacement
8 new 4TB HDs are on the way, as I wanted to enlarge my RAIDZ2
FreeBSD 12.2 Release P9 amd64
 

Attachments

  • NAS_error.png (124.8 KB)
  • NAS_Zpool_Status.png (107.6 KB)

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,112
Maybe you should have asked earlier and more often because your unbacked-up NAS looks worrisome. "Onboard RAID controller" (in RAID mode?), "extra SATA controller" (what?), checksum errors on the remaining online drives…
Which drive is what? What's the output of smartctl -a /dev/[a]da#?
 

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
The RAID controller is in SATA mode. Why should I have asked earlier and more often? Not everyone has the money for two NAS boxes to back up one to the other. It's not the first time I have replaced a failed disk; it's just that this time it seems my mind was somewhere else and I unplugged the failing one before resilvering onto the new one. So maybe you could back off with your lecturing, as I already said I made a mistake.

The output of your command is "no matches found". And yes, I have a backup of my configuration but not of my data, as I use RAIDZ2 precisely because I can't afford another NAS.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,112
Sorry, it's meant to be smartctl -a /dev/ada0, smartctl -a /dev/ada1… or smartctl -a /dev/da0, smartctl -a /dev/da1… for each of the drives, with the appropriate da or ada names and numbers. (Please post the output in CODE tags for readability.)
I thought you had an onboard SAS controller, which could have been an issue if in RAID mode. But the additional SATA card (which model?) is possibly an issue, especially if it uses port multipliers.
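
To save some typing, the per-drive smartctl calls can be wrapped in a small loop. This is only a sketch under assumptions: the function name smart_all and the DEV_DIR and SMARTCTL variables are made up for illustration (they are there so the loop can be dry-run), and it assumes a plain sh with the disks showing up as adaN or daN. For what it's worth, "no matches found" is what a shell such as zsh typically prints when a pattern like /dev/[a]da# matches no file, so the pattern was never meant to be typed literally.

```sh
# Sketch: print SMART data for every whole-disk node found.
# DEV_DIR and SMARTCTL are illustrative knobs, not TrueNAS settings.
DEV_DIR="${DEV_DIR:-/dev}"
SMARTCTL="${SMARTCTL:-smartctl}"

smart_all() {
    for dev in "$DEV_DIR"/ada[0-9]* "$DEV_DIR"/da[0-9]*; do
        [ -e "$dev" ] || continue      # an unmatched glob stays literal in sh; skip it
        case "${dev##*/}" in
            *p[0-9]*) continue ;;      # skip partitions such as ada0p2
        esac
        echo "=== $dev ==="
        "$SMARTCTL" -a "$dev"
    done
}
```

Calling smart_all > smart.txt would collect everything in one file that can then be pasted in CODE tags.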
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702

glabel status should point you to the list of disk names you need to use

It seems you have at least ada8 as one disk in the system from that last screenshot.
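
As a rough sketch of how that mapping can be read off: glabel status prints one line per label, with the label name in the first column and the underlying partition in the third. The helper below (gptid_to_disk is a name made up here) assumes that three-column layout and turns each gptid label into the whole-disk name that smartctl expects, by dropping the pN partition suffix.

```sh
# Sketch: map gptid labels to whole-disk names.
# Assumes 'glabel status' columns are: Name, Status, Components.
gptid_to_disk() {
    awk '$1 ~ /^gptid\// {
        disk = $3
        sub(/p[0-9]+$/, "", disk)   # e.g. ada0p2 -> ada0
        print $1, disk
    }'
}

# Usage on the NAS: glabel status | gptid_to_disk
```

The labels printed this way are the same gptid/... names that zpool status shows, so each pool member can be matched to a device node.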
 

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
Aside from the controller issues... you really need to work that out if you want reliability into the future... you need to look at your disks.

Are they (or some of them) SMR? https://www.truenas.com/community/resources/list-of-known-smr-drives.141/
Well, I can't tell you if they are, but the new ones are WD Red Plus. The old ones were also labeled for NAS systems. As mentioned, the NAS has been running fine for 5 years now and I have replaced faulty disks before. If all of them are faulty now, that would be a huge coincidence.
 

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
glabel status should point you to the list of disk names you need to use

It seems you have at least ada8 as one disk in the system from that last screenshot.
As far as I can tell, that does not look good, right?
 

Attachments

  • glabel_status.png (141 KB)

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,112
Signature: "FreeNAS 10" (of all things…)
First post: "FreeBSD 12.2 Release P9 amd64"
Is this really FreeNAS/TrueNAS or a custom installation of FreeBSD with ZFS? I'm afraid it may not matter anyway because this system is completely falling apart.
 

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
Signature: "FreeNAS 10" (of all things…)
First post: "FreeBSD 12.2 Release P9 amd64"
Is this really FreeNAS/TrueNAS or a custom installation of FreeBSD with ZFS? I'm afraid it may not matter anyway because this system is completely falling apart.
Really, you challenge me because my signature still says FreeNAS 10? I have not updated it in YEARS. I posted pictures where you can clearly see that it is 12.2. And NO, it's not a custom build of FreeBSD and ZFS. I don't have the skill to do that.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
tried smartctl again and now it tells me /dev/ada0p2
You don't specify a partition for smartctl... so smartctl -a /dev/ada0 should be enough.

The fact that you're on a RAID controller is probably the blocker there and you may need to specify switches to indicate which kind of RAID controller to work with for smartctl...

You may need to figure it out for your specific controller, but here's an example of how it might look if your adapter is a megaraid:
 

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
You don't specify a partition for smartctl... so smartctl -a /dev/ada0 should be enough.

The fact that you're on a RAID controller is probably the blocker there and you may need to specify switches to indicate which kind of RAID controller to work with for smartctl...

You may need to figure it out for your specific controller, but here's an example of how it might look if your adapter is a megaraid:
Thanks. I tried plain smartctl with ada0 and it says "No such file or directory". I don't understand why all of them are suddenly degraded. I was about to replace the faulty one and then do a fresh install of TrueNAS on an SSD. I will see what I can find out about the RAID adapter later on.

I am even willing to pay someone to take a look or even try to fix it.
 

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
Signature: "FreeNAS 10" (of all things…)
First post: "FreeBSD 12.2 Release P9 amd64"
Is this really FreeNAS/TrueNAS or a custom installation of FreeBSD with ZFS? I'm afraid it may not matter anyway because this system is completely falling apart.
@Etorix, you are the reason why I stopped using community boards. Instead of helping you just criticize, judge, lecture and challenge things.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
I don't understand why all of them are suddenly degraded
In order to understand that, we need to see the smartctl output to know if the disks are OK or not.

To do that, you'll need to figure out the smartctl syntax, as I mentioned, to work through your RAID card.

You can possibly already get the output from the drives attached to the additional SATA controller without needing the extra switches, so we can start there (we just need to know which ones they are, or you can just do smartctl -a with all drives from ada0 to 8 and see which ones work).
 

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
In order to understand that, we need to see the smartctl output to know if the disks are OK or not.

To do that, you'll need to figure out the smartctl syntax, as I mentioned, to work through your RAID card.

You can possibly already get the output from the drives attached to the additional SATA controller without needing the extra switches, so we can start there (we just need to know which ones they are, or you can just do smartctl -a with all drives from ada0 to 8 and see which ones work).
As written, I will. At the moment I actually have to work and am busy. In about 50 minutes I have lunch and will check it.
Many thanks for going through this with me to maybe solve it. I know it's possible I have lost everything, but at least I want to try to save it.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,112
@Etorix, you are the reason why I stopped using community boards. Instead of helping you just criticize, judge, lecture and challenge things.
Sorry if it feels that way, but I'm not judging or challenging anything, just trying to understand your setup from incomplete, and sometimes contradictory or confusing information. It's only human to forget to update one's signature, but any and all information is useful. Thanks for updating.

It would help to clarify what the "Raid controller" is. I now understand it's simply the C224 chipset and its software (BIOS) RAID options, but @sretalla appears to think there's a RAID AIC in there. I'm probably pedantic, possibly a bit dumb, but may I respectfully suggest you have not always used the best terminology and have not been as clear as could have been?
We still do not know what the additional SATA controller is—and anything based on port multipliers would be best removed immediately. We still do not know which drives are attached to what, in case that could help retracing how things went.

you can just do smartctl -a with all drives from ada0 to 8 and see which ones work.
Please do as suggested, copying and pasting from your terminal. And, PLEASE, format this text within CODE tags (</>), as available from the "…" menu in the forum editor. (I take the blame for not having been clear and explicit enough, in case you're not familiar with the features of the forum.)
 

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
OK, then you need to rephrase your questions. So far I have not provided incomplete, contradictory or confusing information. What I wrote was what I knew at that point. You started to make it look like that when you challenged my signature. But let's leave that aside.
I will try to find out more about the SATA controller, as at the beginning I thought that this would not be a big issue. If we can manage to bring back my data, I will buy this item: https://adaptec.com/de-de/products/hba1100/
That should be a much better solution than what I am using now. If we don't bring it back, I will buy it anyway, as I will still need the NAS in the future.

I don't mind if you are pedantic, but I try to be as clear as possible. If I left something out, it's not on purpose.

So give me time to find out about the SATA controllers and we can go on from there.
 

MutoSan

Contributor
Joined
May 3, 2014
Messages
103
So I restarted my NAS and activated SSH. The reason: with the other connection I can't copy things easily; with PuTTY I can.
You don't specify a partition for smartctl... so smartctl -a /dev/ada0 should be enough.

The fact that you're on a RAID controller is probably the blocker there and you may need to specify switches to indicate which kind of RAID controller to work with for smartctl...

You may need to figure it out for your specific controller, but here's an example of how it might look if your adapter is a megaraid:
Not to put you down, but the site for figuring out my controller does not help. It is for a Linux system, not FreeBSD. At least I could not even find the /sys/ folder. I will see if I can find the commands to identify it, or the receipt from when I ordered it.

With SSH over PuTTY I can also scroll up and down and see much more info. The resilvering is still going on.
<
  pool: freenas-boot
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 00:13:09 with 0 errors on Tue Oct 19 03:58:09 2021
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2       ONLINE       0     0     0

errors: No known data errors

  pool: mediaserver
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Oct 25 12:25:30 2021
        1.81T scanned at 555M/s, 697G issued at 209M/s, 11.2T total
        79.4G resilvered, 6.07% done, 14:41:40 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        mediaserver                                     DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     1    10     0
            gptid/37abec5e-8e0a-11ea-b769-0cc47a4cd4bd  ONLINE       3    60     0  (resilvering)
            gptid/38a5e733-8e0a-11ea-b769-0cc47a4cd4bd  DEGRADED     0     0     0  too many errors
            gptid/38cc47be-8e0a-11ea-b769-0cc47a4cd4bd  DEGRADED     0     0     0  too many errors
            gptid/38759c18-8e0a-11ea-b769-0cc47a4cd4bd  DEGRADED     0     0     0  too many errors
            gptid/de252c21-357d-11ec-93c0-0cc47a4cd4bd  DEGRADED     3   350     0  too many errors  (resilvering)
            gptid/38e364e0-8e0a-11ea-b769-0cc47a4cd4bd  ONLINE       2    60     0  (resilvering)
            gptid/38be6b25-8e0a-11ea-b769-0cc47a4cd4bd  DEGRADED     0     0     0  too many errors
            gptid/390dfe83-8e0a-11ea-b769-0cc47a4cd4bd  DEGRADED     0     0     0  too many errors

errors: 8842561 data errors, use '-v' for a list
/>
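
To pull just the members with non-zero error counters out of a listing like the one above, something along these lines might help. It is only a sketch: vdevs_with_errors is a made-up name, and it assumes the zpool status column order shown above (NAME STATE READ WRITE CKSUM) with gptid labels as member names.

```sh
# Sketch: list pool members whose READ/WRITE/CKSUM counters are not all zero.
# Assumes columns NAME STATE READ WRITE CKSUM, as in the output above.
vdevs_with_errors() {
    awk '$1 ~ /^gptid\// && ($3 + $4 + $5) > 0 {
        printf "%s %s read=%s write=%s cksum=%s\n", $1, $2, $3, $4, $5
    }'
}

# Usage on the NAS: zpool status mediaserver | vdevs_with_errors
```

Fed the listing above, this would print the three members that currently show read and write errors, which are also the three marked as resilvering.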
 