SOLVED Degraded disk help

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Just wanted to make sure. You shouldn't experience any issues from now on, hardware-wise; just take note of which drive you connected to the port multiplier, as it's the most likely to cause issues.

If you haven't already scheduled frequent SMART tests and scrubs, do so; scripts such as multi_report can help you monitor your drives and your pools.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Wasn't a cheap setup :(
With brand new DDR5 RAM and Zen4 CPU I can believe it. Yet the build could have been both cheaper and more reliable with an older CPU, an older motherboard with at least 6 SATA ports to match your SATA bays, and second-hand DDR4 ECC RAM.

and I bottlenecked it like a moron with that garbage controller
And we would have found the solution earlier if you had just listed your hardware to begin with.
"Gigabyte B650I with Ryzen 7600X in Jonsbo N1 case" (or N2?)
Even without mention of the crappy controller, 5 drives on a 4-port motherboard was a giveaway that there was something fishy…

By the way, my nod to LTT was not meant as endorsement or praise—far, very far, from it.

Hopefully your issue may be closed with no data loss and no permanent damage. But that could have been a close call.
Losing even a single drive from a raidz1 is a dangerous situation, and large drives only mean that there's potentially a lot of data at risk. Raidz2 is highly recommended.
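To put rough numbers on the raidz1 vs raidz2 point, here is a minimal Python sketch. The 5-drive width matches the drive count mentioned above; the 10 TB drive size and the helper name raidz_summary are made up for illustration, and the capacity figure ignores ZFS padding and metadata overhead.

```python
def raidz_summary(drives: int, parity: int, drive_tb: float) -> dict:
    """Rough figures for a single raidz vdev (hypothetical helper)."""
    if parity not in (1, 2, 3):
        raise ValueError("raidz parity must be 1, 2 or 3")
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    return {
        # Approximate: ignores padding, metadata and reserved space.
        "usable_tb": (drives - parity) * drive_tb,
        # Drives that can fail before data is lost.
        "failures_tolerated": parity,
        # Redundancy left once one drive is already gone.
        "failures_tolerated_after_one_loss": parity - 1,
    }


# Assumed example: 5 drives of 10 TB each (drive size is not from the thread).
for name, parity in (("raidz1", 1), ("raidz2", 2)):
    print(name, raidz_summary(drives=5, parity=parity, drive_tb=10.0))
```

With one drive already missing, the raidz1 layout has no redundancy left, while raidz2 can still lose one more drive at the cost of one drive's worth of capacity.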
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
You should either use the ports on your motherboard or an HBA, but you could try using a SATA controller (not a port multiplier).

Good news is your drives are healthy.


I believe it would solve the issue as long as you use only a single port on that port multiplier.

Do the port switching, then zpool clear, replace the external USB drive with the old (internal) drive and run a scrub. You should be fine.


1) Physically moved all the drives to the mobo as instructed:

1690050246521.png


2) Moved the former external-enclosure drive (the one I mistakenly used to replace the degraded drive marked #03 HDD in the earlier pictures) into the case, and placed the degraded #03 drive into the enclosure

1690050458833.png
1690050472644.png


3) Booted up to 2 "Unassigned Disks" in the dashboard (check the update below; there is only one now)

1690049665166.png


4) Ran sudo zpool clear CHUNK

1690050172391.png


As you can see, it does not appear to like it when you use a USB external enclosure to zpool replace an HDD, as it marks it UNAVAIL.
Having said this, 4 drives are online and CHUNK is accessible. Shouldn't I be able to get rid of this UNAVAIL, add a disk to the pool and call it a day?

*UPDATE* this UNAVAIL is the external (the former replacement drive)
1690051939981.png


I've since turned the external off and it's still showing up

1690052006759.png


Can I just replace it with the one unassigned disk I have?

1690052058990.png

(No, I have not hit Replace Disk; just showing it for clarity. I won't do anything without advice.)
 

nothing

Yeah, I'm sorry about that. I messed up and delayed my own troubleshooting help by not including the info.

I figure you lot are annoyed by LTT folk. No problem, I would be too. While it brings people like me in, at least more people are using TrueNAS as a result, and ultimately it can get better support and features.

In the future I'll give all the background. I have read the forum rules Davvo posted.
 

Davvo

I might be missing something since I am on mobile, but why did you physically swap the external drive (former sdf) with internal #3?

You want to put your original drives back in the vdev, since they are healthy, by backtracking the steps you took to replace them.

My suggestion now is to grab the serials and check which is which (sdX can change every reboot), because I have no idea which disk the one you see is (it should be one of the two you swapped).
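Since sdX names can move around between reboots, matching on serial numbers is the safer way to tell the drives apart. Below is a minimal Python sketch that parses the output of lsblk -d -n -o NAME,SERIAL (run on the NAS and pasted in as text). The helper names and the serials in SAMPLE are invented for illustration; they are not the drives from this thread.

```python
from typing import Dict, Optional


def serials_by_device(lsblk_output: str) -> Dict[str, str]:
    """Map device name -> serial from `lsblk -d -n -o NAME,SERIAL` text."""
    mapping = {}
    for line in lsblk_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # blank line, or a device that reports no serial
        name, serial = parts[0], parts[1].strip()
        mapping[name] = serial
    return mapping


def device_for_serial(lsblk_output: str, serial: str) -> Optional[str]:
    """Return the current sdX name for a serial, or None if not present."""
    for name, found in serials_by_device(lsblk_output).items():
        if found == serial:
            return name
    return None


# Made-up sample output; real serials come from the drive labels.
SAMPLE = """\
sda  EXAMPLE-DRIVE-03
sdb  EXAMPLE-DRIVE-01
sdf  EXAMPLE-USB-DRIVE
"""

print(device_for_serial(SAMPLE, "EXAMPLE-USB-DRIVE"))  # sdf
```

Comparing the printed name against the serial on the drive's sticker tells you which physical disk a given sdX entry is on this particular boot.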
 

nothing

But it was replaced. I thought you wanted the former sdf (the new drive) there. I can put it all back, but doesn't ZFS mark it in some way once you've replaced it?

It's a simple swap back then, if you think it's better to put the replaced drive back in.
 

Davvo

Assuming what you did was:
WebUI replace of one internal drive with an external one.

I wanted you to use the WebUI replace action to bring the perfectly working drive (that was marked as degraded) back into the pool so that we can detach the USB drive.

Assuming the drives numbered #1 to #5 are the original ones and the unlabelled one is the external one (the one used for the replacement), I want the numbered drives in the case and in the vdev.
Once they are in the case, assuming the USB drive is in the vdev and is acting as the replacement for #3, replace it via the WebUI with drive #3.
Then, we can offline and detach the USB drive, clear any errors in the pool and run a scrub.

Please tell me if I wasn't clear.
 

nothing

same situation:
1690054626900.png

1690054597977.png


Everything is 100% back to the way it was hardware-wise, apart from the SATA connections.

The external drive is no longer connected; I didn't see your post above. I will put the new drive back into the enclosure now.
 

nothing

HOLY MOLY - all fixed :) Only I can't run an external disk. I think we're close :)

1690055065588.png


1690055120509.png


So how do I get back to bringing in sda to replace sdf?

Another GUI zpool replace? sdf -> sda? (Again, just for visuals below; I've not done it.)

1690055204888.png
 

Davvo

Check by the serials that you actually have the right drives in. If so, you can detach the USB drive (not sure this is possible from the WebUI; maybe you can just disconnect the USB drive) and run a scrub on the CHUNK pool.

Once this ends, I am interested in your temperatures.
 

nothing

OK, I'll check the serials, but I'm fairly sure sdf IS the external... let's see

1690055616173.png

Yes, as I suspected, sdf is the new "replacer" drive that replaced what is now marked as sda in the Unassigned Disks section.
 

Davvo


nothing

You guys are the salt of the earth, having to deal with the likes of me. :)
1690055815816.png

Cheers guys! I think you've solved it!
 

nothing

Is it not recommended to use the pool while the resilver is running? Or is it OK to run Plex? (Am I pushing my luck? :D)
 

Davvo

Given your RAIDZ layout, your pool performance won't be great (although your Plex app should be on the SSD, I assume your media is on the spinning rust).

Remember to run a scrub after the resilver is completed. If no errors pop up, remember to mark this thread as solved (you should get the option by editing the first message I believe).

You did good by coming here to ask, and your replacement of a faulted disk with a USB one was a good call (though not the right one).
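For the scrub-after-resilver reminder, one way to tell when the resilver is done is to look at the scan: line of zpool status CHUNK. The Python sketch below only parses status text that you paste or pipe in; the helper name scan_state and the dates in the samples are invented, and it covers only the common scan: wordings, so treat it as a rough aid rather than a complete parser.

```python
def scan_state(zpool_status_output: str) -> str:
    """Classify the `scan:` line of `zpool status` text (hypothetical helper)."""
    for line in zpool_status_output.splitlines():
        stripped = line.strip()
        if not stripped.startswith("scan:"):
            continue
        scan = stripped[len("scan:"):].strip()
        if scan.startswith("resilver in progress"):
            return "resilvering"
        if scan.startswith("resilvered"):
            return "resilver finished"
        if scan.startswith("scrub in progress"):
            return "scrubbing"
        if scan.startswith("scrub repaired"):
            return "scrub finished"
        return "other"
    return "no scan line"


# Made-up sample text; only the pool name CHUNK comes from the thread.
IN_PROGRESS = """\
  pool: CHUNK
 state: DEGRADED
  scan: resilver in progress since Sat Jul 22 20:00:00 2023
"""
FINISHED = """\
  pool: CHUNK
 state: ONLINE
  scan: resilvered 1.00T in 05:00:00 with 0 errors on Sun Jul 23 01:00:00 2023
"""

print(scan_state(IN_PROGRESS))  # resilvering
print(scan_state(FINISHED))     # resilver finished
```

Once it reports "resilver finished", the scrub can be started from the WebUI or with sudo zpool scrub CHUNK.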
 