Pool degraded help

djdwosk97

Patron
Joined
Jun 12, 2015
Messages
382
I received an error that my pool had degraded.
freenas:~ # zpool status
pool: VolZ2
state: DEGRADED
status: One or more devices has been removed by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: scrub repaired 0 in 0 days 12:21:40 with 0 errors on Sun Apr 30 12:21:42 2023

In the FreeNAS dashboard under Storage > Pools, I see a drive labeled "REMOVED", and this drive does not appear under Storage > Disks.

I checked the SMART data for the remaining drives in the pool and they are all fine. I'm not sure whether I can check the SMART status of the removed drive without re-adding it first, so I wanted to confirm how I should go about re-adding it.

I'm running FreeNAS-11.3-U4.1




Secondly, I see that my boot pool is also in a DEGRADED state, which I apparently never received a notification for:
pool: freenas-boot
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://illumos.org/msg/ZFS-8000-2Q
scan: scrub repaired 0 in 0 days 00:04:00 with 0 errors on Sat May 27 03:49:02 2023
config:

NAME                                            STATE     READ WRITE CKSUM
freenas-boot                                    DEGRADED     0     0     0
  mirror-0                                      DEGRADED     0     0     0
    808497652944125481                          UNAVAIL    164 1.68K   984  was /dev/gptid/930a3aa0-dc0e-11e5-af2a-002590d85eb3
    gptid/93305624-dc0e-11e5-af2a-002590d85eb3  ONLINE       0     0     0

errors: No known data errors
The boot pool is a pair of flash drives, so I might just need to swap out one of them? What would be the best way to deal with this as well?
 
Joined
Jan 7, 2015
Messages
1,155
Yes. Make sure you have good config backups and such, add a new USB drive, and use the GUI to trigger the replacement.
 

djdwosk97

Patron
Joined
Jun 12, 2015
Messages
382
How do I trigger the replacement for the USB-boot drive?

And what is the best way to run a SMART test on the "REMOVED" drive? I didn't get any SMART errors before it was dropped, so I'm wondering if maybe it's a SATA-port issue.
 

djdwosk97

Patron
Joined
Jun 12, 2015
Messages
382
I just wanted to confirm the best way to handle this before I accidentally make a mistake and cause myself more trouble.
 
Joined
Oct 22, 2019
Messages
3,641
How do I trigger the replacement for the USB-boot drive?
System → Boot → Actions → Boot Pool Status → then select "Replace" on the missing drive in question


And what is the best way to run a SMART test on the "REMOVED" drive? I didn't get any SMART errors before it was dropped, so I'm wondering if maybe it's a SATA-port issue.
If the system does not detect the drive, then you cannot run any type of test on it, SMART, ZFS or otherwise. It could be a failed drive or a cable/port issue.

To replace the missing drive in your data pool, you'd go to Storage → Pools → cogwheel for your pool → Status → then "Replace" the drive in question


The above assumes you've already offlined the drive(s) from your vdevs, physically removed them from the server, and installed new drives in their place.
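To check whether the system currently detects a disk, before deciding between a failed drive and a cable/port issue, one option on FreeNAS (FreeBSD) is to look up the pool member's gptid label in `glabel status`. The helper below is only a sketch: the function name `gptid_to_dev` is made up for illustration, and it assumes the usual three-column `glabel status` layout (Name, Status, Components).

```shell
# Sketch only: map a gptid label from `zpool status` to its device node.
# Reads `glabel status` output on stdin, so it can also be tried on saved output.
# gptid_to_dev is a hypothetical helper name, not part of FreeNAS.
gptid_to_dev() {
    # $1 = label exactly as shown by zpool status, e.g. gptid/xxxxxxxx-...
    awk -v id="$1" '$1 == id { print $3 }'
}

# Typical use on the live system (assumes glabel is available, as on FreeBSD):
#   glabel status | gptid_to_dev gptid/b3cf8e88-0042-11eb-b25c-002590d85eb3
# The result is a partition name; the disk is that name without the pN suffix.
```

If the lookup prints nothing, the label is not currently attached, which is consistent with a drive the system does not detect. `camcontrol devlist` gives a similar view of the raw disks.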
 
Joined
Jan 7, 2015
Messages
1,155
This is all well defined in the manual and covered extensively in this forum. If it were me, I'd just do a clean install of the same TN version on two new USB disks (if for whatever reason you can't use small SSDs) and restore your config. The reason is that if one USB drive is dead, the other is likely not far from failing either. Plus you can get a satchel of them for a few dollars.

It's no longer recommended to use USB for booting, mainly because one day they just die, whereas an SSD will likely last many years and also likely give warnings when it's dying. In all cases, a config backup is a must.
 

djdwosk97

Patron
Joined
Jun 12, 2015
Messages
382
Would there be any harm with simply rebooting the system and seeing if the drive comes back online and then running a SMART test on the drive? (And then switching SATA ports assuming the SMART tests come back clean?)

I have not yet offlined the drives. I've thus far done nothing other than look at the status of pools/drives.
I'm not overly worried about the USB boot drives failing since there is the redundant backup and I do have a config backup, and I am out of SATA ports anyway, so USB flash drives seemed easier than USB SSDs or a SATA expansion card.
 
Joined
Oct 22, 2019
Messages
3,641
Would there be any harm with simply rebooting the system and seeing if the drive comes back online and then running a SMART test on the drive? (And then switching SATA ports assuming the SMART tests come back clean?)
You can do that too. But you can still "offline" the drive from the pool while you do the SMART tests and check to see if the physical drive is being detected by the server.
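For the SMART side of this, a minimal sketch using `smartctl` from smartmontools might look like the following. The function names are made up for illustration, and `ada4` in the usage comments is a placeholder device name, not one taken from this system.

```shell
# Sketch only: thin wrappers around smartctl (smartmontools).
# Function names are hypothetical; pass the disk name without /dev/.

# Start a self-test. $2 is "short" (default) or "long".
# The drive runs the test in the background; this command returns right away.
smart_start() {
    smartctl -t "${2:-short}" "/dev/$1"
}

# Show the overall health verdict plus the self-test log once the test is done.
smart_report() {
    smartctl -H -l selftest "/dev/$1"
}

# Print only the CRC-related attribute line(s). A rising UDMA_CRC_Error_Count
# tends to point at the cable or port rather than the disk itself.
smart_crc() {
    smartctl -A "/dev/$1" | grep -i crc
}

# Typical use (placeholder device name):
#   smart_start ada4 long
#   smart_report ada4
#   smart_crc ada4
```

Comparing the CRC counter before and after swapping the SATA cable or port is one way to tell a link problem from a failing disk.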
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
I'm not overly worried about the USB boot drives failing since there is the redundant backup and I do have a config backup, and I am out of SATA ports anyway, so USB flash drives seemed easier than USB SSDs or a SATA expansion card.
Boot drives don't even need redundancy or backup really as long as you have the config file backed up. Everything on it can be "regenerated" with a simple reinstall -> restore config. A process which takes 5-10 mins at most.

The reason for using a USB SSD as opposed to a flash drive is exactly what you're experiencing. It's not that you can't use flash drives, it's just discouraged because they tend to die fast. An SSD will typically last many times longer, so you don't have to worry about replacing it. I myself am still running on a single 60GB SSD from 10+ years ago, when they still made those small sizes.
 
Joined
Jan 7, 2015
Messages
1,155
Updating took forever on USB boot disks too. I found 16GB SSDs on eBay and have been running them for many years.

But now this post has me wondering if you can use an SSD in a USB3 enclosure and boot from it? I never tried it before. Might be the stuff.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
if you can use an SSD in a USB3 enclosure and boot from it? I never tried it before. Might be the stuff.
Absolutely it can and has been recommended by many here.

Obviously there are different enclosures and not all are great, but compared to USB sticks, you can't really go wrong: almost any of them will be an improvement.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Updating took forever on USB boot disks too. I found 16GB SSDs on eBay and have been running them for many years.

But now this post has me wondering if you can use an SSD in a USB3 enclosure and boot from it? I never tried it before. Might be the stuff.
Yes, that's because USB sticks have very slow write speeds (even if it was USB3). A real SSD in a USB3 enclosure will have much faster speeds.
 

djdwosk97

Patron
Joined
Jun 12, 2015
Messages
382
You can do that too. But you can still "offline" the drive from the pool while you do the SMART tests and check to see if the physical drive is being detected by the server.
So I did an Offline > Restart > Online and the drive still wouldn't register as connected. Then I shut down the server, and when I started it back up I tried again just for the hell of it. The drive is now registering as connected and the SMART status came back clean, but zpool status still shows an error. Should I replace this drive, or could the error have been caused by the SATA cable/port, given that the SMART results are clean? (I haven't run the long SMART test yet.)
root@freenas:~ # zpool status
pool: VolZ2
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://illumos.org/msg/ZFS-8000-9P
scan: resilvered 682M in 0 days 00:00:49 with 0 errors on Thu Jun 8 20:36:02 2023

config:
NAME                                            STATE     READ WRITE CKSUM
VolZ2                                           ONLINE       0     0     0
  raidz2-0                                      ONLINE       0     0     0
    gptid/b3ab1b39-0042-11eb-b25c-002590d85eb3  ONLINE       0     0     0
    gptid/b3b9befa-0042-11eb-b25c-002590d85eb3  ONLINE       0     0     0
    gptid/b3abd75d-0042-11eb-b25c-002590d85eb3  ONLINE       0     0     0
    gptid/b3c80ea3-0042-11eb-b25c-002590d85eb3  ONLINE       0     0     0
    gptid/b3cf8e88-0042-11eb-b25c-002590d85eb3  ONLINE       0     0     1
errors: No known data errors

Also, the previously DEGRADED boot pool is now showing as online and resilvering.
pool: freenas-boot
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Thu Jun 8 20:32:46 2023
3.96G scanned at 6.88M/s, 1.18G issued at 2.06M/s, 3.96G total
1.18G resilvered, 29.94% done, no estimated completion time
config:
NAME                                            STATE     READ WRITE CKSUM
freenas-boot                                    ONLINE       0     0     0
  mirror-0                                      ONLINE       0     0     0
    gptid/930a3aa0-dc0e-11e5-af2a-002590d85eb3  ONLINE       0     0    10
    gptid/93305624-dc0e-11e5-af2a-002590d85eb3  ONLINE       0     0     0
errors: No known data errors
EDIT: The boot pool status has now also changed to report an unrecoverable error:
Boot pool status is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
 
Joined
Jan 7, 2015
Messages
1,155
Yes, that's because USB sticks have very slow write speeds (even if it was USB3). A real SSD in a USB3 enclosure will have much faster speeds.
Exactly, I just haven't had a need to try it.
 
Joined
Jan 7, 2015
Messages
1,155
So I did an Offline > Restart > Online and the drive still wouldn't register as connected. Then I shut down the server and when I restarted it I tried again just for the hell of it and the drive is now registering as connected and the SMART status came back clean, but the zpool status shows an error still. Should I be replacing this drive or could the error have been caused by the SATA cable/port if the SMART results are clean? (I didn't yet run the long SMART test)
You can clear the errors if you think it was a cabling issue. But if and when any errors return, I'd get another disk on order or start an RMA.
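If you do clear the errors, a small filter over `zpool status` output makes it easier to notice when the counters come back. This is a sketch rather than an official tool: `nonzero_errors` is a hypothetical name, and it assumes the plain-text column layout that `zpool status` prints (NAME STATE READ WRITE CKSUM).

```shell
# Sketch only: list pool members whose READ, WRITE or CKSUM counter is non-zero.
# Reads `zpool status` text on stdin, so it also works on saved output.
# nonzero_errors is a hypothetical helper name.
nonzero_errors() {
    awk '
        # The config table starts at the column header line.
        $1 == "NAME" && $2 == "STATE" { in_cfg = 1; next }
        # The "errors:" line ends the table for each pool.
        $1 == "errors:" { in_cfg = 0 }
        in_cfg && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            print $1, "read=" $3, "write=" $4, "cksum=" $5
        }
    '
}

# Typical use on the live system (assumes the zpool CLI is available):
#   zpool clear VolZ2                  # reset the counters on the pool
#   zpool status | nonzero_errors      # later: prints nothing if all counters are 0
```

An empty result after a scrub suggests the earlier error was transient; the same device showing up again is the cue to order a replacement.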
 