SOLVED Degraded disk help

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
[screenshots attached: pool status]

Am I screwed? I have only one spare disk.

How can I reduce the size of the pool without compromising the data?

When it says "DEGRADED", does it really mean it's "offline"? Am I just operating off of sda, sdc, and sdd, or are they all still working?

Can I just pull out the disks that are degraded?

[screenshot attached]


Can I put the one spare in a USB enclosure and rebuild the array with 4 disks without data loss?

Please help
 
Last edited:

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Would performing a

zpool replace CHUNK sdb sdf

then a

zpool remove CHUNK sde

allow me to recover from the degraded state and give me back a healthy RAIDZ1 with 4 disks? I'm not sure about the remove, as the example shows removing a mirror, not a specific drive.

Please help


In the USB enclosure:
Code:
Disk /dev/sdf: 14.55 TiB, 16000900661248 bytes, 3906469888 sectors
Disk model:
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 33550336 bytes
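
Before doing the replace, I assume I should double-check that the new disk is at least as large as the disks already in the pool; I was planning to compare with something like this (just my guess at the right command):
Code:
sudo lsblk -b -o NAME,SIZE,MODEL /dev/sd?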
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
You really should be using the GUI for the replacement, as there are setup details taken care of for you.

Next, you cannot remove a column disk from a RAID-Zx. It is not possible (nor planned) to reduce a pool that has RAID-Zx vDevs.

It would be wise to perform troubleshooting before replacing the drive. You don't list your hardware (per forum rules), so any troubleshooting we can suggest would be guesses.

Last, using a USB enclosure for a disk drive in an active ZFS pool is not a good solution: in the short term it is probably okay, but the replacement can fail because of the USB enclosure.
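
For reference, the CLI equivalents look roughly like this (device names are only examples; the GUI also handles the partitioning and refers to disks by partition GUID for you):
Code:
# Replace a failing member with a new disk -- roughly what the GUI "Replace" does,
# except the GUI also creates the partitions and uses the partition GUID:
zpool replace CHUNK sdb sdf

# This, however, will be refused: a member disk of a RAID-Z1 vDev
# cannot be removed from the pool, only replaced.
zpool remove CHUNK sde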
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
You really should be using the GUI for the replacement, as there are setup details taken care of for you.

Next, you cannot remove a column disk from a RAID-Zx. It is not possible (nor planned) to reduce a pool that has RAID-Zx vDevs.

It would be wise to perform troubleshooting before replacing the drive. You don't list your hardware (per forum rules), so any troubleshooting we can suggest would be guesses.

Last, using a USB enclosure for a disk drive in an active ZFS pool is not a good solution: in the short term it is probably okay, but the replacement can fail because of the USB enclosure.
As you can see above, I'm using the GUI for the first replacement, so I didn't use the command line.

I have no idea how to troubleshoot it :(

So after this replace op finishes, I will have only one degraded drive left.

I HAD to use an external, as I only have 5 bays and they were all taken by a drive... when I tried to physically remove one of the degraded drives, the pool would disappear, so I left all the original drives in and added the external for the replacement. As soon as it finishes, I was going to swap it with the one it replaced.

Did I do it wrong?

[screenshot attached]
 
Last edited:

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
I assumed one could offline a degraded disk and "resilver" onto the remaining disks, so long as the parity info could be redistributed amongst the pool...

So once this resilvering is done, can I remove the current sdb drive and put sdf into the same physical slot, or will the pool act weird?
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
I have no idea what any of this jargon means - no idea what a vDev is.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
No, you cannot "offline" a degraded disk and resilver onto the remaining disks. ZFS does not allow removing a column disk from a RAID-Zx vDev.

Yes, when the resilver is complete, you can remove the current sdb drive and put sdf in its place. Should be fine.

A ZFS vDev is a Virtual Device inside a ZFS pool. You must have at least one data vDev:
  • Single disk, (which implies no redundancy)
  • RAID-Zx
  • Mirror
  • DRAID
In addition, there are other vDevs for special purposes (which the average new user does not need). These exist alongside the data vDevs above:
  • Cache
  • SLOG
  • Special / Metadata
In your case, you have a single RAID-Z1 vDev in your "CHUNK" Pool, with 5 disks in that RAID-Z1.
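
As a purely illustrative sketch (made-up device names - not something to run on your system), those data vDev types map to pool creation like this:
Code:
# One data vDev made of a single disk -- no redundancy:
zpool create tank sda

# One RAID-Z1 data vDev built from five disks -- the layout of your CHUNK pool:
zpool create CHUNK raidz1 sda sdb sdc sdd sde

# One mirror data vDev plus a separate SLOG vDev (a special-purpose vDev):
zpool create tank mirror sda sdb log sdc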

Not sure why your pool disappeared when you removed one of the degraded disks.

Can you put the output of zpool status -v CHUNK in code tags?
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Code:
nothing@truenas:/$ sudo zpool status -v CHUNK
  pool: CHUNK
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 3.45T in 08:02:07 with 0 errors on Sat Jul 22 14:50:45 2023
config:

        NAME                                      STATE     READ WRITE CKSUM
        CHUNK                                     DEGRADED     0     0     0
          raidz1-0                                DEGRADED     0     0     0
            b776a493-8c25-4999-988a-5570ec8afa01  ONLINE       0     0     0
            6415b76b-174e-48ae-9d72-4a0e1dc82ece  ONLINE       0     0     0
            6318b353-b38b-4914-be7a-8c99f90f6063  ONLINE       0     0     0
            287a1658-b6a0-4c11-b486-10973def5afd  DEGRADED     0     0     0  too many errors
            3538601d-ce05-47ff-8465-9cc978c23fbe  ONLINE       0     0     0

errors: No known data errors
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
I don't have another disk, but I can put the first replaced disk in the USB enclosure for troubleshooting purposes.
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Sorry, it didn't disappear, but all the disks went into the "Add to pool" count after I removed the degraded disk (pre-replace operation)... but maybe I got that wrong. Sorry, I'm a newbie with TrueNAS and ZFS.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

You should really stop doing things as you feel like it, and ask if you don't know.
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59

You should really stop doing things as you feel like it, and ask if you don't know.
I did ask but it took a while to get a reply.

Anyway, I read a lot about ZFS before posting, which is how I found the zpool commands.

I set up the system just fine not knowing what I was doing, and I guess that gave me the confidence to resolve this. :D Also, I have several years of Linux experience, so there's a bit of pride at play too - I want to solve it ASAP. It's my modus operandi.

The last action I took was ok'd above by Arwen.
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
If I put the replaced drive (former sdb) into the enclosure and the new drive (former sdf) into the former sdb's slot... it becomes UNAVAIL somehow. :(

I've put everything back the way it was - new drive back in the enclosure and the degraded, replaced drive sdb back into its slot - and now everything is there again. So odd.

Current state:
[screenshot attached]
 
Last edited:

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Looks like just putting the replacement drive into the replaced drive's "bay" might not always work.

[screenshot attached]

I guess this is why it was marked "UNAVAIL" when I swapped them after the replace/resilver op.
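
If I understand it right, something like this should show which physical device each of those partition GUIDs currently maps to (not sure if there's a nicer way to see it in the GUI):
Code:
sudo zpool status -v CHUNK
sudo lsblk -o NAME,SIZE,SERIAL,PARTUUID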
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Running a scrub now to clean up the status of the pool, as it's red. However, I still have one degraded disk and one replaced disk I need to sort out...

1) How can I get sdf (USB enclosure) into sdb's bay (already replaced, and it now shows up in the ADD TO POOL section) without breaking the CHUNK pool?

2) How can I troubleshoot the degraded disk sdc?
[screenshot attached]
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
All of the syslog entries for SDC - am I panicking for no reason?

I do see some errors :(

Can I simply run zpool clear CHUNK sdc?

Code:
nothing@truenas:/var/log$ sudo grep sdc syslog |grep -i err
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 28002023832 op 0x0:(READ) flags 0x700 phys_seg 2 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 28002060968 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 17319470272 op 0x0:(READ) flags 0x700 phys_seg 32 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 17319472512 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 28002023904 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 14037996648 op 0x0:(READ) flags 0x700 phys_seg 32 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 27975287360 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 28002089952 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 17319472320 op 0x0:(READ) flags 0x700 phys_seg 3 prio class 0
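
In case it matters, this is roughly what I was planning to run to check the drive itself before touching anything (going by the docs, so please correct me if it's wrong):
Code:
# SMART health and error log for the degraded disk
sudo smartctl -a /dev/sdc

# Long self-test, then check the result once it finishes
sudo smartctl -t long /dev/sdc
sudo smartctl -l selftest /dev/sdc

# Only if the drive looks healthy: clear the error counters on that pool member
# (the device may need to be given as the partition GUID shown in zpool status)
sudo zpool clear CHUNK sdc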
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
I have no idea how to troubleshoot it :(
Let's start with a description of your setup…
We're especially interested in knowing if this is bare metal or virtualised (danger ahead!) and to what controller the drives are attached. Also the model of the drives, but we'll get that from the next step, namely the output of smartctl -a /dev/sdX for each drive (and nicely presented within CODE tags for readability, please).
[screenshot: the CODE tag button in the post editor]

I set up the system just fine not knowing what I was doing, and I guess that gave me the confidence to resolve this. :D Also, I have several years of Linux experience, so there's a bit of pride at play too - I want to solve it ASAP. It's my modus operandi.
"Hold my beer!" LTT style? This is a great, proven way of losing data.
If the content of this pool is of any relevance to you, begin by making a backup before attempting any change that cannot be reversed.
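
Something along these lines will collect everything in one go (adjust the device letters to match what lsblk shows on your system):
Code:
for d in /dev/sd[a-f]; do
    echo "===== $d ====="
    sudo smartctl -a "$d"
done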
 