SOLVED Degraded disk help

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
[screenshots attached: pool status]

Am I screwed? I have only one spare disk.

How can I reduce the size of the pool without compromising the data?

When it says "DEGRADED", does it really mean it's "offline"? Am I just operating off of sda, sdc, and sdd, or are they all still working?

Can I just pull out the disks that are degraded?

[screenshot attached]


Can I put the one spare in a USB enclosure and rebuild the array with 4 disks without data loss?

Please help
 
Last edited:

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Would performing a

zpool replace CHUNK sdb sdf

then a

zpool remove CHUNK sde

allow me to recover from the degraded state and give me back a healthy RAIDZ1 with 4 disks? I'm not sure about the remove, as the example shows removing a mirror, not a specific drive.

Please help


In the USB enclosure:
Code:
Disk /dev/sdf: 14.55 TiB, 16000900661248 bytes, 3906469888 sectors
Disk model:
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 33550336 bytes
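
Before doing the replace, I assume I should double-check that the new disk is at least as large as the disks already in the pool; I was planning to compare with something like this (just my guess at the right command):
Code:
sudo lsblk -b -o NAME,SIZE,MODEL /dev/sd?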
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
You really should be using the GUI for the replacement, as there are setup details taken care of for you.

Next, you cannot remove a column disk from a RAID-Zx. It is not possible (nor planned) to reduce a pool that has RAID-Zx vDevs.

It would be wise to perform troubleshooting before replacing the drive. You don't list your hardware (per forum rules), so any troubleshooting we can suggest would be guesses.

Last, using a USB enclosure for a disk drive in an active ZFS pool is not a good solution: in the short term it is probably okay, but the replacement can fail because of the USB enclosure.
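
For reference, the CLI equivalents look roughly like this (device names are only examples; the GUI also handles the partitioning and refers to disks by partition GUID for you):
Code:
# Replace a failing member with a new disk -- roughly what the GUI "Replace" does,
# except the GUI also creates the partitions and uses the partition GUID:
zpool replace CHUNK sdb sdf

# This, however, will be refused: a member disk of a RAID-Z1 vDev
# cannot be removed from the pool, only replaced.
zpool remove CHUNK sde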
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
You really should be using the GUI for the replacement, as there are setup details taken care of for you.

Next, you cannot remove a column disk from a RAID-Zx. It is not possible (nor planned) to reduce a pool that has RAID-Zx vDevs.

It would be wise to perform troubleshooting before replacing the drive. You don't list your hardware (per forum rules), so any troubleshooting we can suggest would be guesses.

Last, using a USB enclosure for a disk drive in an active ZFS pool is not a good solution: in the short term it is probably okay, but the replacement can fail because of the USB enclosure.
As you can see above, I'm using the GUI for the first replacement, so I didn't use the command line.

I have no idea how to troubleshoot it :(

So after this replace op finishes, I will have only one degraded drive left.

I HAD to use an external, as I only have 5 bays and they were all taken by a drive... when I tried to physically remove one of the degraded drives, the pool would disappear, so I left all the original drives in and added the external for the replacement. As soon as it finishes, I was going to swap it with the one it replaced.

Did I do it wrong?

[screenshot attached]
 
Last edited:

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
I assumed one could offline a degraded disk and "resilver" onto the remaining disks, so long as the parity info could be redistributed amongst the pool...

So once this resilvering is done, can I remove the current sdb drive and put sdf into the same physical slot, or will the pool act weird?
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
I have no idea what any of this jargon means - no idea what a vDev is.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
No, you cannot "offline" a degraded disk and resilver onto the remaining disks. ZFS does not allow removing a column disk from a RAID-Zx vDev.

Yes, when the resilver is complete, you can remove the current sdb drive and put sdf in its place. Should be fine.

A ZFS vDev is a Virtual Device inside a ZFS pool. You must have at least one data vDev:
  • Single disk, (which implies no redundancy)
  • RAID-Zx
  • Mirror
  • DRAID
In addition, there are other vDevs for special purposes (which the average new user does not need). These exist alongside the data vDevs above:
  • Cache
  • SLOG
  • Special / Metadata
In your case, you have a single RAID-Z1 vDev in your "CHUNK" Pool, with 5 disks in that RAID-Z1.
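
As a purely illustrative sketch (made-up device names - not something to run on your system), those data vDev types map to pool creation like this:
Code:
# One data vDev made of a single disk -- no redundancy:
zpool create tank sda

# One RAID-Z1 data vDev built from five disks -- the layout of your CHUNK pool:
zpool create CHUNK raidz1 sda sdb sdc sdd sde

# One mirror data vDev plus a separate SLOG vDev (a special-purpose vDev):
zpool create tank mirror sda sdb log sdc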

Not sure why your pool disappeared when you removed one of the degraded disks.

Can you put the output of zpool status -v CHUNK in code tags?
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Code:
nothing@truenas:/$ sudo zpool status -v CHUNK
  pool: CHUNK
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 3.45T in 08:02:07 with 0 errors on Sat Jul 22 14:50:45 2023
config:

        NAME                                      STATE     READ WRITE CKSUM
        CHUNK                                     DEGRADED     0     0     0
          raidz1-0                                DEGRADED     0     0     0
            b776a493-8c25-4999-988a-5570ec8afa01  ONLINE       0     0     0
            6415b76b-174e-48ae-9d72-4a0e1dc82ece  ONLINE       0     0     0
            6318b353-b38b-4914-be7a-8c99f90f6063  ONLINE       0     0     0
            287a1658-b6a0-4c11-b486-10973def5afd  DEGRADED     0     0     0  too many errors
            3538601d-ce05-47ff-8465-9cc978c23fbe  ONLINE       0     0     0

errors: No known data errors
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
I don't have another disk, but I can put the first replaced disk in the USB enclosure for troubleshooting purposes.
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Sorry, it didn't disappear, but all the disks went into the "Add to pool" count after I removed the degraded disk (pre-replace operation)... but maybe I got that wrong. Sorry, I'm a newbie with TrueNAS and ZFS.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

You should really stop doing things as you feel like it, and ask if you don't know.
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59

You should really stop doing things as you feel like it, and ask if you don't know.
I did ask but it took a while to get a reply.

Anyway, I read a lot about ZFS before posting, which is how I found the zpool commands.

I set up the system just fine not knowing what I was doing, and I guess that gave me the confidence to resolve this. :D Also, I have several years of Linux experience, so there's a bit of pride at play too - I want to solve it ASAP. It's my modus operandi.

The last action I took was ok'd above by Arwen.
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
If I put the replaced drive (former sdb) into the enclosure and the new drive (former sdf) into the former sdb's slot... it becomes UNAVAIL somehow. :(

I've put everything back the way it was - new drive back in the enclosure and the degraded, replaced drive sdb back into its slot - and now everything is there again. So odd.

Current state:
[screenshot attached]
 
Last edited:

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Looks like just putting the replacement drive into the replaced drive's "bay" might not always work.

[screenshot attached]

I guess this is why it was marked "UNAVAIL" when I swapped them after the replace/resilver op.
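
If I understand it right, something like this should show which physical device each of those partition GUIDs currently maps to (not sure if there's a nicer way to see it in the GUI):
Code:
sudo zpool status -v CHUNK
sudo lsblk -o NAME,SIZE,SERIAL,PARTUUID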
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
Running a scrub now to clean up the status of the pool, as it's red. However, I still have one degraded disk and one replaced disk I need to sort out...

1) How can I get sdf (USB enclosure) into sdb's bay (already replaced, and it now shows up in the ADD TO POOL section) without breaking the CHUNK pool?

2) How can I troubleshoot the degraded disk sdc?
[screenshot attached]
 

nothing

Explorer
Joined
Jun 4, 2023
Messages
59
All of the syslog entries for SDC - am I panicking for no reason?

I do see some errors :(

Can I simply run zpool clear CHUNK sdc?

Code:
nothing@truenas:/var/log$ sudo grep sdc syslog |grep -i err
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 28002023832 op 0x0:(READ) flags 0x700 phys_seg 2 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 28002060968 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 17319470272 op 0x0:(READ) flags 0x700 phys_seg 32 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 17319472512 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 28002023904 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 14037996648 op 0x0:(READ) flags 0x700 phys_seg 32 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 27975287360 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 28002089952 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Jul 21 10:22:30 truenas kernel: blk_update_request: I/O error, dev sdc, sector 17319472320 op 0x0:(READ) flags 0x700 phys_seg 3 prio class 0
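
In case it matters, this is roughly what I was planning to run to check the drive itself before touching anything (going by the docs, so please correct me if it's wrong):
Code:
# SMART health and error log for the degraded disk
sudo smartctl -a /dev/sdc

# Long self-test, then check the result once it finishes
sudo smartctl -t long /dev/sdc
sudo smartctl -l selftest /dev/sdc

# Only if the drive looks healthy: clear the error counters on that pool member
# (the device may need to be given as the partition GUID shown in zpool status)
sudo zpool clear CHUNK sdc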
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
I have no idea how to troubleshoot it :(
Let's start with a description of your setup…
We're especially interested in knowing if this is bare metal or virtualised (danger ahead!) and to what controller the drives are attached. Also the model of the drives, but we'll get that from the next step, namely the output of smartctl -a /dev/sdX for each drive (and nicely presented within CODE tags for readability, please).
[screenshot: the CODE tag button in the post editor]

I set up the system just fine not knowing what I was doing, and I guess that gave me the confidence to resolve this. :D Also, I have several years of Linux experience, so there's a bit of pride at play too - I want to solve it ASAP. It's my modus operandi.
"Hold my beer!" LTT style? This is a great, proven way of losing data.
If the content of this pool is of any relevance to you, begin by making a backup before attempting any change that cannot be reversed.
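
Something along these lines will collect everything in one go (adjust the device letters to match what lsblk shows on your system):
Code:
for d in /dev/sd[a-f]; do
    echo "===== $d ====="
    sudo smartctl -a "$d"
done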
 