Pool created via CLI, RMA disk replaced via GUI, extra 2G Linux swap created on the replacement drive, making its large ZFS partition smaller than the other pool members'

Sgt_Bizkit

Cadet
Joined
Apr 29, 2023
Messages
3
Hello All, (my first post)

Version:
TrueNAS-SCALE-22.02.4
Running within VMware with direct physical access to the disks.
(This is done so I can dual-boot directly into TrueNAS SCALE if I want, and it continues to function normally as a backup OS.)

I created a pool via the CLI (due to budget constraints) using:
3 x 18 TB disks
2 x fake 18 TB disks (offlined)
to create a degraded RAIDZ2.

I later added 2 more 18 TB disks via the CLI, replacing the offlined fake disks, and all was well.

5 months later, 1 disk encountered uncorrectable sectors, so I offlined it and used the GUI to replace the disk. (In hindsight, I should have used the CLI again.)

I didn't notice any immediate issue, and thought I'd check the drive layouts using "fdisk --list"; I saw that the replacement had a 2G Linux swap partition created, whereas the others did not.
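(As an aside, a quicker way to inspect a single drive, rather than paging through the full listing, is to point fdisk at just that device, or use lsblk's partition-type column; both assume the usual util-linux tools that ship with SCALE:)

Code:
# Inspect only the replacement disk
fdisk -l /dev/sdc

# Same information in a compact form
lsblk -o NAME,SIZE,PARTTYPENAME /dev/sdc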

It started the resilvering process without issue.

This means the large ZFS data partition on the replacement would be smaller than the other pool members' partitions, so I imagine the resilver will fail near 100% (currently at 81%, 6 hours to go).
However, I'm not 100% sure, as some people state the swap is created in case other disks vary slightly in size per manufacturer, and that it's advisable to keep it;
others state it's used for swapping out RAM, or for backward compatibility with CORE (lots of interpretations).

My question:
Should I disable swap using "midclt call system.advanced.update '{"swapondrive": 0}'", then offline the RMA drive, delete its partitions, and let TrueNAS resilver it again?
Or should I see whether the resilver takes with the 2G swap partition in place?

The RMA drive is /dev/sdc (Disk 2 in the Windows screenshot below).

Code:
root@truenas[~]# fdisk --list
Disk /dev/sdd: 16.37 TiB, 18000207937536 bytes, 35156656128 sectors
Disk model: VMware Virtual S
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: E903C3E2-0EC7-D140-BF00-207392D0F46B

Device            Start          End      Sectors  Size Type
/dev/sdd1          2048  35156637695  35156635648 16.4T Solaris /usr & Apple ZFS
/dev/sdd9   35156637696  35156654079        16384    8M Solaris reserved 1

Disk /dev/sdc: 16.37 TiB, 18000207937536 bytes, 35156656128 sectors
Disk model: VMware Virtual S
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: D6F9D8A4-5383-4C84-AA23-C2147ADE4674

Device            Start          End      Sectors  Size Type
/dev/sdc1           128      4194304      4194177    2G Linux swap
/dev/sdc2       4194432  35156656094  35152461663 16.4T Solaris /usr & Apple ZFS

Disk /dev/sdb: 16.37 TiB, 18000207937536 bytes, 35156656128 sectors
Disk model: VMware Virtual S
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: A87EAFD6-7424-A940-BB1A-8B1CC3EA0685

Device            Start          End      Sectors  Size Type
/dev/sdb1          2048  35156637695  35156635648 16.4T Solaris /usr & Apple ZFS
/dev/sdb9   35156637696  35156654079        16384    8M Solaris reserved 1

Disk /dev/sdf: 16.37 TiB, 18000207937536 bytes, 35156656128 sectors
Disk model: VMware Virtual S
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: EA6E172A-78E0-5B41-BB22-A1967F989676

Device            Start          End      Sectors  Size Type
/dev/sdf1          2048  35156637695  35156635648 16.4T Solaris /usr & Apple ZFS
/dev/sdf9   35156637696  35156654079        16384    8M Solaris reserved 1

Disk /dev/sda: 120 GiB, 128849018880 bytes, 251658240 sectors
Disk model: VMware Virtual S
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: B154E109-E16F-4562-AD84-875DDE6BA813

Device          Start        End    Sectors   Size Type
/dev/sda1        4096       6143       2048     1M BIOS boot
/dev/sda2        6144    1054719    1048576   512M EFI System
/dev/sda3    34609152  251658206  217049055 103.5G Solaris /usr & Apple ZFS
/dev/sda4     1054720   34609151   33554432    16G Linux swap

Partition table entries are not in disk order.

Disk /dev/sde: 16.37 TiB, 18000207937536 bytes, 35156656128 sectors
Disk model: VMware Virtual S
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: CE1CB627-ADA3-9744-BAB6-AD59F74C855D

Device            Start          End      Sectors  Size Type
/dev/sde1          2048  35156637695  35156635648 16.4T Solaris /usr & Apple ZFS
/dev/sde9   35156637696  35156654079        16384    8M Solaris reserved 1

Disk /dev/mapper/sda4: 16 GiB, 17179869184 bytes, 33554432 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
root@truenas[~]#
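Doing the arithmetic on the sector counts above, the replacement's ZFS partition (sdc2) does come out about 2 GiB smaller than the other members' (sdd1/sdb1/sdf1/sde1), which is roughly the size of the swap partition:

Code:
# 512-byte sectors: other members' ZFS partition minus the replacement's
echo $(( 35156635648 - 35152461663 ))    # 4173985 sectors
echo $(( 4173985 * 512 / 1024 / 1024 ))  # ~2038 MiB, i.e. about the 2G swap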

[Screenshot: Windows Disk Management view of the disks; the RMA drive is Disk 2]



Checking my old notes, I used these commands when creating the original pool:
Code:
truncate -s 18000207937536 /tmp/FD1.img
truncate -s 18000207937536 /tmp/FD2.img
zpool create StoragePool -o ashift=12 -f raidz2 /dev/sdd /dev/sdc /dev/sdb /tmp/FD1.img /tmp/FD2.img
zpool offline StoragePool /tmp/FD1.img
zpool offline StoragePool /tmp/FD2.img

root@truenas[~]# zpool status StoragePool
  pool: StoragePool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
config:

        NAME              STATE     READ WRITE CKSUM
        StoragePool       DEGRADED     0     0     0
          raidz2-0        DEGRADED     0     0     0
            sdf           ONLINE       0     0     0
            sde           ONLINE       0     0     0
            sdd           ONLINE       0     0     0
            /tmp/FD1.img  OFFLINE      0     0     0
            /tmp/FD2.img  OFFLINE      0     0     0

zpool replace StoragePool -f /tmp/FD2.img /dev/sdb
zpool online StoragePool /dev/sdb
zpool replace StoragePool -f /tmp/FD1.img /dev/sdc
zpool online StoragePool /dev/sdc
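For what it's worth, if I end up redoing the replacement from the CLI, my understanding is that matching the other members' single-partition layout would look something like the sketch below (the sgdisk type code, sector numbers, and by-partuuid path are my assumptions based on the fdisk output above; double-check before running anything):

Code:
# Wipe the RMA drive's GPT and recreate a single ZFS partition matching the others
sgdisk --zap-all /dev/sdc
sgdisk -n1:2048:35156637695 -t1:BF01 /dev/sdc   # BF01 = Solaris /usr & Apple ZFS
partprobe /dev/sdc                              # re-read the partition table

# Then replace the old member (by GUID or label) with the new partition by partuuid
zpool replace StoragePool <old-guid> /dev/disk/by-partuuid/<new-partuuid>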


Thank you for taking the time to read & reply
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
This means the large ZFS data partition on the replacement would be smaller than the other pool members' partitions, so I imagine the resilver will fail near 100% (currently at 81%, 6 hours to go).
I doubt the former is the case, and I think it's highly unlikely the latter is the case--if the remaining partition were too small, the replace operation should have failed immediately.
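You can see the up-front size check for yourself with throwaway file vdevs (a purely illustrative test pool, nothing to do with your StoragePool):

Code:
# Build a toy mirror from sparse files, then try an undersized replacement
truncate -s 1G /tmp/a.img /tmp/b.img
truncate -s 512M /tmp/small.img
zpool create testpool mirror /tmp/a.img /tmp/b.img
zpool replace testpool /tmp/b.img /tmp/small.img
# expected: cannot replace /tmp/b.img with /tmp/small.img: device is too small
zpool destroy testpool && rm /tmp/a.img /tmp/b.img /tmp/small.img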
 

Sgt_Bizkit

Cadet
Joined
Apr 29, 2023
Messages
3
Hi danb35,

I waited for the resilver to complete, and it did complete.

However, it appears to be starting over again, and not showing exactly as before in the CLI (it doesn't show which disk it is resilvering, but I can see it's the RMA drive being written to again).

Any suggestions on how to proceed?

#########################################

Code:
root@truenas[~]# zpool status StoragePool
  pool: StoragePool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Apr 29 04:02:20 2023
        64.7T scanned at 546M/s, 64.6T issued at 546M/s, 64.7T total
        12.9T resilvered, 99.98% done, 00:00:21 to go
config:


        NAME                                      STATE     READ WRITE CKSUM
        StoragePool                               ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            3cfbdc2e-fd63-4c48-8ed7-29726666cbf3  ONLINE       0     0     0
            c02af95d-028a-4324-a95e-5d1a32c7259b  ONLINE       0     0     0  (resilvering)
            sdb                                   ONLINE       0     0     0
            b8bb1e1f-c388-8648-a7f0-6da6b08b169e  ONLINE       0     0     0
            352b02bc-488d-cc41-8cbf-32995ea61f72  ONLINE       0     0     0

errors: No known data errors


root@truenas[~]# zpool status StoragePool
  pool: StoragePool
 state: ONLINE
  scan: resilvered 12.9T in 1 days 10:36:25 with 0 errors on Sun Apr 30 14:38:45 2023
config:


        NAME                                      STATE     READ WRITE CKSUM
        StoragePool                               ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            3cfbdc2e-fd63-4c48-8ed7-29726666cbf3  ONLINE       0     0     0
            c02af95d-028a-4324-a95e-5d1a32c7259b  ONLINE       0     0     0
            sdb                                   ONLINE       0     0     0
            b8bb1e1f-c388-8648-a7f0-6da6b08b169e  ONLINE       0     0     0
            352b02bc-488d-cc41-8cbf-32995ea61f72  ONLINE       0     0     0


errors: No known data errors


root@truenas[~]# zpool status StoragePool
  pool: StoragePool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Apr 30 14:38:50 2023
        871G scanned at 12.8G/s, 641M issued at 9.42M/s, 64.7T total
        0B resilvered, 0.00% done, no estimated completion time
config:


        NAME                                      STATE     READ WRITE CKSUM
        StoragePool                               ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            3cfbdc2e-fd63-4c48-8ed7-29726666cbf3  ONLINE       0     0     0
            c02af95d-028a-4324-a95e-5d1a32c7259b  ONLINE       0     0     0
            sdb                                   ONLINE       0     0     0
            b8bb1e1f-c388-8648-a7f0-6da6b08b169e  ONLINE       0     0     0
            352b02bc-488d-cc41-8cbf-32995ea61f72  ONLINE       0     0     0


errors: No known data errors



 

Sgt_Bizkit

Cadet
Joined
Apr 29, 2023
Messages
3
For those wondering about the outcome of my endeavour:

I used the command midclt call system.advanced.update '{"swapondrive": 0}' to change the swap size to 0 GB for newly added disk members.

I then shut down the VM, reintroduced the disk as "new" hardware (direct physical access), and used the "new" disk to replace the old reference via the GUI.

This time it didn't show the 2 GB swap at the start of the disk:
Code:
Disk /dev/sdc: 16.37 TiB, 18000207937536 bytes, 35156656128 sectors
Disk model: VMware Virtual S
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: CDE4CCD6-9595-4FD8-A80F-7DF4BD7894F7

Device      Start          End      Sectors  Size Type
/dev/sdc1      40  35156656094  35156656055 16.4T Solaris /usr & Apple ZFS
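A quick way to confirm nothing is actively swapping to the data disks is swapon (part of util-linux); only the boot disk's swap (/dev/mapper/sda4 in the earlier fdisk listing) should appear:

Code:
# List active swap devices
swapon --show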



It ran the resilver once; however, the GUI didn't show the percentage or the time-to-finish ETA correctly during this run.

After it completed, it ran the resilver again, and this time the GUI showed all the figures and the ETA correctly.

After being very patient, it completed successfully on the 2nd run and has not restarted the resilver.


Code:
root@truenas[~]# zpool status -xv
  pool: StoragePool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue May  2 01:44:51 2023
        64.7T scanned at 547M/s, 64.6T issued at 547M/s, 64.7T total
        12.9T resilvered, 99.98% done, 00:00:26 to go
config:

        NAME                                      STATE     READ WRITE CKSUM
        StoragePool                               ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            3cfbdc2e-fd63-4c48-8ed7-29726666cbf3  ONLINE       0     0     0
            85e97ab0-ceca-4c32-b1c3-ef36570c58b5  ONLINE       0     0     0  (resilvering)
            sdb                                   ONLINE       0     0     0
            b8bb1e1f-c388-8648-a7f0-6da6b08b169e  ONLINE       0     0     0
            352b02bc-488d-cc41-8cbf-32995ea61f72  ONLINE       0     0     0

root@truenas[~]# zpool status StoragePool
  pool: StoragePool
 state: ONLINE
  scan: resilvered 12.9T in 1 days 10:24:46 with 0 errors on Wed May  3 12:09:37 2023
config:

        NAME                                      STATE     READ WRITE CKSUM
        StoragePool                               ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            3cfbdc2e-fd63-4c48-8ed7-29726666cbf3  ONLINE       0     0     0
            85e97ab0-ceca-4c32-b1c3-ef36570c58b5  ONLINE       0     0     0
            sdb                                   ONLINE       0     0     0
            b8bb1e1f-c388-8648-a7f0-6da6b08b169e  ONLINE       0     0     0
            352b02bc-488d-cc41-8cbf-32995ea61f72  ONLINE       0     0     0

errors: No known data errors
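As an aside for anyone else nursing a long resilver while the GUI lags: zpool status accepts an optional interval argument in OpenZFS 2.x, so progress can be polled from the shell instead:

Code:
# Reprint pool status every 60 seconds until Ctrl-C
zpool status StoragePool 60

# Much the same effect with watch
watch -n 60 zpool status StoragePool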




Closing remarks:
I recall the GUI having the same issues when using the 2 GB swap partition, and when it restarted the resilver again I saw the GUI update correctly,
so it may have completed successfully even with the swap partition on the second run (though that layout was not identical to the partitions of the drive it was replacing). Here is a thread of similar issues, where replacement drives not being partitioned identically caused problems: https://www.truenas.com/community/t...g-error-that-is-not-warned-over-anyway.93735/

Where I got the disable-swap command from:


I believe the default setting is: midclt call system.advanced.update '{"swapondrive": 2}'
(I read it in a comment somewhere; it results in a 2 GB swap partition on each new data disk.)
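To read the current value back before changing anything, the matching config call should work (piping through jq is my own habit and assumes jq is installed):

Code:
# Show the swap size (in GiB) carved out of each new data disk
midclt call system.advanced.config | jq .swapondrive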

3 full resilvers in total (2 with a borked GUI, with and without swap, and 1 with a correct GUI, after a full resilver with no swap).

Very stressful, but I think I am good.
 