SOLVED **Scale - Expand to fill new larger disks errors and chasing my tail** *Error: Partition(s) 1 on /dev/sdb have been written, but we have been unable*

Thebokke

Dabbler
Joined
Aug 3, 2018
Messages
10
For the sake of documentation: I kept running into this issue in the same loop, and this was another problem that came up on the last go-around.

My fix was to resilver back onto another disk, then wipe the new disk and resilver back onto it.
Separate issue, perhaps.

This isn't my exact error, but similar to the one I got.




chef:~ # zpool replace -f tank 12988155072034477206 /dev/disk/by-id/wwn-0x50014ee2b10a5952-part1
cannot replace 12988155072034477206 with /dev/disk/by-id/wwn-0x50014ee2b10a5952-part1: /dev/disk/by-id/wwn-0x50014ee2b10a5952-part1 is busy, or device removal is in progress
Hi, can you elaborate on what you did here, please? Did you resilver back to an old disk (the old smaller size, or one the same size as the newer disks), and then replace this with the new larger disk again?

I'm getting the same issue: I wipe the drive and try to replace the offlined disk (the one shown with a weird number in the GUI under the Storage menu, in my case 14938118949713813089 instead of an sdd device name), and I get this error in the GUI:

Error: [EZFS_BADDEV] cannot replace 14938118949713813089 with /dev/disk/by-partuuid/f01c724a-0e4e-4105-a990-64a9ab06967b: /dev/disk/by-partuuid/f01c724a-0e4e-4105-a990-64a9ab06967b is busy, or device removal is in progress

If I online that disk and reboot, the pool looks healthy, but that disk is referenced differently to the rest (i.e. sdd1), as per the screenshot below. The pool still hasn't expanded to the new size. How frustrating.

[Screenshot: pool status showing one disk listed as sdd1 instead of its partuuid]
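
For anyone else hitting the same "is busy" error, here is a rough, read-only inspection sequence to run before retrying the replace; the pool name below is a placeholder, and sdd1 is just the device from my case:
Code:
# Check how each pool member is currently referenced (partuuid vs sdX name)
lsblk -o NAME,SIZE,PARTUUID

# See which vdev member the pool thinks is offline or unavailable
zpool status -v <poolname>

# Dry run only: report any leftover filesystem/ZFS signatures on the wiped drive
# (-n prints what would be erased, it does not change anything)
wipefs -n /dev/sdd1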
 

Thebokke

Dabbler
Joined
Aug 3, 2018
Messages
10
I had the same issue, but after the initial resilvering, all I did was:
Code:
parted /dev/sdX resizepart 1 100%


Then TrueNAS found the new size and expanded the pool.
Hi @Starblazr, did you have to run this command for each drive in your pool, or just the one with the problem?

Was this resilvering you mentioned when the expand button created the issue, or earlier on in the process when sequentially replacing disks with larger ones?

Thanks for your help.
 
Joined
Jan 1, 2023
Messages
16
Just wanted to chime in that I'm having the exact same issues on SCALE 23.10.0.1. Do I follow @ColeTrain's or @danb35's writeups? All original disks have been removed and replaced with larger drives, and the expand button has broken the pool on 2 occasions (6-disk Z2). Bit of a nightmare; looking for the simplest (and least risky) method. Any advice appreciated.
Hey, just got back from vacation and came back to this thread...

Is your status still the same as your next post? One weird drive and the others properly labeled by partuuid?

Here's what I did when my raidz1 pool was in that state (pretty much following @danb35's instructions):
  1. Open parted on the "broken" drive:
    • Code:
      root@moi-truenas[~]# parted /dev/nvme4n1
  2. Wipe drive and create zfs partition spanning the entire drive (did not create swap because that seems to be a TrueNAS Core thing):
    • Code:
      (parted) mklabel gpt
      (parted) mkpart "" zfs 0% 100%
      (parted) print
      (parted) quit
  3. Check status of drives with lsblk:
    • Code:
      root@moi-truenas[~]# lsblk -o NAME,SIZE,PARTUUID
      NAME            SIZE PARTUUID
      sda            16.4T
      └─sda1         16.4T d32d3b41-6dda-4b1f-9991-3a8aa7498e83
      sdb            16.4T
      └─sdb1         16.4T d1b2a4a1-df0d-4838-b8ac-ce3dab09d792
      sdc            16.4T
      └─sdc1         16.4T 9a3bed09-96e3-4425-89be-135eaee2ab83
      sdd            16.4T
      └─sdd1         16.4T e1c33f96-0b54-4760-8a03-042273327191
      sde            16.4T
      └─sde1         16.4T f9629191-3876-4cf3-8dd4-6ec3b8f54b98
      sdf            16.4T
      └─sdf1         16.4T b3116a23-5af5-435a-8d34-d0eda48ef954
      sdg            16.4T
      └─sdg1         16.4T 629bee9e-87bd-4b9c-af02-859c8405ff11
      sdh            16.4T
      └─sdh1         16.4T 86c70873-83b2-4347-96ff-09976f74e5b4
      zd0              50G
      nvme0n1       110.3G
      └─nvme0n1p1   110.3G 6a1eacb2-4fd5-4b5a-81a3-4eab8dcd1906
      nvme5n1       110.3G
      ├─nvme5n1p1       1M cc0ad83e-e918-4269-ac64-e6bd97e5f4f7
      ├─nvme5n1p2     512M a1118534-fcba-41ee-8a4a-2f76c9f244f3
      ├─nvme5n1p3    93.8G 81143d0c-b3ce-4d83-ab91-f1dddfce215f
      └─nvme5n1p4      16G 15352fea-c651-4da5-aafd-e362346f8dcc
        └─nvme5n1p4    16G
      nvme3n1       110.3G
      └─nvme3n1p1   110.3G 2d5f6510-f20b-42d7-a4fd-1310b9d66804
      nvme4n1         3.6T
      └─nvme4n1p1     3.6T abbd4e79-9621-4d37-b20f-8d938a2c96e4
      nvme1n1         3.6T
      └─nvme1n1p1     3.6T b312fe83-ff0b-461d-81c0-37b60f049c00
      nvme2n1         3.7T
      └─nvme2n1p1     3.7T 930dfe2f-4f49-4770-a3ac-305db7e26acf
  4. Replace drive with new partuuid, for example:
    • Code:
      root@moi-truenas[~]# zpool replace fastpool /dev/nvme4n1 /dev/disk/by-partuuid/abbd4e79-9621-4d37-b20f-8d938a2c96e4
I repeated this for each of my three drives. At the end, because ZFS autoexpand was on, the pool automatically showed the expanded, correct size.
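
If autoexpand isn't already on, or the pool still shows the old size after all the replaces, something along these lines should check and nudge it (using the "fastpool" name from my example; the partuuid is a placeholder):
Code:
# Check whether the pool grows automatically once every vdev member is larger
zpool get autoexpand fastpool

# Turn it on if it isn't
zpool set autoexpand=on fastpool

# Or expand a single member by hand after its partition has been grown
zpool online -e fastpool /dev/disk/by-partuuid/<member-partuuid>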

I restarted the server before starting this sequence of steps, just to make sure that the parted step kicked off by the broken Replace button in the GUI had completely finished running.

Keep in mind that I didn't care about losing the data in my pool because it was already backed up elsewhere; so back up your pool if you can. I ended up not losing any data after all, because only one drive was ever out of whack at a time.
 

Thebokke

Dabbler
Joined
Aug 3, 2018
Messages
10
Hey, just got back from vacation and came back to this thread...
[full post quoted above, trimmed]
Thanks for the reply @jenesuispasbavard, I appreciate it.

I pretty much did the same thing as you, but with an additional step to maintain the redundancy of my pool, and I'm only halfway through. Although most of the data on my raidz2 pool is backed up, the backup is spread across a couple of other pools rather than a single complete copy, and it would be a real pain to set up again (hence the reason I was upgrading to larger drives in the first place, with the old smaller drives to go into a backup server).

Needless to say, I wanted to keep the redundancy of the pool, so I'm replacing each of the larger drives one by one with an older, smaller drive. I then wipe the replaced larger drive, partition it manually as per @danb35's guide, and replace the smaller drive again with the big one. Repeat, repeat, repeat; a bit of a time-consuming process, but it keeps my pool healthy.
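
For anyone wanting to do the same, one cycle of that process looks roughly like this; the pool name, device names and partuuids are placeholders, and the parted steps simply follow @danb35's guide rather than anything official:
Code:
# 1. Swap the problem large drive out for an old smaller one, keeping redundancy
zpool replace tank /dev/disk/by-partuuid/<large-partuuid> /dev/disk/by-partuuid/<small-partuuid>
# ...wait for the resilver to finish (watch zpool status)

# 2. Wipe and repartition the large drive by hand: GPT label, one zfs partition
parted /dev/sdX
(parted) mklabel gpt
(parted) mkpart "" zfs 0% 100%
(parted) quit

# 3. Swap the small drive back out for the freshly partitioned large one
zpool replace tank /dev/disk/by-partuuid/<small-partuuid> /dev/disk/by-partuuid/<new-large-partuuid>
# ...resilver again, then move on to the next disk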

Hoping they fix this error soon, as it's been about 10 solid days of resilvering to get to this point!

Thanks again for the reply.
 

Lil Foams

Cadet
Joined
Feb 10, 2024
Messages
1
Hi @Starblazr, did you have to run this command for each drive in your pool, or just the one with the problem?

Was this resilvering you mentioned when the expand button created the issue, or earlier on in the process when sequentially replacing disks with larger ones?

Thanks for your help.

I literally hunted down the issue discussed in this specific thread for HOURSSSSSS.

I also went out of my way to make an account just to reply to your question; I hope it helps in case you hadn't figured it out yet.

Out of curiosity about your own question, I decided to do exactly this. I used 'print all' in parted to find out which of my drives were still reporting the incorrect size, then ran the command on each and every problem drive to correct it. I immediately jumped for joy when I checked my pool and saw that the total capacity was now correct.
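
For reference, the check-then-fix loop boiled down to something like this (sdX stands in for each affected data disk; be careful not to point it at the boot device):
Code:
# Quick overview of every disk and partition size
lsblk -o NAME,SIZE,PARTUUID

# Or, inside parted, print every device and its partition table
parted
(parted) print all
(parted) quit

# Then, for each data disk whose partition 1 still shows the old size:
parted /dev/sdX resizepart 1 100%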

Thanks everyone that popped into this thread with valuable information, I had to put the pieces together in my head to make it make sense. (I'm sure your instructions were perfectly clear, but my brain doesn't work that way.)
 

Thebokke

Dabbler
Joined
Aug 3, 2018
Messages
10
I literally hunted down the issue discussed in this specific thread for HOURSSSSSS.

I also went out of my way to make an account just to reply to your question; I hope it helps in case you hadn't figured it out yet.

Out of curiosity about your own question, I decided to do exactly this. I used 'print all' in parted to find out which of my drives were still reporting the incorrect size, then ran the command on each and every problem drive to correct it. I immediately jumped for joy when I checked my pool and saw that the total capacity was now correct.

Thanks everyone that popped into this thread with valuable information, I had to put the pieces together in my head to make it make sense. (I'm sure your instructions were perfectly clear, but my brain doesn't work that way.)
Awesome to hear.

I've finally finished replacing each drive, one by one, with a manually partitioned one, and my pool is now showing all of the available capacity! About 10 days of resilvering all in, what a pain, but no data lost and no need to restore from a backup, so all good.
 

flotueur

Dabbler
Joined
Feb 26, 2022
Messages
22
I had the same issue, but after the initial resilvering, all I did was:
Code:
parted /dev/sdX resizepart 1 100%


Then TrueNAS found the new size and expanded the pool.
Thank you, I had the same issue, applied this solution, and it worked flawlessly!
Hopefully the bug fix will be released soon!
 

BillCardiff

Explorer
Joined
May 13, 2014
Messages
59
Thank you so much for this. I was losing my mind. 8-drive pool: started with 8x 8 TB and upgraded to 8x 12 TB, only to watch it fail to expand over and over.
I set the last drive to OFFLINE; running lsblk showed me 3 drives that were stuck at 8 TB, so I ran parted /dev/sd* resizepart 1 100% on each of the three. I used lsblk between each to ensure they all came back up at 12 TB. Then I ONLINE'd the last drive; it came up as an error, so I replaced it with itself (forced). I immediately had the full pool size available, and am watching it resilver right now. I feel confident, as running lsblk now shows 8 drives all at 12 TB with the proper header, block count, and size.
Forgot to mention that I'm running the 24.04 beta; I went to the beta hoping that the bug reported in 23 would be left behind. It was not.
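
In case it helps anyone following along, the rough CLI equivalent of what I did (I used the GUI for the offline/online/replace steps, so the pool name and partuuid below are just placeholders):
Code:
# Take the last member offline (GUI: Storage > pool > disk > Offline)
zpool offline tank /dev/disk/by-partuuid/<partuuid>

# Grow partition 1 on each drive still stuck at the old size, checking with lsblk in between
parted /dev/sdX resizepart 1 100%
lsblk -o NAME,SIZE

# Bring the member back; since it showed an error, replace it with itself (forced)
zpool online tank /dev/disk/by-partuuid/<partuuid>
zpool replace -f tank /dev/disk/by-partuuid/<partuuid>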
 
Joined
Dec 2, 2019
Messages
30
Coming across this thread and running into the same problem.
When I try to run parted /dev/sd* resizepart 1 100% on any of my drives, I get an error saying "Can't have overlapping partitions."
Any ideas?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Examine the partition table (however Linux does that) - there is probably a partition behind the one you want to expand. If it is swap, it can possibly be deleted? But I am not a Linux expert - I could guide you through the process on FreeBSD at 2 am having had 5 beers, sorry. Repeating myself - educated guess: there is another partition in the way.
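
For the Linux side of that, examining the layout is read-only and looks roughly like this (sdX is whichever disk gave the overlap error):
Code:
# Show all partitions and free space on the disk, in MiB
parted /dev/sdX unit MiB print free

# Or list every disk with partition sizes and filesystem types (swap shows up as "swap")
lsblk -o NAME,SIZE,FSTYPE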
 
Joined
Dec 2, 2019
Messages
30
Examine the partition table (however Linux does that) - there is probably a partition behind the one you want to expand. If it is swap, it can possibly be deleted? But I am not a Linux expert - I could guide you through the process on FreeBSD at 2 am having had 5 beers, sorry. Repeating myself - educated guess: there is another partition in the way.
There is actually a swap partition after it. Is it safe to just delete that?…
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
iX, why is the swap partition at the end of the drive?

@timvanhelsdingen someone with a deeper knowledge of SCALE will have to step in.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
iX, why is the swap partition at the end of the drive?
Because someone f-ed up royally writing the partitioning code in 23.10.0 (or maybe it was the prior release).
 