RAIDZ1 Replace disks/upgrade

anfieldroad

Dabbler
Joined
Dec 21, 2018
Messages
32
Hi,
I am running TrueNAS 12 on a Mini ATX case PC with an Intel 5 series/3400 chipset SATA controller and a 4 disk RAIDZ1 pool. The disks are SATA 5400RPM 2TB and are mounted internally, not in hot-swap trays.

I have a disk showing uncorrectable errors and the pool is also running low on space; the combination of these two factors is putting the pool into a Critical state.

I have just acquired new disks - Seagate Exos 8TB SATA 7200RPM and wish to replace these disks while at the same time extending the pool to use the additional space.

It is my belief that the chipset supports hot-plugging SATA and therefore I should be able to carry out each disk replacement with a larger disk without shutting down.

Can anyone advise what the process should be to:

1) Identify and remove one disk at a time from the pool beginning with the failing disk
2) Physically remove the disk - SATA plug or power plug first?
3) Install the new disk? Again which order of cable plugging?
4) Extend the volume to take up the new free space available

Any help appreciated.
 

awasb

Patron
Joined
Jan 11, 2021
Messages
415
It is my _firm_ belief that you won't gain anything by not shutting down to exchange the drives (one by one). I would rather guess the opposite and would urge you to ditch the plan.

Once all higher capacity drives are "in" the pool will automatically grow (if you didn't change the defaults, that is).
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
If they aren't in hot-swap trays, seriously, don't try this. As for the rest:
1) Identify and remove one disk at a time from the pool beginning with the failing disk
The Storage -> Disks page will show the serials of each disk:
[Attached screenshot: 1643841237550.png]

So if ada0 is the one showing errors, first offline ada0, then go to this page and note its serial number. Power down the server, find the disk with the corresponding serial number, remove it, and install the new disk in its place. Boot the server, and replace the disk through the GUI. All as described in the docs:

4) Extend the volume to take up the new free space available
Nothing to do here; once you replace the last disk the pool will automatically grow to use the available space.
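
To illustrate why the extra space only shows up after the last disk is swapped: a RAIDZ1 vdev treats every member as if it were the size of its smallest disk, and roughly one disk's worth of space goes to parity. The sketch below uses the 2TB and 8TB sizes from this thread. The function name is made up for illustration, and it ignores ZFS metadata, padding and TB/TiB differences, so real figures will be somewhat lower. It also assumes the pool is left to expand automatically, as mentioned above.

```python
def raidz1_usable_tb(disk_sizes_tb):
    """Rough usable capacity of one RAIDZ1 vdev, in TB.

    Every member counts as the smallest disk, and about one disk's
    worth of space is used for parity. Overheads are ignored.
    """
    return (len(disk_sizes_tb) - 1) * min(disk_sizes_tb)


# Swap the four 2TB disks for 8TB disks, one at a time.
disks = [2, 2, 2, 2]
capacities = [raidz1_usable_tb(disks)]
for i in range(len(disks)):
    disks[i] = 8
    capacities.append(raidz1_usable_tb(disks))

print(capacities)  # [6, 6, 6, 6, 24]
```

The capacity stays at about 6TB through the first three replacements and only jumps once the last 2TB disk is gone.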
 

anfieldroad

Dabbler
Joined
Dec 21, 2018
Messages
32
Not sure hot swap trays matter here - they're just for convenience. It's the SATA chipset itself that makes things hot pluggable. I am considering changing case to one that has hot swap trays but that's for the future.

As it currently stands, my pool has been clean since I cleared the errors and has not shown any further uncorrectable errors - these were on ata3.

[Attached screenshot: 1643947322767.png]


I have 6 SATA slots, 4 are in use for the disks above, one is in use for an optical drive which I could unplug. The main issue though is there are no free drive bays. I have 4 current disks and 4 new ones so I can immediately rule out adding in 4 new disks and removing the old disks.

I believe given the above my only available course of action is to remove one disk at a time starting with ata3 and replace it with a new larger disk and replace each in turn once the pool has rebuilt (resilvered I believe is the term for TrueNAS). My only concern is during the process this will leave me very vulnerable to a hard drive failure as I will have no redundancy until the rebuild completes on each disk.

I guess I could also plug two new drives in and have them sitting cabled up but loose until the rebuild completes, then remove two old disks from the pool, shut down, remove the old disks and mount the new disks, then repeat the process for the remaining two new disks. It's messy though and requires a shutdown. This pool is actually providing datastores for my vSphere cluster, so ideally I want to avoid that.

So yeah, I think it will be the risky one-drive-at-a-time approach. I don't think I will have any issues with hot-plugging. The only risk is if a drive fails during this process.

I think it might be necessary to upgrade to a motherboard with more SATA slots and a larger case with hot swappable drives to avoid this kind of messing around in future. It's all about $$$$ sadly. I have another SATA card but this mini-ITX motherboard only has one PCIe slot and that is already in use for a dual port intel gigabit network card so that rules out that idea.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
Why do you want to replace the disks without shutting down the machine?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
The main issue though is there are no free drive bays.
That doesn't really matter for temporary use; you can leave a disk hanging outside the chassis for a bit if needed. Kind of messy, but it works.
Not sure hot swap trays matter here - they're just for convenience.
...and (in association with the connectors used) making sure the relevant electrical connections are made/unmade in the appropriate order.
 

anfieldroad

Dabbler
Joined
Dec 21, 2018
Messages
32
Then it really shouldn't be using RAIDZ in the first place.
I never said it was production; it's a lab environment and it's hosting low-priority VMs, replicas and backups. But I'd still prefer to avoid the headache of shutting it all down if I can :)
 

anfieldroad

Dabbler
Joined
Dec 21, 2018
Messages
32
That doesn't really matter for temporary use; you can leave a disk hanging outside the chassis for a bit if needed. Kind of messy, but it works.

...and (in association with the connectors used) making sure the relevant electrical connections are made/unmade in the appropriate order.
Not at all. The connections made when inserting a disk tray into a hot-swap bay are made at the same time; they are in no order.

According to the SATA specification that I have read today, there is no order for cabling; it's designed to work with power first or data first.

http://www.lttconn.com/res/lttconn/pdres/201005/20100521170123066.pdf sections 4.1.60 and 7.2.5.1

Of course this is just the SATA spec, it doesn't necessarily mean that TrueNAS itself handles it without spitting the dummy...
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
they are in no order.
...except for the order established by, say, the length of the pins.

Hey, it's your funeral. But don't say we didn't warn you.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
If you have spare SATA ports, it's better to plug new drives first, "replace" old drives, let the pool resilver and only then remove the old drives. Repeat as many times as necessary—and replace as many drives at a time as is convenient (all four in one go is possible!).

But, regardless of the path, you won't complete the process without shutting down at some point. Unless you intend to screw the new drives in place while they are powered and spinning… o_O
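
As a rough sketch of how many rounds this approach would take: the number of new drives that can be attached at once is limited by the spare SATA ports. The function below is hypothetical and only does the arithmetic. The port counts are the ones mentioned in this thread (6 ports, 4 pool disks, 1 optical drive), and it assumes the old drives are removed after each round so the same ports free up again.

```python
import math


def replacement_rounds(total_ports, ports_in_use, disks_to_replace):
    """Number of plug-in / replace / resilver rounds needed.

    Each round attaches as many new disks as there are spare ports,
    replaces the same number of old disks, then removes the old ones.
    """
    spare = total_ports - ports_in_use
    if spare <= 0:
        raise ValueError("no spare SATA ports: an old disk must come out first")
    return math.ceil(disks_to_replace / spare)


# Optical drive unplugged: 4 ports in use, 2 spare.
print(replacement_rounds(6, 4, 4))  # 2
# Optical drive left connected: 5 ports in use, 1 spare.
print(replacement_rounds(6, 5, 4))  # 4
```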
 

anfieldroad

Dabbler
Joined
Dec 21, 2018
Messages
32
...except for the order established by, say, the length of the pins.

Hey, it's your funeral. But don't say we didn't warn you.
Read the spec; it makes it very clear that order does not matter. SATA is designed to handle it.
If you have spare SATA ports, it's better to plug new drives first, "replace" old drives, let the pool resilver and only then remove the old drives. Repeat as many times as necessary—and replace as many drives at a time as is convenient (all four in one go is possible!).

But, regardless of the path, you won't complete the process without shutting down at some point. Unless you intend to screw the new drives in place while they are powered and spinning… o_O
Since when does sliding a tray into a case and then plugging in cables cause things to explode? The drives are in trays; the trays are just internal to the case rather than external and hot-swappable with a backplane.

The way I see it the worst case here is that I offline a drive that is failing anyway making my pool degraded due to missing a disk, unplug it, plug in the new drive and the new drive either does not detect or cannot be added in to the pool.

The worst case scenario I can see is that whilst in a degraded state and rebuilding on the replacement drive another drive fails by pure bad luck.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Worst case: You'll short something, drive or motherboard.
But go ahead as you want. These are your drives, your system and your data, after all. The process will work—until it doesn't.
 

anfieldroad

Dabbler
Joined
Dec 21, 2018
Messages
32
If you have spare SATA ports, it's better to plug new drives first, "replace" old drives, let the pool resilver and only then remove the old drives. Repeat as many times as necessary—and replace as many drives at a time as is convenient (all four in one go is possible!).

But, regardless of the path, you won't complete the process without shutting down at some point. Unless you intend to screw the new drives in place while they are powered and spinning… o_O
I don't have enough SATA ports for this, nor enough drive bays. It's got 6 SATA ports in a typical small NAS micro-itx case with four 3.5" trays and two 5.25" bays.

Maybe if I remove the CDROM (I don't really need it anyway; it was last used when I first installed FreeNAS 11) and screw two drives into the bays using adapters, I could add in two new drives and drop two old ones, including the failing ata3 drive. Once that rebuild onto the new disks has completed, I could add the remaining two in somehow, moving the old disks into the 5.25" bays since I want the nice trays for the new disks. I could then remove the remaining two old disks one at a time and create a new pool with them for less important storage, given the drives are small and getting old.
 

anfieldroad

Dabbler
Joined
Dec 21, 2018
Messages
32
Worst case: You'll short something, drive or motherboard.
But go ahead as you want. These are your drives, your system and your data, after all. The process will work—until it doesn't.
How the hell are you possibly going to SHORT something by plugging it in as per the SATA design? A short implies two pins being bridged that should not be, i.e. a bent pin, which is not possible with SATA unless you spray WATER into the connector first. I've been doing IT since the early 90s, including datacentre work on multi-million dollar systems, and have never had any issues involving shorting something out.

I'm truly surprised at some of the answers I am getting on here, they seem less based on fact and more based on fear.

The reason I asked the question in the first place was a lack of familiarity with the process on FreeNAS/TrueNAS, having not had to replace a drive on it before. I'd read the manual, but to me it didn't make it absolutely clear that the pool would extend onto the new disks, and I have done similar disk replacements with larger drives on RAID arrays where I ended up with a same-size array and a lot of wasted disk space.

Thankfully since posting this I did some good research and have found no documented evidence to base all these fears upon, so please if you have something factual to show there is genuine concern and risk please share it but keep the "it's your funeral (but I won't actually show anything factual to explain WHY)" comments to yourselves please.

Like I said, the worst case scenario is that I remove a device that is already beginning to fail, put my pool into a degraded state, and have to either re-add the disk after a reboot or reboot with the new disk in to make it detect properly. I honestly don't think this will happen though. I think it will hot-unplug and remove the disk, then I will plug the new one in, SATA will hot-plug it, TrueNAS will see it, and I will add it into the pool and off it goes resilvering.

Only one way to know for sure though, do I want to live forever? :wink:
 

anfieldroad

Dabbler
Joined
Dec 21, 2018
Messages
32
Update:

So in the end I decided to shut down and migrate to a nice new Silverstone 8 bay NAS Midi Tower case and a new PSU. I put the original disks in along with two of the new ones (only 6 SATA ports, remember, so I can't use all 8 bays) and everything powered up fine. Though the trays are nothing special (i.e. no activity lights on each tray), they made the migration so much easier.

Only I can't add in two of the new disks as it gives me two warnings:
"This type of VDEV requires at least 3 disks"
and
"WARNING: Adding data vdevs with different numbers of disks is not recommended. First vdev has 4 disks, new vdev has 2"

Learning experience ;)

So it is pretty clear that we can't just add the new disks in to the existing RAIDZ1 pool.
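
The two messages read like two separate checks: one on the minimum width of a RAIDZ1 vdev, and one comparing the new vdev's width against the existing one. The sketch below only mirrors the logic as the messages describe it. It is not TrueNAS's actual code, and the function name is hypothetical.

```python
def check_new_raidz1_vdev(existing_vdev_widths, new_vdev_width):
    """Return the messages a GUI might show when adding a RAIDZ1 data vdev.

    existing_vdev_widths: number of disks in each data vdev already in the pool.
    new_vdev_width: number of disks selected for the new vdev.
    """
    messages = []
    if new_vdev_width < 3:
        messages.append("This type of VDEV requires at least 3 disks")
    if existing_vdev_widths and new_vdev_width != existing_vdev_widths[0]:
        messages.append(
            "WARNING: Adding data vdevs with different numbers of disks is not "
            f"recommended. First vdev has {existing_vdev_widths[0]} disks, "
            f"new vdev has {new_vdev_width}"
        )
    return messages


# The situation in this post: one existing 4-disk vdev, two new disks selected.
for line in check_new_raidz1_vdev([4], 2):
    print(line)
```

Either way, adding disks this way would create a second vdev next to the first one rather than widen the existing RAIDZ1 vdev, which is why "Replace" was the route taken here.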

After scrubbing the pool beforehand, I then had to go down the slow and tedious route of using "Replace" to swap one disk at a time to migrate over to the new disks, but apart from the time taken to move to the new NAS case I didn't have to shut down at all. I noted the bays each drive was installed into and their serial numbers, and then knew exactly which drive tray to pull out once I had done the "Replace". Approximately 5 hours per disk by the way, not too bad.
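
The bookkeeping described above can be as simple as a small lookup table. This is only a sketch: the serial numbers are placeholders (the real ones are on the Storage -> Disks page), the helper name is made up, and the total is just 4 disks at roughly 5 hours each.

```python
# Hypothetical notes taken before starting: tray number -> disk serial.
serial_by_bay = {
    1: "OLD-SERIAL-1",
    2: "OLD-SERIAL-2",
    3: "OLD-SERIAL-3",
    4: "OLD-SERIAL-4",
}


def bay_for_serial(serial):
    """Return the tray holding the disk with the given serial number."""
    for bay, noted_serial in serial_by_bay.items():
        if noted_serial == serial:
            return bay
    raise KeyError(f"serial {serial!r} is not in the notes")


# After a "Replace" finishes, look up which tray to pull.
print(bay_for_serial("OLD-SERIAL-3"))  # 3

# Rough total resilver time for the whole migration.
HOURS_PER_DISK = 5
DISKS_TO_REPLACE = 4
total_hours = HOURS_PER_DISK * DISKS_TO_REPLACE
print(total_hours)  # 20
```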
 