Help needed: Shrink pool & remove HDD

EAOP

Dabbler
Joined
Mar 29, 2022
Messages
11
Hi,

I kindly ask for some advice...

My current set-up:

Server#1 - "Main" (24/7)
Storage : 8 (out of 10 slots) x 4 TB HDD
1 pool - "main"
16.88 TiB (69%) Used | 7.49 TiB Free


Server#2 - "Backup" (Turned on once per week, to receive replication snapshots from "Main")
Storage : 8 (out of 10 slots) x 4 TB HDD
1 pool - "back up"


I'm just about to buy 8 x 8 TB drives, and was planning to put 4 of these in each server, as a new pool.

At the same time I had a faulty HDD in the "back up" server, and now it seems another one of the HDDs has gone bad, as the entire pool is now offline...
I've tried rebooting, but it remains offline.

Questions....

1) "Main" - Is it possible to shrink the pool "main" to just 6 HDD (I might have to delete roughly 1 TB to fit the remaining data on this smaller pool)?
2) "Main" - I could then add the 4 new 8 TB HDD as a new pool, and then I'm finished.
3) "Backup" - Is there anyway to find out what drives (serial number) that are failing via TrueNAS? The mobo is a "Supermicro X11SSL-F" so I can use IPMI too.

Any general advice or "best practice" as how to act in my situation?

Thanks!
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Fully describe your hardware and pool setup.
You can remove vdevs if the pool is all mirrors. If, as expected for bulk storage, this is an 8-wide single raidz# vdev, you're stuck with 8-wide forever. You can make a new pool and transfer the data, or keep the pool and progressively replace all drives with larger ones to end up with 8 x 8 TB.

For "Backup", let's begin with the output of zpool status (please use CODE tag).
 

EAOP

Dabbler
Joined
Mar 29, 2022
Messages
11
Thanks for helping out!

Hardware (same for server #1 & #2)
MoBo : Supermicro X11SSL-F, Intel LGA 1151 server board, DDR4
CPU : Intel Core i3-6300T, 3.3 GHz, LGA 1151
RAM : 4 x 8 GB DDR4 ECC UDIMM, 2400 MHz ("for HP ProLiant ML10 Gen9 ML-Systems")
Boot : 2 x 250 GB SSD, boot-pool (RAID 1 mirror)
SATA/SAS connectivity : LSI 9207-8i PCIe 3.0 6 Gbps HBA, FW P20, IT mode
HDD cage : Two cages that hold five 3.5" HDDs each.
HDD : 8 x WD Red 4TB / 256MB Cache / 5400 RPM (WD40EFAX)
PSU : Seasonic Focus GX 650W
Chassis fans, front mounted : 2 x 120 mm PWM, pulling air in and through the HDDs.
Chassis fans, rear mounted : 2 x 80 mm, exhausting air out of the chassis.


All eight 4 TB HDDs are part of the same pool ("main" for server #1, "backup" for server #2) in RAIDZ1.
"You can remove vdevs if the pool is all mirrors" - I don't quite follow - how can I tell if this is the case?

In an ideal world I would have wanted the option to pick an HDD or two, remove them from their current pool, and create a new pool from them...


zpool status:

[Screenshot: zpool status output; the "back up" pool is not listed]


My guess is that it's not a good sign that I can't see anything about the "back up" pool - it's almost as if the LSI is gone entirely...
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
"You can remove vdevs if the pool is all mirrors" - I don't quite follow - how can I tell if this is the case?
You answered that by saying both pools are RAID-Z1, so no, neither pool is Mirrored. Thus, you can't remove any disks.
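For reference, the layout shows up in the config section of zpool status. The trimmed, made-up output below only illustrates the difference (pool and device names are placeholders):
[CODE]
# All-mirror pool: top-level vdevs appear as "mirror-N", so vdev removal is possible
        NAME          STATE
        tank          ONLINE
          mirror-0    ONLINE
            ada1      ONLINE
            ada2      ONLINE

# RAID-Z pool: the top-level vdev appears as "raidz1-0", so vdev removal is not possible
        NAME          STATE
        main          ONLINE
          raidz1-0    ONLINE
            da0       ONLINE
            da1       ONLINE
            ...
[/CODE]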

Further, you appear to be using WD Red SMR drives, which are known to cause problems with ZFS:
HDD : 8 x WD Red 4TB / 256MB Cache / 5400 RPM (WD40EFAX)
Try and make sure any replacements you buy are not SMR (for example, at present Seagate IronWolf and WD Red Plus are CMR, not SMR).
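To see which model and serial number sits behind each device, which also covers question 3 from the first post, something like this should work from the TrueNAS shell (device names are examples and will differ on your system):
[CODE]
# CORE (FreeBSD): list detected disks with description and serial number
geom disk list | grep -E "Name|descr|ident"

# SCALE (Linux): same idea
lsblk -o NAME,MODEL,SERIAL

# Either platform: query a single drive via SMART;
# "Device Model" distinguishes e.g. WD40EFAX (SMR) from WD40EFRX (CMR)
smartctl -i /dev/da0
[/CODE]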

Since zpool status did not show any results, please run zpool import and paste the results in code tags here.
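Run with no arguments, zpool import only scans for pools that are visible but not imported and reports what it finds without importing anything. The shape of its output is sketched in the comments below with placeholder values:
[CODE]
zpool import

# Illustrative shape of the output (all values are placeholders):
#   pool: back up
#     id: 1234567890
#  state: DEGRADED
# config:
#         back up      DEGRADED
#           raidz1-0   DEGRADED
#             da0      ONLINE
#             ...
#             da7      UNAVAIL
[/CODE]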
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
My guess is that it's not a good sign that I can't see anything about the "back up" pool - it's almost as if the LSI is gone entirely...
This may be an indication that you need to check if the LSI HBA in "Backup" is well seated, well cooled and still working.
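A quick sanity check from the shell will tell you whether the HBA and its disks are visible to the OS at all; the commands differ between CORE and SCALE, and both variants below are only suggestions:
[CODE]
# CORE (FreeBSD): is the LSI controller on the PCIe bus, and does the driver see any disks?
pciconf -lv | grep -B3 -i lsi
camcontrol devlist

# SCALE (Linux): same checks
lspci | grep -i lsi
dmesg | grep -i mpt3sas
[/CODE]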

As for the pool in "Main", you're stuck with 8-wide raidz1, which is not a very safe geometry. The best option would be to make an 8-wide raidz2 with the new 8 TB drives and move your data there. (You could instead replace the drives in place to end up with an 8-wide raidz1 of 8 TB disks, but that would mostly increase the amount of data waiting to be lost at the next hardware issue, and eight resilvers on a pool of SMR drives are going to be long and painful.)
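If you go the new-pool route, the move itself can be done with snapshots and replication rather than file copies. A minimal sketch, assuming the old pool is "main" and the new pool gets the placeholder name "tank" (a local Replication Task in the GUI does the same thing):
[CODE]
# Take a recursive snapshot of everything on the old pool
zfs snapshot -r main@migrate

# Send the whole dataset tree, with properties, to the new pool (left unmounted for now)
zfs send -R main@migrate | zfs receive -u tank/main
[/CODE]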
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
Try and make sure any replacements you buy are not SMR (for example, at present Seagate IronWolf and WD Red Plus are CMR, not SMR).
Often data center drives are cheaper than NAS ones, so you may want to check Seagate Exos and the equivalents from WD and Toshiba. My personal experience with the Exos X16 16 TB has not been great in terms of dead drives. On the other hand, they were the cheapest, I have a good backup, and the RMA process was always very smooth.
 

EAOP

Dabbler
Joined
Mar 29, 2022
Messages
11
Thanks for all the advice.
Actually, I was aware of the whole SMR vs CMR issue, but somehow I must have messed up when ordering replacement drives... most of my drives should be (older) CMR models. I will take more care in the future...

zpool import did give me more info:

[Screenshot: zpool import output for the "back up" pool; at least one drive shows as faulted]
 

EAOP

Dabbler
Joined
Mar 29, 2022
Messages
11
So... at the moment I have

Server #1, 8 x 4 TB raidz1 - pool "Main" (~24 TB)
Server #2, 8 x 4 TB raidz1 - pool "Backup" (~24 TB)

At least one HDD is faulty on server #2 (as seen above), so this evening I went ahead and bought 8 x 8 TB WD Datacenter HC320 drives (7200 rpm, but the price was good).


So, wise people, how would you advise me to rebuild my servers?

Option 1
Server #1, 6 x 4 TB raidz2 - pool "Main_1" (~14 TB?)
Server #1, 4 x 8 TB raidz2 - pool "Main_2" (~14 TB?)

Server #2, 6 x 4 TB raidz2 - pool "Backup_1" (~14 TB?)
Server #2, 4 x 8 TB raidz2 - pool "Backup_2" (~14 TB?)

So that leaves me with 3 spare 4 TB HDDs and 1 faulty 4 TB HDD.

The main (?) drawback with this setup is that I'm using all HDD slots, and I can't really grow my 8 TB pool.


Option 2
Is this perhaps more advisable..?

Server #1, 4 x 4 TB raidz2 - pool "Main_1" (~7 TB?)
Server #1, 4 x 8 TB raidz2 - pool "Main_2" (~14 TB?)

Server #2, 4 x 4 TB raidz2 - pool "Backup_1" (~7 TB?)
Server #2, 4 x 8 TB raidz2 - pool "Backup_2" (~14 TB?)

So that leaves me with 7 spare 4 TB HDDs and 1 faulty 4 TB HDD.

That gives me plenty of 4 TB spares, a raidz2-only setup, and two free HDD slots per server for future 8 TB disks when I need more space.
The only drawback is 3 TB less storage per server than today, or 7 TB less than option #1.

Or is there a third, even better suggestion?
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Sorry, I can't advise further... perhaps someone else can.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Server #1, 8 x 4 TB raidz1 - pool "Main" (~24 TB)
I'd count 8*4 TB in Z1 as (8-1)*4 = 28 TB raw (not to be filled beyond 75-80%).

So, wise people, how would you advise me to rebuild my servers?
There are many options… but all the good ones involve getting rid of the SMR disks because they are a liability with ZFS.

Option 1
Server #1, 6 x 4 TB raidz2 - pool "Main_1" (~14 TB?)
Server #1, 4 x 8 TB raidz2 - pool "Main_2" (~14 TB?)
As a note, you could have both vdevs in the same pool ((6-2)*4 + (4-2)*8 = 32 TB raw), but that would put the whole pool at risk for as long as any SMR drives remain. With two vdevs, the best layout would be 5+5, but that leaves no free slot for safer drive replacement. One vdev of 8-9 drives plus 1-2 free slots is better.
The main (?) drawback with this setup is that I'm using all HDD slots, and I can't really grow my 8 TB pool.
There are two ways to grow a raidz# pool:
1/ Add further vdevs, which requires the slots, cables/backplane and power.
2/ Replace all drives in a vdev with larger ones, one at a time (see the sketch below).

The natural scenario is an 8-wide raidz2 vdev (48 TB raw), which would then grow by replacing all disks some years from now.
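For completeness, way 2/ above is done one disk at a time, waiting for each resilver to finish before swapping the next drive. A rough sketch with placeholder device names (in TrueNAS you would normally use the GUI's Replace action, which does the same):
[CODE]
# Let the pool grow automatically once the last, larger disk is in place
zpool set autoexpand=on main

# Replace one member disk with a new, larger one, then wait for the resilver
zpool replace main da0 da8
zpool status main     # repeat for the remaining disks once resilvering completes
[/CODE]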

Option 2
Is this perhaps more advisable..?



That gives me plenty of 4 TB spares, a raidz2-only setup, and two free HDD slots per server for future 8 TB disks when I need more space.
The only drawback is 3 TB less storage per server than today, or 7 TB less than option #1.
I do not see how an "upgrade" to less storage is more advisable.
Two slots are not useful for adding space to a raidz2 pool: you'd end up with an unbalanced mix of raidz2 and a mirror, which could only be sanitised by destroying the whole pool. One or two free slots are only useful for replacing drives.

Or is there a third, even better suggestion?
Main suggestion: Get even more drives to replace all 4 TB SMR drives. Drawback: $$$
Secondary suggestion: Use the still-good SMR drives as a backup array in RAID6 (OMV, or another solution), but not with ZFS. Drawback: Incremental backups have to use rsync rather than ZFS replication.
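As an illustration only, such an rsync-based incremental backup could look roughly like this (paths and host name are placeholders):
[CODE]
# Mirror the main pool's data to the non-ZFS backup box over SSH:
# -a preserves permissions and timestamps, -H keeps hard links,
# --delete removes files that no longer exist on the source
rsync -aH --delete /mnt/main/ backup-box:/srv/backup/main/
[/CODE]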
 

EAOP

Dabbler
Joined
Mar 29, 2022
Messages
11
Thanks for your advice, and for helping a newcomer in a kind and supporting tone.
 

pschatz100

Guru
Joined
Mar 30, 2014
Messages
1,184
To the OP: I don't see that you mentioned how much data is being stored. It is unwise to fill a ZFS file system beyond 80% of capacity. Performance begins to suffer and reliability goes down.
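For reference, checking where a pool stands only takes a moment in the shell (the pool name below is just the one from this thread):
[CODE]
# Pool-level size, allocation, free space, capacity percentage and fragmentation
zpool list -o name,size,allocated,free,capacity,fragmentation main

# Per-dataset usage, including space held only by snapshots
zfs list -r -o name,used,usedbysnapshots,available main
[/CODE]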

If you are going to go through the effort of rebuilding your file systems, you should:
  1. Make certain you have enough capacity for your intended use (and allow for growth over the next couple of years).
  2. Use RAIDZ2 with large-capacity drives.
 