Pool Best Practice while Moving Drives


Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
I currently have a backup pool made up of a single Z1 vDev of 6@4TB drives (this backs up my production pool, a 12-drive Z2 made of two 6-drive vDevs with a mix of 3TB and 4TB drives).

I am looking at buying another case that will allow me to use some old 3TB drives that I have. This is where it gets tricky. I would like to take the 6@4TB drives in my backup pool and move them into my production pool, and then have a 12-drive backup pool with two 6-drive vDevs made up of only 3TB disks.

The first issue I run into is that I am pretty sure that if I stick a 3TB disk into the backup pool, which is currently made up of 6@4TB drives, the pool will not like that at all given it is about 80% full (and probably would never like it, since there is no shrink option, right?).

So, the path forward I am considering is (EDIT: updated based on the great suggestion to RTFM :) ):
  • note: all of the 3TB drives are ST3000DM001 and 4TB drives are ST4000DM000
  • move system to new case and verify system stability for a while :)
  • scrub production pool
  • verify UPS works!!!
  • verify SMART status of all drives, specifically the 3TB drives as the ST3000DM001 model is *cough* crap
  • ensure production pool is configured to auto-expand (it is right now)
  • mark serial numbers on all drive caddies :)
  • destroy backup pool (unzip fly...)
  • create backup pool made up of 2 Z1 vDevs (5@4TB + 1@3TB in vDev1 and 1@4TB + 5@3TB in vDev2)
  • scrub new backup pool and verify stability
  • rsync (or replicate) data from production pool to new backup pool
    • this will leave me in a short window with no backups - about a day or so to copy data IIRC plus testing time
    • I will be hammering away at the production pool - it is stable and happy and is scrubbed bi-weekly
    • even during scrubs, HDD temperatures do not exceed about 37C in my current Norco case.
  • buy 1 additional 4TB drive (I have 7@3TB drives in my production pool) and upgrade 1 3TB drive in production pool
    • replace the 3TB drive with the new 4TB drive using the REPLACE command (command sketch below)
    • wait for resilver to complete
  • swap a 4TB drive with the now available 3TB drive in backup pool
    • replace the 4TB drive with the 3TB drive using the REPLACE command
    • wait for resilver to complete
  • repeat process until all drives are swapped (this will take a "bit" of time...)
I figure that this should result in a ~27TiB backup tank and a ~29TiB production tank. There will be a small window without a backup of my replaceable data; all of my irreplaceable data is stored elsewhere in addition to my FreeNAS box. The worst-case risk to the production pool is that it can only accept one more failure in the vDev that is resilvering, though technically the right 3 drives "could" fail without any data loss (but a large amount of pucker factor would ensue, which would blow...).
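
For the REPLACE steps above, here is a rough sketch of what one swap cycle would look like from the CLI (the pool name and gptid labels are placeholders; the GUI Replace function does the same thing and handles the partitioning):

Code:
zpool replace tank gptid/old-3tb-disk gptid/new-4tb-disk
zpool status tank    # watch the resilver; the old disk stays active until it completes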

The SuperMicro case (847E16 with SAS2 expander backplanes and a 9211-8i) that I am looking at allows for 36 hot-swap drive bays and 4 internal 2.5" drive bays. It also has two 6-core Xeon E5645 CPUs and 64GB of registered ECC RAM. So, this proposed new case would still allow me to grow both the backup and production pools by another 6 drives each if/when I need to add space. The only crappy part of the plan is that the 4 internal 2.5" drives are not easy to get to, so they would be an offline swap if required. Not the end of the world, as they only run jails (mirrored pool, replicated to the production pool) and store the system dataset (mirrored pool, total overkill, but I had old 80GB SSDs lying around).

I still plan to boot from mirrored USB keys.

So, is my plan above fraught with dangers or does it make some sense? Is there a better way that I should approach this (without buying additional drives)?

Appreciate any feedback.

Cheers,
 
Last edited:

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
So, is my plan above fraught with dangers or does it make some sense?
Makes sense to me. Just be sure to follow the directions for replacing a drive. Hint: "offline drive" is not step 1. In fact, it appears you'll have enough bays and drive ports to follow the directions for replacing drives to grow a pool, without losing any redundancy.
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
Makes sense to me. Just be sure to follow the directions for replacing a drive. Hint: "offline drive" is not step 1. In fact, it appears you'll have enough bays and drive ports to follow the directions for replacing drives to grow a pool, without losing any redundancy.
Ahh, I will have to revisit that section. I seem to remember that back in FreeNAS 8 the drive replace did not work. Thank you for the reply!

EDIT 1: There is a whole section in there for growing pools!! RTFM... :)

EDIT 2: I assume that if your system supports hot-swap drives, that you need not shut down the system. Is this correct?

Cheers,
 
Last edited:

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Ahh, I will have to revisit that section. I seem to remember that back in FreeNAS 8 the drive replace did not work. Thank you for the reply!

EDIT 1: There is a whole section in there for growing pools!! RTFM... :)

EDIT 2: I assume that if your system supports hot-swap drives, that you need not shut down the system. Is this correct?

Cheers,

You want to make sure you have no swap in use before pulling a drive (swapoff/swapon)

Also, SMART tests don't like you pulling drives. Remember to reset which disks are being tested.
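
Something like this is enough to see whether any swap is actually in use before pulling a disk (the device name is just an example; check the swapinfo output for the actual swap devices on the system):

Code:
swapinfo                # lists each swap device and how much of it is in use
swapoff /dev/da5p1      # example only: release the swap partition on the disk being pulled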
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
You want to make sure you have no swap in use before pulling a drive (swapoff/swapon)

Also, SMART tests don't like you pulling drives. Remember to reset which disks are being tested.
I know how to ensure SMART does not run, but I am unsure how to ensure I have no swap in use. Also, if the drive is offlined, then there should be no swap present on the offline drive anyway, right?

Cheers,
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
if your system supports hot-swap drives, that you need not shut down the system
if the drive is offlined, then there should be no swap present on the offline drive anyway, right?
In theory you should be able to remove an offline drive without shutting down on a hot-swap system. In practice, you'd be relying on the OS drivers playing nicely with the underlying system, so shutting down might be safer.
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
In theory you should be able to remove an offline drive without shutting down on a hot-swap system. In practice, you'd be relying on the OS drivers playing nicely with the underlying system, so shutting down might be safer.
Thanks. I have swapped a number of drives on this system while it was running without any issues (due to drive problems; with no spare slot, I would offline, then replace and resilver).

That said, I will test on the new system before committing. Should work, as the LSI 9211-8i cards seem to hot swap like real champs!

Cheers,
 
Last edited:

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
In theory you should be able to remove an offline drive without shutting down on a hot-swap system. In practice, you'd be relying on the OS drivers playing nicely with the underlying system, so shutting down might be safer.

Oh. I do it all the time. It's not theory. And I don't offline first.

BUT you need to ensure no swap. Which offlining might do.

I run this script every hour which helps:
https://forums.freenas.org/index.ph...ny-used-swap-to-prevent-kernel-crashes.46206/
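
I won't reproduce it here, but the basic idea is to cycle the swap devices so nothing stays paged out to a data disk. A stripped-down sketch of that idea (not the linked script itself):

Code:
#!/bin/sh
# Cycle every active swap device so nothing remains paged out to a data disk.
for dev in $(swapctl -l | awk '$1 ~ /^\/dev\// {print $1}'); do
    swapoff "$dev" && swapon "$dev"
done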
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
So, the journey begins. New system specs:
  • SC847 36 bay case with SAS2 backplanes
  • SC X8DTU-F with dual E5645 6 core Xeons
  • 96GB REG ECC ram
  • DELL PERC H200 (V20IT) and SM AOC (V20IT)
    • both the front and rear backplanes are connected to the SM card. I have the Dell card connected to external ports for future needs.
After a small fight with the (unexpected) Dell PERC card, I am off to the races. I ran Memtest86 for about a day; the full scan completed with no errors and temperatures were stable.

I have decided (following all best practices, I hope) to run FreeNAS under ESXi with the HBA adapters passed through using VT-d. Here is the VM spec:
  • FreeNAS 9.10.1-U2
  • 2 vCPUs (high priority, I would probably use 4 in the future)
  • 48GB RAM (locked and reserved)
  • Mirrored boot drives from SATA connected SSDs
I initially tested with 7 old drives that I had lying around, in a few different pool configs. I have followed the suggestion of setting Swap=0 in FreeNAS as documented here. Good guide. I put some data on the pools, scrubbed them, pulled drives mid-scrub, plugged drives back in, etc., all without any hiccups. Love the fact that if there is a spare present, the resilver starts automatically. Also cool that once you put the original drive back in the case, the spare seems to go back to spare duty! Very neat. I tried all of these tests while pushing data at about 100MBps to the pool over rsync. Nothing skipped a beat.
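
Adding a hot spare turns out to be a one-liner from the CLI as well (the pool and device names below are placeholders; the GUI volume manager can do the same thing and handles the partitioning):

Code:
zpool add testpool spare da12    # example: attach da12 to the pool as a hot spare
zpool status testpool            # the spare shows up under its own "spares" section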

So, I took the plunge and destroyed the backup pool on my old system to free up the 6@4TB drives. A little nervous... I then built the new backup pool in my new system using a mix of 3TB and 4TB drives in a 2-vDev by 6-disk Z1 setup.

Code:
                                              capacity     operations    bandwidth
pool                                         alloc   free   read  write   read  write
backuptank							  4.82T  27.7T	  0	932	  0  94.5M
  raidz1								2.41T  13.8T	  0	491	  0  47.3M
	gptid/8b5394c-972c-11e6-adb1-000c29039f67	  -	  -	  0	146	  0  10.0M
	gptid/8c58ddf-972c-11e6-adb1-000c29039f67	  -	  -	  0	136	  0  9.94M
	gptid/8d62869-972c-11e6-adb1-000c29039f67	  -	  -	  0	143	  0  10.0M
	gptid/8eadb57-972c-11e6-adb1-000c29039f67	  -	  -	  0	134	  0  9.93M
	gptid/891be7d-972c-11e6-adb1-000c29039f67	  -	  -	  0	141	  0  9.99M
	gptid/976366d-972c-11e6-adb1-000c29039f67	  -	  -	  0	142	  0  9.98M
  raidz1								2.41T  13.8T	  0	441	  0  47.2M
	gptid/e12f2cf-972c-11e6-adb1-000c29039f67	  -	  -	  0	139	  0  10.1M
	gptid/eff8701-972c-11e6-adb1-000c29039f67	  -	  -	  0	129	  0  9.88M
	gptid/eb9d24f-972c-11e6-adb1-000c29039f67	  -	  -	  0	134	  0  10.1M
	gptid/e724bfc-972c-11e6-adb1-000c29039f67	  -	  -	  0	124	  0  10.1M
	gptid/e77a4c4-972c-11e6-adb1-000c29039f67	  -	  -	  0	154	  0  10.1M
	gptid/e591501-972c-11e6-adb1-000c29039f67	  -	  -	  0	129	  0  10.1M
--------------------------------------  -----  -----  -----  -----  -----  -----
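
For reference, the rough CLI equivalent of that layout would be something like the following (device names are placeholders; in FreeNAS you would normally build this through the volume manager, which takes care of the swap partitions and gptid labels):

Code:
zpool create backuptank \
    raidz1 da0 da1 da2 da3 da4 da5 \
    raidz1 da6 da7 da8 da9 da10 da11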


My original plan was to move the production pool to the new case prior to copying the backups to the new backup pool, but I changed my mind and I am using rsync across the network instead; I prefer the dataset arrangement that I have for my backups over what zfs send would allow for. Needless to say, even at about 90MBps average, it will take some time to copy the data. But this way I get to enjoy all of my data and services while I figure out ESXi and wait for the data to copy. Once the backup is populated, I will continue to hammer away at the new system for a while to see how it handles stress. I will also swap the 4TB and 3TB drives, one at a time, between the backup and production pools to allow the production pool to grow. That will also take some time (and I am not a patient person...). When all is said and done, I will have 1 spare 3TB drive and 0 spare 4TB drives. I may have to buy a 4TB spare given how slick the auto-rebuild feature is!!
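
For reference, the sort of rsync invocation I mean is along these lines, one per dataset (hostname and paths are placeholders):

Code:
# Example host/paths: pull a dataset from the old box over SSH, preserving
# permissions and hard links and pruning files that were deleted on the source.
rsync -avH --delete oldnas:/mnt/tank/media/ /mnt/backuptank/media/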

Not really sure if I am any farther ahead using ESXi, but I have nerd-like tendencies and wanted to try/learn. I do have FreeNAS USB boot drives for the new server all ready to go if things go pear-shaped. From what I have read, given that the LSI cards are passed through, the pools can be imported on any other ZFS system with the correct feature flags. Time will tell if it was a good or bad choice.

Thanks for the earlier help and suggestions. Once I move the production and SSD based pools over, I will have to take a close look at the swap clearing script.

Cheers,
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Do you understand the implications of using raidz1 instead of raidz2?
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
Do you understand the implications of using raidz1 instead of raidz2?
Yes, I do :) I often harp on people for using Z1 as well.

My thoughts on it are that I have all of my data on a Z2 pool and then another copy of it on a Z1 pool. On the off chance that I allow my Z2 pool to fail, by ignoring HDD failures etc., I will only be at risk for the time it takes to copy data from the Z1 backup pool to a new Z2 production pool.

My goal is to NEVER allow the Z2 production pool to fail, but JIC it does, I do have a backup. Also, if I do something stupid to my production pool (phat-finger rm -R or some such), I have a backup. If my backup pool fails, I will fix it and copy the data from the Z2 production pool. I look at the probability of having 3 or more failures in my production pool AND 2 or more failures in my backup pool at the same time as small, and I accept that level of risk. I hope I do not have to eat crow, but such is life. Additionally, any of my data that is truly irreplaceable is located on a cloud storage system.

Thanks for asking the question and making me really think about what I am doing. If you see any major errors in my logic, please let me know.

Cheers,
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
So, I have OCD.

But I am an idiot...

I did not put the drives in the bays where they should end up once I am done moving the 3TB and 4TB drives between pools... So when that is finished, I will have to shut down and move drives around to satisfy my OCD requirements...

Ahh, the little things. On a positive note, the Z1 pool resilvers at 1GBps and the Z2 pool seems to be able to do about 555MBps.

Cheers,
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
UPDATE

Drives are thoroughly exercised. Never have I done so much resilvering!! I had a 3TB drive fail in my backuptank but was able to replace it without data loss. Really quick failure too: it went from 8 offline errors, to 16 offline errors, to simply offline, over a few hours overnight. I really like the replace feature, as you effectively add before you remove, so you never needlessly degrade your pool. Very sweet.

So, I bought some spare 4TB drives and was able to start replacing multiple drives at a time. There was a bit of pucker factor, but everything I read said ZFS has no problem resilvering multiple replacements at the same time; some posts even recommended it, as you reduce the time spent in a degraded state if you are replacing multiple failed drives. As soon as one of the 2 vDevs in my production pool was all 4TB drives, blamo, the pool grew in size by about 4TB. Totally sweet. Just resilvering the last 3TB-->4TB drive in my production pool now.
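
For anyone else waiting for a pool to grow, the knobs to check are roughly these (pool name and gptid are placeholders):

Code:
zpool get autoexpand tank              # should be "on" for the pool to grow on its own
zpool online -e tank gptid/new-disk    # expands a device to its full size if the space does not appear
zpool list tank                        # SIZE and FREE jump once every disk in the vDev is the larger size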

And yes, my OCD got the best of me. I moved all the drives around to have them in the proper spots in the case. :)

This ZFS stuff is groovy.
 
Last edited:

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Space Aliens ;)
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
So, another 3TB drive failure. Damn those ST3000DM001 drives!!

Resilvered in a spare 4TB. A note on spares: make sure you wipe them, or else the system will NOT automatically use them as a spare. Now my OCD layout is all messed up. :(
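
For anyone hitting the same thing: the GUI wipe is the easy way, but a rough CLI equivalent for clearing old ZFS leftovers would be something like this (device name is an example only; triple-check you have the right disk):

Code:
zpool labelclear -f /dev/da7p2    # example device: clear the stale ZFS label from the old data partition
gpart destroy -F da7              # then blow away the old partition table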

Final layout:

12 x 4TB in a 2x6 Z2 production pool
11 x 3TB and 1 x 4TB in a 2x6 Z1 backup pool
2 x 480GB SSDs with a 2 x 120GB (over-provisioned to 16GB) ZIL mirror

System is fast, stable and total freaking overkill for my personal needs. I love it!!

ESXi is neat. I do look forward to FreeNAS 10 with bhyve.

Last step is to get replication working from the 480GB pool to the production pool to back up the VMs.
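
Under the hood, replication boils down to ZFS send/receive; a minimal sketch with placeholder pool, dataset, and snapshot names:

Code:
zfs snapshot ssdpool/vms@manual-backup                                # point-in-time snapshot of the VM dataset
zfs send ssdpool/vms@manual-backup | zfs receive -F tank/backups/vms  # initial full copy; later runs would use incremental sends (-i)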

Cheers,
 