Migrating to FreeNAS from Windows Server 2008 R2

Status
Not open for further replies.

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So here's the situation: I have been tasked with migrating a 2008 R2 file server to FreeNAS, and I'm going to have to take a very convoluted route to get there. There's 20TB of data on a 28TB RAID6 array. I will be setting up both machines with jumbo frames and Intel NICs for this transfer. I will have temporary access to 11x3TB hard drives to migrate the data. Here are the steps I will be taking:

1. Use an 11 drive RAIDZ2 temporary FreeNAS server to create a temporary location for the data.
2. Copy all of the data from the 2008 R2 server to FreeNAS.
3. Install/Setup the "new" permanent FreeNAS on the former 2008 R2 server.
4. Copy all of the data from the temporary FreeNAS server back to the "new" permanent FreeNAS server.
5. Destroy the temporary zpool and disassemble the temporary machine, as its hard drives will be used elsewhere.

So, what's the fastest way to get all of the data from the 2008 R2 server to FreeNAS and then from one FreeNAS machine to the other? I was thinking about CIFS, FTP, rsync, etc. Even at a constant 100MB/sec, 20,060GB works out to roughly 56 hours for a complete file copy, and that climbs to about 92 hours if the average drops to 60MB/sec. Shoo!
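For anyone checking the arithmetic later, here is a minimal Python sketch of the transfer-time estimate. It assumes decimal units (1 GB = 1000 MB) and a perfectly sustained rate, which a real copy will not manage; the three rates are only illustrative.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb at a sustained rate, using 1 GB = 1000 MB."""
    seconds = size_gb * 1000 / rate_mb_per_s
    return seconds / 3600


DATA_GB = 20_060  # data set size quoted in the post

for rate in (30, 60, 100):  # illustrative sustained rates in MB/s
    hours = transfer_hours(DATA_GB, rate)
    print(f"{rate:>3} MB/s -> {hours:6.1f} h ({hours / 24:.1f} days)")
```

Under those assumptions, 100 MB/s comes out at about 55.7 hours, 60 MB/s at about 92.9 hours, and 30 MB/s at about 185.7 hours (close to 8 days), each for one direction only.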

Any recommendations for how to accomplish this feat? I was thinking rsync because I could set it up to run once and let it go until it finishes. Of course, I'll have to wait 4 days or so for it to finish one way (yuk!). I figured rsync is probably the most user-friendly and possibly the fastest option, but I don't know how fast or slow it is, and I can't find any good info on how rsync performs for bulk file transfers. I've never used it for real, but I did set it up on 2 VMs on 2 different machines and it was very easy, but slow. Presumably it was slow because it had 3GB of RAM and was virtualized.
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
Initial move from Windows to FreeNAS(temp), almost certainly rsync - enabling compression would be worth the experiment if your CPUs on both ends are fast. The other option is CIFS, which is a loooooong way from being quick. And won't let you do another quick rsync after to make sure.

From FreeNAS(temp) to FreeNAS(perm), would you be able to mount the temp array onto the permanent box? If so, do that and use ZFS snapshot and replicate.
Otherwise, ZFS snapshot and replicate over the network - though if you can arrange for a direct network cable connection between the two boxes (on a second NIC port), all the better.
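A rough sketch of what "snapshot and replicate over the network" could look like from the command line, written as a small Python helper that only prints the commands and does not run anything. The pool names `temp-pool` and `tank`, the host `root@perm-nas` and the snapshot name `migrate` are all made up for illustration; FreeNAS also offers replication through its GUI, which this does not model.

```python
import shlex


def replication_plan(src_pool: str, dest_host: str, dest_pool: str,
                     snap_name: str = "migrate") -> list[str]:
    """Build (but do not run) the shell commands for a one-off replication."""
    snapshot = f"{src_pool}@{snap_name}"
    # Recursive snapshot of the source pool and all child datasets.
    take_snapshot = f"zfs snapshot -r {shlex.quote(snapshot)}"
    # Stream the snapshot over SSH; 'receive -F' rolls back/overwrites the target.
    send_receive = (
        f"zfs send -R {shlex.quote(snapshot)} | "
        f"ssh {shlex.quote(dest_host)} zfs receive -F {shlex.quote(dest_pool)}"
    )
    return [take_snapshot, send_receive]


# Hypothetical names; substitute your own pools and host.
for command in replication_plan("temp-pool", "root@perm-nas", "tank"):
    print(command)
```

Because `zfs receive -F` will overwrite whatever is on the destination dataset, it is worth reading the printed commands carefully before pasting them into a shell.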
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Any reason you wouldn't recommend rsync both ways? I definitely plan to record speeds and time/quantity of data while running the test. The files are not compressible so I won't be using compression.
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
Yep - replicate does error checking during transfers, while rsync does not (though it will check CRCs the next time, iff you use option -c and rsyncd on the Windows end).
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
And I was trying to figure out if rsync had any kind of error correction for the exact reason you just explained. Now I'm wondering if I should do 2 rsyncs to send the data to the server. Once to get the data there and a second to do the CRC check. Hmm...

I know someone who migrated a server a few years ago using FTP. He lost all the date/time stamps, and a lot of files were suddenly corrupt because FTP has no error protection. That didn't go too well for him :P
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
Thus the mention of the second rsync!
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
JaimieV said: "Thus the mention of the second rsync!"

I have no experience with rsync and I can't find the answer to this question...

When I do the initial rsync, will both machines generate their own list of CRCs so that the second rsync takes just a few minutes? I really don't want to do a second rsync if it'll take another 90+ hours to rescan the library of files. There actually aren't many files, only about 20k.
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
The Windows-end rsyncd *may* cache CRCs temporarily, but it doesn't store them persistently (or at least I've never met one that did) - they're calculated on the fly. I think the sequence goes:
* local:rsync asks remote:rsyncd for the file list; with CRCs enabled, the remote:rsyncd needs to read all its files locally and shares the CRCs alongside name+date etc.
* local:rsync compares the delivered list against its own; again with CRCs enabled, the local:rsync needs to read all its files locally
* local:rsync then asks remote:rsyncd to deliver diffs.

That initial phase will take as long as it takes to re-read the files at each end, a few hours probably. If you're really lucky the rsyncd might cache them long enough to do the second pass straight afterwards, but don't count on it.

Note that if you don't have an rsyncd at the remote end (i.e. you mount the SMB share and treat it as a local copy operation, but using rsync), I think that means you can't do the CRC comparisons at all.
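To make the "read every file at both ends and compare" phase concrete, here is a small Python sketch that hashes two directory trees and reports differences. This is a stand-in for illustration, not how rsync implements `-c`: it uses SHA-256 from the standard library and assumes both trees are reachable as local paths (for example a mounted share), and the file names in the demo are made up.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in 1 MiB chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(src_root, dst_root):
    """Return (relative_path, problem) pairs for missing or differing files."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    problems = []
    for src in sorted(p for p in src_root.rglob("*") if p.is_file()):
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if not dst.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif file_digest(src) != file_digest(dst):
            problems.append((rel.as_posix(), "mismatch"))
    return problems


# Tiny demonstration on throwaway directories (hypothetical file names).
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    Path(src, "good.bin").write_bytes(b"same bytes")
    Path(dst, "good.bin").write_bytes(b"same bytes")
    Path(src, "bad.bin").write_bytes(b"original")
    Path(dst, "bad.bin").write_bytes(b"corrupt!")
    Path(src, "lost.bin").write_bytes(b"never copied")
    print(compare_trees(src, dst))
    # prints [('bad.bin', 'mismatch'), ('lost.bin', 'missing')]
```

Note that every byte on both sides gets read once, which is why a verification pass over 20TB costs hours of disk I/O even when nothing needs to be re-sent.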
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, I'm going to try to go with rsync and do it twice so it verifies the files are good. Also I will look into doing replication for the return trip if only so we can get a good comparison for which is faster: rsync twice or replication. Last time I tried to use replication I couldn't get it to work quite right. But when it didn't work I didn't mess with it any further since I was on 2 VMs and wasn't sure if that was related.

So I'll reply back in 2-3 weeks when all of the data is moved with info on how it all went.
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
Best of luck!
 

ramius

Dabbler
Joined
Oct 30, 2012
Messages
17
Would it be possible to install all the hard drives into your final machine, install FreeNAS, mount your old RAID array as an NTFS volume in FreeNAS, and copy everything locally from NTFS to the temporary ZFS pool? After you have copied all the data off the NTFS volume, you can convert those drives (erase, format and create) into the definitive ZFS pool and copy everything back.
That's the way I did the migration.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
There aren't enough hard drive bays on the server to add 10 more hard drives nor enough SATA ports. :( I had thought of that despite not trusting NTFS on FreeBSD, but I'd have to use 2x24port controllers and somehow power all these hard drives with a second power supply, no hard drive bays, etc. It's just not really an option :( Trust me, I wish it were. I could move the data at about 350MB/sec if I could do that :P
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, so far this is looking like a much worse nightmare than I had thought. I set up rsync on my temporary FreeNAS server and ran the DeltaCopy client on my Windows machine. It started without a hitch... but CPU is at 25% (on a quad-core machine, so one core fully busy) and I am only sending about 30MB/sec. Network utilization sits at a rock-solid 26-27%, so I'm pretty sure that rsync is CPU bound.

So it looks like rsync is a very bad choice for an initial transfer if speed is a concern. Replication obviously isn't an option at the moment since I still have data on the Windows machine. It looks like my options right now are: to use CIFS/NFS or let rsync do what it wants to do even if it takes... until after the end of the world.

Come to think of it, it could be that compression is enabled. I don't see any option to enable or disable compression in DeltaCopy, nor do I see any discussion of it via Google.

Edit: Compression is disabled already :(. Whatever is going on, rsync.exe is CPU bound on my Xeon 2.13GHz machine. I did let it do about 50GB and then did a second sync. It took only 235ms to complete, so clearly rsync works on a list of checksums as it goes (which could explain the CPU usage needed).

Edit Again: Looking at the FreeNAS machine the CPU is only 2-6% loaded while rsync is working. I created a test CIFS share to see if I could get better performance. While the rsync was running I got speeds of over 100MB/sec through CIFS! GO F%^*%* figure. So I guess I need to decide if I want to use CIFS or let rsync continue at its abysmal speed. Doesn't CIFS have its own checksums and such to prevent corruption?
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
CIFS has no direct checksums on copy, only TCP packet-layer checksums. Perhaps do a CIFS transfer then run an rsync over the top after?

Or (if you're not pressed for time right away) try other rsync servers on the Windows side. I've not used DeltaCopy and its results seem pretty rubbish.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Sorry, I guess I should have asked if CIFS has checksums in the packets. :P But you answered that question too.

I believe DeltaCopy is just a frontend for rsync.exe. I can't find many that look much better than DeltaCopy. It seems to be the most used rsync frontend for Windows. I'll definitely keep looking though.

I did try copying some files using CIFS and then verifying them through rsync. Rsync didn't seem to realize that the files were there and instead copied them again. :( Not sure if there's a way to get rsync to "index" its folder. I rebooted the machine between copying the files with CIFS and trying to initiate the rsync transfer.

Going to look at other rsync options for Windows.
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
Rsync checks the filesystem contents as it starts up, but you may need to add the --modify-window=x parameter. The manpage refers to FAT but I've needed it for NTFS also.

"In particular, when transferring to or from an MS Windows FAT
filesystem (which represents times with a 2-second resolution),
--modify-window=1 is useful (allowing times to differ by up to 1
second)."

(Or use '-I' ignore timestamps along with '-c' use checksums, should have the right effect. You are already using '-a' for archive, right? Otherwise datestamps get screwed anyway)
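As a rough illustration of why the timestamp matters: by default rsync skips a file only when its size and modification time match on both sides, and `--modify-window` loosens the time comparison by a few seconds. The Python sketch below mimics that decision with made-up sizes and timestamps; it is a simplification, not rsync's actual code.

```python
def quick_check_matches(src_size: int, src_mtime: int,
                        dst_size: int, dst_mtime: int,
                        modify_window: int = 0) -> bool:
    """True if the file would be skipped: same size, mtimes within the window."""
    return src_size == dst_size and abs(src_mtime - dst_mtime) <= modify_window


# Hypothetical file: 1000 bytes, timestamps in whole seconds.
print(quick_check_matches(1000, 1_350_000_000, 1000, 1_350_000_001))                   # False
print(quick_check_matches(1000, 1_350_000_000, 1000, 1_350_000_001, modify_window=1))  # True
# A copy stamped an hour later is far outside a 1-second window.
print(quick_check_matches(1000, 1_350_000_000, 1000, 1_350_003_600, modify_window=1))  # False
```

So `--modify-window=1` only helps with rounding differences between filesystems. If the copy was stamped with a completely different time, the file still looks changed unless you compare checksums instead.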

Another thing to try is to push the data from local:Windows to remote:FreeNAS(temp) - might make a difference.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
NOTE: Going to include a lot of extra stuff from now on in case someone wants to go back and review this for a project they are working on in the future...

Well, I gave a program called cwRsync 4.0.5 a chance. Basically it's just rsync with a batch file and some helpful notes on how to do it yourself. The default is to just dump the files over to the destination. With the default parameter of '-r', CPU usage went to 25% and network usage went to 50%. There seemed to be some wall, as it would sit right at 49.99% to 50.01% network usage almost the entire time. On the FreeNAS server, CPU usage for rsync went to 25-27%. I'm getting the impression that rsync doesn't do multi-threading well, if at all, on at least one end but probably both. Google searching seems to support my observation. Anyway, at the end of my test transfer it said:

sent 9811510665 bytes received 46 bytes 53468723.22 bytes/sec
total size is 9810312922 speedup is 1.00
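For future readers, a short Python sketch of how those summary numbers relate to each other. It takes the figures from the output above as given and assumes decimal megabytes and a nominal 1 Gb/s link; the variable names are mine.

```python
# Figures copied from the rsync summary above.
sent = 9_811_510_665          # bytes sent
received = 46                 # bytes received
rate = 53_468_723.22          # bytes/sec reported by rsync
total_size = 9_810_312_922    # total size of the files

transferred = sent + received
elapsed_s = transferred / rate        # wall time implied by the reported rate
speedup = total_size / transferred    # ~1.00 means nothing was saved by deltas
mb_per_s = rate / 1e6                 # decimal megabytes per second
link_share = rate * 8 / 1e9           # fraction of a nominal 1 Gb/s link

print(f"elapsed : {elapsed_s:.1f} s")      # elapsed : 183.5 s
print(f"speedup : {speedup:.2f}")          # speedup : 1.00
print(f"rate    : {mb_per_s:.1f} MB/s")    # rate    : 53.5 MB/s
print(f"link    : {link_share:.1%}")       # link    : 42.8%
```

A speedup of 1.00 confirms that the whole data set went over the wire, which is what you would expect for a first copy to an empty destination.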

It does seem to reupload the files every time I run the batch file. Obviously this is a complete waste of time if you want to do a second check of rsync later to verify your files are good before you dump your old system.

I reviewed the parameters and added '-v -c -W'. This worked better in one respect and worse in another. It seems to compile a list of checksums before any transfers begin, but it doesn't keep that list afterwards. So if I run it once, it'll spend hours (probably more like 24+ hours) calculating checksums for all of my data, find no files on the destination, and then copy everything at 50MB/sec or so. Obviously the average speed including the processing time will be very low. I tried various combinations of the parameters, but it just doesn't seem to perform at a decent speed.

Since I've come to the realization that rsync is single-threaded, it looks like rsync is not a good fit for this project, given the need to finish in less than 20 days. I will admit that for backup purposes, or for syncing where the time frame isn't critical, rsync with DeltaCopy would have been an excellent fit. Just set it up and click go. When it's done you click go a second time to verify all is good and you are done. The only drawback is that you have no indication of % complete, estimated time remaining, or any other useful statistic to give you a clue how long to twiddle your thumbs.

One thing I'd like to note: DeltaCopy has a field "Max run time : 259200000 (ms)", or 72 hours. In my case it would have to run for more than 72 hours. I think this means that at the 72 hour mark my transfer would end prematurely and I'd be stuck having to restart the transfer. I didn't see any way to change this setting. I'm also not sure if it would be able to pick up where it left off or if it would start all over.

It looks like my only option now is to just copy the data using CIFS and wait for it to finish. On the plus side, using this will give me an indicator of how much has copied and how much is left.
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
I think I ended up using cwRsync last time I needed to do Windows rsyncing, and that was as a push from Windows to NAS as it didn't have a daemon component. I had completely forgotten until this conversation that I'd gone through half a dozen different Windows rsyncs trying to find one that worked at greater than a snail's pace - it was a few years ago.

I'm bemused about the way you're finding that rsync ignores existing data. That's just broken. Do try again with the --modify-window=1 parameter, if only out of interest!

You can get rsync to give you progress info using -v (or -vv, -vvv etc for more).
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
JaimieV said: "I'm bemused about the way you're finding that rsync ignores existing data. That's just broken. Do try again with the --modify-window=1 parameter, if only out of interest!

You can get rsync to give you progress info using -v (or -vv, -vvv etc for more)."

I think that what's happening is CIFS is changing the time stamp on the files when it copies them and rsync defaults to using date/time/file size to determine if the file is identical. If using rsync the date/time is preserved. For CIFS it isn't so rsync is immediately rejecting the "alien" files out of hand. Of course I could tell it to do file checksums, but then I'll have the rather lengthy penalty of waiting for it to checksum every darn file. The I/O really adds up after a while. :P

The reality of this whole thing, in my opinion, is that I'm trying to go from a Microsoft product to a non-Microsoft product. I think that when I try to go back to the new setup with replication all of these small problems will go away.

I will try the modify-window=1 and report back.

One thing I found fascinating is that I ran iperf between the 2 machines over a direct crossover cable and could only just reach 700Mb/sec after more than an hour of tweaking Windows NIC settings like enabling/disabling checksums, large send offload, etc. Once I had given up and accepted that 700Mb/sec was all I would get, I ran a CIFS test, and CIFS smoked those numbers. In another thread I got into a long discussion about iperf and how it is THE standard for network bandwidth testing. I'm really questioning iperf's usefulness on Gb LANs, since I was able to use CIFS to beat iperf's best value by more than 20%. At one point I was moving data at 126MB/sec over CIFS, as shown in the Windows copy window.
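Since iperf reports megabits and the copy window reports megabytes, here is a quick Python sketch putting the two on the same scale. It assumes decimal units and ignores protocol overhead, so treat the percentages as rough.

```python
def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8


def mbyte_to_mbit(mbyte_per_s: float) -> float:
    """Megabytes per second to megabits per second."""
    return mbyte_per_s * 8


iperf_mbit = 700  # best iperf result quoted above
print(f"iperf: {iperf_mbit} Mb/s = {mbit_to_mbyte(iperf_mbit)} MB/s")  # 87.5 MB/s

for cifs_mbyte in (100, 126):  # CIFS figures mentioned in this thread
    cifs_mbit = mbyte_to_mbit(cifs_mbyte)
    gain = (cifs_mbit / iperf_mbit - 1) * 100
    print(f"CIFS {cifs_mbyte} MB/s = {cifs_mbit} Mb/s ({gain:.1f}% above iperf)")
```

On that scale 100MB/sec is 800Mb/sec (about 14% above the iperf figure) and 126MB/sec is 1008Mb/sec (about 44% above). The second number is at or slightly beyond the raw 1000Mb/sec line rate, so it is probably best read as a momentary peak rather than a sustained rate.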
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
The --modify-window=1 didn't change anything. It immediately began overwriting my CIFS copied files with different creation dates/times. Rsync just doesn't miss a thing.

Since this entire drive is shared in Windows, I wonder if I could mount it in a FreeNAS VM on the Windows server and then do an rsync from the FreeNAS VM to the physical FreeNAS machine. The VM is already set up and I use it to experiment with and break FreeNAS.
 