zfs-native on Ubuntu w/o ECC (I know...)

Status
Not open for further replies.

KevinM

Contributor
Joined
Apr 23, 2013
Messages
106
My main desktop PC runs Linux Mint 14. The motherboard is an Asus P6T with an i7 930 and 24 GB of non-ECC RAM. My /home partition lives on a 3ware 9650-4lp hardware RAID controller with a 3 TB RAID5 array. My FreeNAS box exports 3 TB over NFS, which I mount locally and use as an rsync backup target.

The issue: I used to have two partitions on the RAID controller, and rsyncs from my 2 TB /home partition took about 2.5 hours. Now that /home has the array to itself, the rsync takes about 4 hours 40 minutes. That is so slow that I only run it once a week.

I came across an M1015 on Amazon for cheap (well, OK, $129), and I have a couple of spare 1 TB drives lying around, so I can put /home on a RAIDZ2 volume in the Ubuntu box. So I'm thinking that when 14.04 ships in April I can install zfs-native on Ubuntu and use zfs send for the backups. I can't imagine that taking almost five hours to replicate.
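
For reference, a minimal Python sketch of what an incremental zfs send backup could look like. It only builds the command lines; the dataset, snapshot, and host names are made up for illustration, and it assumes the backup box accepts ssh logins and already holds the earlier snapshot.

```python
import shlex


def snapshot_command(dataset, snap):
    """Command that takes a new snapshot, e.g. tank/home@tuesday."""
    return ["zfs", "snapshot", f"{dataset}@{snap}"]


def incremental_send_pipeline(dataset, prev_snap, new_snap, remote, target):
    """Build a shell pipeline that sends only the blocks changed
    between prev_snap and new_snap to a remote pool over ssh."""
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # -F rolls the target back to its latest snapshot before receiving
    recv = ["ssh", remote, "zfs", "receive", "-F", target]
    return " | ".join(shlex.join(cmd) for cmd in (send, recv))


# Hypothetical names, for illustration only
print(incremental_send_pipeline(
    "tank/home", "monday", "tuesday", "root@freenas", "backup/home"))
```

Because the send stream contains only the difference between the two snapshots, nothing has to walk the whole filesystem comparing files, which is where the hoped-for time saving would come from.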

The question: The Asus P6T does not support ECC memory. Wife is pregnant, so gutting the box for shiny Supermicro bits is not in the cards right now. Is the lack of ECC support enough of a liability to outweigh the increase in backup performance?

BTW, I'm using VirtualBox for the virtual machines. The backup script finds which ones are running and pauses them before running the backup, then unpauses them when it completes. It was a little tricky to get working, so I'm attaching it in case someone finds it useful.
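
The attached script is not reproduced in the thread. As a rough illustration of the pause/backup/resume idea only (not the attachment itself), here is a Python sketch that assumes VBoxManage is on the PATH and that `VBoxManage list runningvms` prints one `"name" {uuid}` line per running VM. The function names are made up.

```python
import re
import subprocess


def parse_running_vms(listing):
    """Extract VM names from `VBoxManage list runningvms` output.

    Each line is expected to look like: "name" {uuid}
    """
    return re.findall(r'^"(.+)" \{[0-9a-fA-F-]+\}$', listing, flags=re.MULTILINE)


def run_backup_with_vms_paused(backup_cmd):
    """Pause every running VM, run backup_cmd, then resume the VMs."""
    listing = subprocess.run(
        ["VBoxManage", "list", "runningvms"],
        capture_output=True, text=True, check=True,
    ).stdout
    vms = parse_running_vms(listing)
    for vm in vms:
        subprocess.run(["VBoxManage", "controlvm", vm, "pause"], check=True)
    try:
        subprocess.run(backup_cmd, check=True)
    finally:
        # Resume even if the backup failed; keep going if one resume fails
        for vm in vms:
            subprocess.run(["VBoxManage", "controlvm", vm, "resume"], check=False)
```

The try/finally is the part that matters: the VMs get resumed even when the backup command exits with an error.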
 

Attachments

  • nightly_backup.txt
    2.7 KB · Views: 237

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Rsync over NFS? You do realize rsync runs its own protocol over network connections on purpose. Pretty sure problem #1 with it being slow is that you aren't using rsync in the way it was designed to minimize network traffic. You shouldn't need to do NFS over the network. You should be doing rsync over the network. Of course it's going to be slow if you do rsync over NFS. You'll literally be transferring the entire contents of the NFS share over the network with each rsync!
 

KevinM

Contributor
Joined
Apr 23, 2013
Messages
106
No, this is wrong. Rsync works by checksumming the data to see what's changed, and then only the changes are sent over the network. What you're talking about sounds like scp.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
No, YOU don't know what rsync does. This is also why the rsync tasks in FreeNAS offer rsync module and rsync over ssh as "modes".

The "PUSH" and "PULL" do delta compares using date/time stamps and/or checksumming. PUSH calculates for its data, PULL calculates for its data. The only data exchanged is the stamps/checksumming and the files that contain data found to be out of sync. PERIOD.

In your configuration, you are asking one machine to delta the two copies and update them accordingly. So your rsync machine reads all of the PUSH data, then all of the PULL data (over NFS), and then sends the update.

In the first example, if you have 100MB of data that's changed in a 500MB dataset, then about 100MB + some stamps/checksums will go across the network.

In your example, if you have 100MB of data that's changed in a 500MB dataset, then 500MB of data, plus the 100MB of data that's changed, will go across the network. Guess which is faster when you are talking about TB of data? ;)
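
The arithmetic in the two examples above can be written out as a small Python sketch. It follows the model described in this post, where rsync against an NFS mount has to read the whole destination over the network. The function names are made up, and real traffic depends on rsync options, for example whether files are compared by checksum (-c) or only by size and modification time.

```python
def traffic_rsync_protocol_mb(changed_mb, overhead_mb=0.0):
    """Both ends checksum their own data; only changes plus
    stamps/checksums (overhead_mb) cross the network."""
    return changed_mb + overhead_mb


def traffic_rsync_over_nfs_mb(dataset_mb, changed_mb):
    """One machine does all the comparing: it reads the whole
    destination over NFS, then writes the changed data back."""
    return dataset_mb + changed_mb


print(traffic_rsync_protocol_mb(100))      # 100 MB plus checksums
print(traffic_rsync_over_nfs_mb(500, 100)) # 600 MB
```

Scaled to a 2 TB /home, the gap between the two models is what the rest of this post is getting at.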

The whole point of rsync was to minimize traffic across a network connection while ensuring consistency. You've taken that main feature of rsync away by using an NFS share and making your one machine do all of its own rsync checksumming.

If you set up rsync more optimally, as I am trying to explain, then you will see performance increase significantly. Depending on the processors on both ends and where the bottleneck ends up with the configuration I'm recommending, you *may* even see the rsync finish in about half the time of the original system.

Please read up on how rsync works, in depth. You'll see that you aren't using it in the most productive fashion with your setup. You've removed the potential speed gains by using NFS.
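
As a concrete illustration of "rsync over the network", here is a minimal Python sketch that builds an rsync-over-ssh command line instead of pointing rsync at an NFS mount. The user, host, and paths are hypothetical, and it assumes ssh access to the backup box is already set up.

```python
def rsync_over_ssh(src, user, host, dest):
    """Build an rsync command that talks to a remote rsync over ssh,
    so each side scans and checksums its own copy of the data."""
    return [
        "rsync",
        "-a",           # archive mode: recurse, keep perms, times, links
        "-e", "ssh",    # use ssh as the remote shell transport
        src,
        f"{user}@{host}:{dest}",
    ]


# Hypothetical names, for illustration only
cmd = rsync_over_ssh("/home/", "backup", "freenas", "/mnt/tank/backup/home/")
print(" ".join(cmd))
```

The user@host:path destination is what makes rsync start a second rsync process on the remote end; a plain local path such as an NFS mount point does not.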
 

KevinM

Contributor
Joined
Apr 23, 2013
Messages
106
May I ask what you hope to accomplish with this incessant flaming? You are doing the FreeNAS project a disservice.

Having said that, I already know this isn't the optimal way to do rsync backups. I never for one second argued that it was. As it happens I find it convenient to have the backups locally mounted, and I don't care if it takes two hours to back up my home partition in the middle of the night. However, I do care if it takes four and a half hours. As explained in the original post, this is what happened when I repartitioned the source RAID5 array.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
First, I didn't mean for that to sound like a flame. It wasn't my intention. I was just trying to explain that your setup will cause slow performance. If you want to keep that configuration, you should accept that. If you want to do something else, it's on you to accept (or reject) the performance you get. You're the admin. You get to do it any way you want!

Again, sorry. It wasn't my intention to be inflammatory.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
May I ask what you hope to accomplish with this incessant flaming? You are doing the FreeNAS project a disservice.

Look. We get this question a lot. Cyberjock (and to be honest, I as well) don't get gold stars for diplomacy. Here's why:

Cyberjock does this for free. He spends, what, 7-10 hours per DAY answering questions on FreeNAS for no compensation whatsoever. He is under no obligation to swallow his annoyance when a user has pissed him off. I know, it would be "better" if Cyberjock and some other people didn't resort so rapidly to going into semi-flame mode, but it is what it is. Either Cyberjock gets replaced by someone who will perform this much service for the community for free (so far, no takers), or the FreeNAS people pay him and the other subject-matter experts who answer questions, in which case he would be compelled to be more diplomatic.

So, it is what it is. When people ask him questions, they are getting a very rare access to a subject matter expert for $0, who is in no way affiliated with the project, and who is in no way compensated for his time.

The value he brings is more than demonstrated by the corpus of his posts. If you don't like his attitude, complain to the FreeNAS devs. Then it will be on them to assess whether or not they want to continue having his services with his mouth, or not have his services at all. Until then, do us all a favor, and don't chastise Cyberjock or anyone else for their attitude. They're trying to help you, and they're ornery. That's it.
 

KevinM

Contributor
Joined
Apr 23, 2013
Messages
106
First, I didn't mean to sound like a flame. It wasn't my intention. I was just trying to explain that your setup will cause slow performance. If you want to keep that configuration, you should accept it. If you want to do something else, you are on your own to accept(or reject) the performance you get. You're the admin. You get to do it any way you want!

Again, sorry. It wasn't my intention to be inflammatory.


No problem. I get a little touchy myself sometimes. Let's forget about it.

Anyway I do like having online backups, either as locally mounted partitions or read-only ZFS snapshots. I have a couple of spare hard drives and can swing an M1015 for a 6-drive RAIDZ2, but I can't justify a new PC right now to get ECC. I'm wondering how unsafe ZFS is without ECC, e.g., is it less safe than other filesystems, or is it just not safer anymore?
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
There are a thousand answers to this, all contentious.

I would personally say, and I think cyberjock would largely agree, that "ZFS with inappropriate hardware is potentially more risky than other file systems". A lot of extremely smart people disagree with that statement though, including some of the smartest people we have in IRC.

So it's hard to say. ZFS with crap hardware, vs. NTFS? I pick the latter. But I can't defend that decision with science that would withstand hard scrutiny.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Your question is part of a protracted debate. Typically, failed non-ECC RAM means your pool and any of its backups are done for, without exception. Also, ZFS expects RAM to basically be error-resistant.

So, this is where you get to decide how "error prone" your RAM is. If you think it's safe, go for it. I will never build a ZFS system for anyone that doesn't have ECC, nor will I ever recommend non-ECC RAM in a system that uses ZFS. It's personal opinion for the most part.

If it weren't for the fact that your backups get trashed when non-ECC RAM goes bad, I'd probably have a different opinion. For me, the fact that you can and will trash your backups no matter what solution you use (online or offline, rsync or ZFS replication, manual or automatic) is a big "FAIL" stamp in my book. If somehow the backups wouldn't get damaged, I might waver a little more on the use of non-ECC RAM with ZFS. But that's not reality, and ZFS isn't ever going to solve that problem. A technical solution for bad RAM already exists, is well implemented, and has a very small performance penalty. It's called "ECC RAM".
 