slow write performance on freenas mini

Status
Not open for further replies.

Vick Khera

Dabbler
Joined
Jul 8, 2015
Messages
17
I'm putting a FreeNAS Mini in place of an older hand-built server I had running stock FreeBSD 9.3 for some data archiving. Writes on the FreeNAS box are much slower than on the old box when archiving my data.

The old system is a 12GB RAM Xeon 5130 2.0GHz Dell with 4x 1TB Hitachi SCSI drives on an LSI (non-raid) controller.

The new system has 32GB RAM, an Atom C2750 2.40GHz processor, and 4x 4TB WD Red drives on the motherboard SATA 3 ports.

The nightly archive comes from my data center to the main office over a VPN connection that can saturate the raw link speed. When archiving to the old server, this rsync copy would take approximately 2-2.5 hours at 4am. When archiving to the FreeNAS Mini box, it is taking close to 5 hours.

At first I suspected it was the network so I tested copying a file from the old box (on the same switch as the Mini):

[filer]% scp yertle:/u/network-backups/old/database-dumps/rt4/rt4.rtmaster.2015-07-20.dump .
rt4.rtmaster.2015-07-20.dump 100% 2437MB 31.3MB/s 01:18
[filer]% scp yertle:/u/network-backups/old/database-dumps/rt4/rt4.rtmaster.2015-07-20.dump /dev/null
rt4.rtmaster.2015-07-20.dump 100% 2437MB 40.0MB/s 01:01

So when I write to /dev/null, I get roughly 28% better throughput (40.0 vs. 31.3 MB/s). Is the Mini expected to cap out on a single-threaded write at 32-ish MB/s? The other day I was capping out at about 12Mb/s and it made moving files from the old machine to the Mini take a *really* long time.
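For reference, the gap between the two scp runs can be worked out directly from the reported rates. This is a minimal sketch, assuming a POSIX shell with awk available; the helper name `pct_faster` is made up for illustration and is not part of any tool mentioned here.

```shell
# pct_faster FAST SLOW: how much faster FAST is than SLOW, in percent.
# Hypothetical helper for illustration only.
pct_faster() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", (fast - slow) / slow * 100 }'
}

# /dev/null run (40.0 MB/s) vs. the run that wrote to the pool (31.3 MB/s)
pct_faster 40.0 31.3
```

With the two rates above this comes out to about 27.8%, so the disk write path appears to cost a bit more than a quarter of the scp throughput in this one test.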

Based on hardware specs alone, I'd expected the Mini to blow away the old box on write speed. The Mini has encryption on the disks using GELI. Is that enough to make it this ridiculously slow?

I've read over the forums for speed issues, but I did not find anything that was suitable for me to try. There's no hyperthreading on this processor, and the sysctls are all seemingly in order, and the network is not "busted".
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Your performance seems pretty slow in comparison with my Mini's performance.

Can you post a debug file? System -> Advanced -> Save Debug
 

Vick Khera

Dabbler
Joined
Jul 8, 2015
Messages
17
Here's the debug file. Anything in particular I might look for?
 

Attachments

  • debug-filer-20150728093140.tgz
    449.8 KB · Views: 283

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Not really. I just look through everything and look for anything that seems out of place, configured in some abnormal way, etc.

Let me look through the debug file. I'll reply in a few minutes.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Here are my notes on what I saw:

Regular SMART short tests are being performed, but no long tests in over 600 hours. I'd have expected at least one in 600 power-on hours.
The boot device was formatted in April, but had its first scrub just a few weeks ago. Is the scrub schedule for the boot device set to something reasonable?

Nothing appears wrong or even out of place.

We don't use scp for testing in the forums for throughput because things like the encryption used for the ssh tunnel have a serious impact on performance.

Try creating a dataset that has no compression set and make a testfile with dd. Something like "dd if=/dev/zero of=testfile bs=1M count=100k" should be fine. Then do the reverse, "dd if=testfile of=/dev/null bs=1M count=100k". Post the output of those two commands.

rsync has lots of settings that you can play with. You can choose to do checksum compare or not, you can choose to compare by date/time stamps or not. If you do choose checksum compare then rsync tasks can take *significantly* longer. And by significantly I mean orders of magnitude more in some cases.

How specifically is the backup at 4AM being done? Over scp? Over NFS? Over rsync? You should be performing your test using the same protocol that you are having issues with.

If you are capping out at 12MB/sec, that's suspiciously close to what a 100Mb/s link delivers. Could you have a networking problem with cabling or the switch?
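The arithmetic behind that suspicion is just a bytes-to-bits conversion. A minimal sketch, assuming a POSIX shell with awk; the function name `mbytes_to_mbits` is hypothetical:

```shell
# mbytes_to_mbits MBPS: convert megabytes/sec to megabits/sec (1 byte = 8 bits).
# Hypothetical helper for illustration only.
mbytes_to_mbits() {
    awk -v mbytes="$1" 'BEGIN { printf "%.0f\n", mbytes * 8 }'
}

mbytes_to_mbits 12
```

12 MB/s works out to 96 Mb/s. Since protocol overhead keeps the usable payload rate somewhat below the raw link speed, that is about what a link negotiated at 100Mb/s would deliver.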
 

Vick Khera

Dabbler
Joined
Jul 8, 2015
Messages
17
The system was purchased in June. I suppose the root pool was built when the machine was built and sitting in a box for a couple of months. It has not been running one full month yet, which is the default zfs scrub time (I think it may be 35 days in the scrub script). The main pool was re-created on July 7.

The nightly backups are done via rsync over ssh. The same rsync to my old freebsd box took half as much time, which is why I am concerned.

I was suspecting a network problem, but igb0 says it is 1000baseT according to ifconfig. The switch also shows a Gig-E connection. Just for kicks, I moved the port on the switch.

Here's the dd test output:

[filer]% dd if=/dev/zero of=testfile bs=1M count=100k
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 421.468363 secs (254762141 bytes/sec)
[filer]% dd if=testfile of=/dev/null bs=1M count=100k
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 291.218825 secs (368706187 bytes/sec)

The disks seem fast enough... I was observing upwards of 120MB/s writes and reads sustained per drive during these tests.
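To put the dd numbers in the same units as the scp figures, the totals can be converted to MB/s. A minimal sketch, assuming a POSIX shell with awk; `rate_mb` is a made-up helper name and it uses decimal megabytes (10^6 bytes):

```shell
# rate_mb BYTES SECONDS: average throughput in MB/s (decimal, 10^6 bytes).
# Hypothetical helper for illustration only.
rate_mb() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

rate_mb 107374182400 421.468363   # write test
rate_mb 107374182400 291.218825   # read test
```

That gives roughly 254.8 MB/s for the write and 368.7 MB/s for the read. Both are well above the 125 MB/s a gigabit link can carry at most, and far above the 31.3 MB/s seen with scp, which suggests the local pool is not the limiting factor in these tests.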
 

Vick Khera

Dabbler
Joined
Jul 8, 2015
Messages
17
Hmmm... am I double-encrypting my drives?

I see this in the dmesg.boot file:

GEOM_RAID5: Module loaded, version 1.3.20140711.62 (rev f91e28e40bf7)
GEOM_ELI: Device gptid/7539c499-24dc-11e5-aaaa-d05099648992.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI: Crypto: hardware
GEOM_ELI: Device gptid/75b2585d-24dc-11e5-aaaa-d05099648992.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI: Crypto: hardware
GEOM_ELI: Device gptid/7627f3ae-24dc-11e5-aaaa-d05099648992.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI: Crypto: hardware
GEOM_ELI: Device gptid/769c25e2-24dc-11e5-aaaa-d05099648992.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI: Crypto: hardware

GEOM_ELI: Device ada0p1.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI: Crypto: hardware
GEOM_ELI: Device ada1p1.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI: Crypto: hardware
GEOM_ELI: Device ada3p1.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI: Crypto: hardware
GEOM_ELI: Device ada4p1.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI: Crypto: hardware

The "gptid" devices in the first hunk of ELI logs are the names of my zpool devices, and the ada devices are of course the raw hardware devices.

tank                                                ONLINE  0  0  0
  raidz1-0                                          ONLINE  0  0  0
    gptid/7539c499-24dc-11e5-aaaa-d05099648992.eli  ONLINE  0  0  0
    gptid/75b2585d-24dc-11e5-aaaa-d05099648992.eli  ONLINE  0  0  0
    gptid/7627f3ae-24dc-11e5-aaaa-d05099648992.eli  ONLINE  0  0  0
    gptid/769c25e2-24dc-11e5-aaaa-d05099648992.eli  ONLINE  0  0  0


[filer]% glabel status
Name                                        Status  Components
gptid/7539c499-24dc-11e5-aaaa-d05099648992  N/A     ada0p2
gptid/75b2585d-24dc-11e5-aaaa-d05099648992  N/A     ada1p2
gptid/e993e767-ed30-11e4-b4f3-d05099648992  N/A     ada2p1
gptid/7627f3ae-24dc-11e5-aaaa-d05099648992  N/A     ada3p2
gptid/769c25e2-24dc-11e5-aaaa-d05099648992  N/A     ada4p2


[filer]% geli status
Name                                            Status  Components
gptid/7539c499-24dc-11e5-aaaa-d05099648992.eli  ACTIVE  gptid/7539c499-24dc-11e5-aaaa-d05099648992
gptid/75b2585d-24dc-11e5-aaaa-d05099648992.eli  ACTIVE  gptid/75b2585d-24dc-11e5-aaaa-d05099648992
gptid/7627f3ae-24dc-11e5-aaaa-d05099648992.eli  ACTIVE  gptid/7627f3ae-24dc-11e5-aaaa-d05099648992
gptid/769c25e2-24dc-11e5-aaaa-d05099648992.eli  ACTIVE  gptid/769c25e2-24dc-11e5-aaaa-d05099648992
ada0p1.eli                                      ACTIVE  ada0p1
ada1p1.eli                                      ACTIVE  ada1p1
ada3p1.eli                                      ACTIVE  ada3p1
ada4p1.eli                                      ACTIVE  ada4p1
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
You should also try testing performance over the local network. This will eliminate the VPN as a variable.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yeah.. everything looks okay.
 

Robert Smith

Patron
Joined
May 4, 2014
Messages
270
What disks did the old server have exactly?
It is not unreasonable to expect 15K SCSI disks to be twice as fast as 5K4 SATA… Just thinking out loud…

Furthermore, maybe it is unfair, but in my mind the first association to Atom is “slow as molasses.” Take a look at the following thread, if you have not seen it already:
https://forums.freenas.org/index.php?threads/how-much-does-the-processor-affect-performance.15813/

I was trying to find benchmarks of an encrypted pool with parity on the C2750, and it made me realize how hard it is to search these forums, because the search keeps matching folks' signatures. Maybe somebody else has stronger search Kung-Fu. LOL. Or maybe you are the first to benchmark this combination.

There is a fellow reporting “90 to 100 mbps” (which would translate to 11 to 13 MB/s, if he meant Megabits per second) in the following thread, but he was doing the test with one 3GB file, which would not be enough for a sustained throughput test.
https://forums.freenas.org/index.ph...558f-freenas-compatibility.17462/#post-100137
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I just want to put this out there...

1. Your dd speeds look just about what I'd expect.
2. I don't consider your scp benchmarks to really mean much of anything (sorry). scp was never meant to be fast. It was meant to be secure through SSH, and that encryption has overhead. scp is also very sensitive to latency. A small amount of latency translates to a crapload of lost throughput. Also, rsync doesn't *have* to run through SSH tunnels; you could use rsync modules. That will likely be faster, but lacks the security, which you may or may not care about. I know when I've copied files over scp, even when using my Xeon E3-1230 v2, I don't think I've ever seen 50MB/sec. So I tend to think that scp throughput is just crappy slow. Also see the next bullet.
3. I'm not convinced we are talking apples-to-apples when you compare the Mini to the old server. The way FreeNAS handles the scp connections, the bits of encryption and type of encryption, and other factors likely make this an apples-to-oranges comparison. Now I agree that ultimately the choices of encryption and other factors matter less to you than "I used system A for X and I want to use system B for X but it is slower" but I don't think that's a fair comparison in the bigger picture. FreeNAS' devs have already chosen the type of encryption, the number of bits for encryption, and other factors for you. They may be conservatively choosing something that is almost unhackable while your FreeBSD 9.3 system may have very low bits of encryption or uses a type of encryption that is not secure (read: not CPU intensive).
4. GELI uses the AES-NI instruction set. Unless yours is disabled in the BIOS, GELI's performance penalty should be nearly non-existent.

What you might want to do is look at top and see what your CPU usage is doing during an scp transfer (or better yet during an rsync transfer). That will give some hints about whether you are CPU limited or not.
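One way to do that check is sketched below. It assumes FreeBSD's top, where `-S` includes system (kernel) processes, `-H` shows individual threads, `-P` shows per-CPU usage and `-s` sets the refresh delay; the wrapper name `watch_cpu` is made up for illustration.

```shell
# Hypothetical wrapper: per-CPU, per-thread view including kernel threads,
# refreshed every second. Run it in a second ssh session while the rsync
# or scp transfer is in progress.
watch_cpu() {
    top -SHP -s 1
}

# Usage (interactive, press q to quit):
#   watch_cpu
```

If a single sshd or rsync thread sits near 100% of one core while the other cores are mostly idle, the transfer is probably limited by single-threaded CPU work rather than by the disks or the network. GELI does its work in kernel threads, so those should only be visible with `-S`.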
 