I/O problems?

Status
Not open for further replies.

Junicast

Patron
Joined
Mar 6, 2015
Messages
206
Hi,

I'm quite new to FreeNAS, so please forgive me if I've overlooked something.

This is my hardware:
- Gigabyte P67-UD3-B3
- 16 GB RAM
- Intel i5-3470S
- 1 x SSD 100GB for OS
- 5 x 2 TB in RAIDZ with GELI encryption (CPU has AES-NI)
- 2 x 16 GB flash drives as a mirror, but not used yet.
This is my Software:
- FreeNAS-9.3-STABLE-201503071634

Linux Client Hardware
- i7 2600
- 16 GB RAM
- Fully gigabit Ethernet wired
- 1 TB HDD

Linux Client Software
- Xubuntu 14.04

Problems:
a) When writing lots of data I get wildly fluctuating transfer rates: peaks of 115 MB/s that hold for at most 10 seconds, then the rate drops to nearly zero for a few seconds. I have transferred a couple of TB and it stays volatile throughout. I've already played a bit with rsize and wsize, but the server always overrides the setting back to 64k.
The source is definitely not the problem: a transfer from the Linux client to a different file server on the same network runs at a constant ~100 MB/s.
-> The attached screenshots show this with transfers of mostly larger files.
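For reference, the NFS mount on the client looked roughly like this (the mount point is just an example; these are the rsize/wsize values I tried before the server forced them back to 64k):
Code:
sudo mount -t nfs -o rsize=131072,wsize=131072 10.10.101.101:/mnt/volume1/storage1 /mnt/nas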

b) The WebGUI is often sluggish, especially when going to the jails or plugins configuration. When the server is under load, the plugin pages sometimes time out and won't display any results. From /var/log/messages:
Mar 8 20:37:28 filer manage.py: [freeadmin.navtree:560] Couldn't retrieve https://10.10.101.101/plugins/transmission/3/_s/treemenu: timed out

c) When using the GUI and clicking through the menus (again, mostly plugins or jails), audio playback on a single client stalls -- simply because I clicked in the GUI to load the plugins list, for example.

I'm simply not able to determine the source of the problem. My CPU is not really getting busy. To me it mostly looks like an I/O problem, but WHY?
 

Attachments

  • Screenshot - 08.03.2015 - 21:01:39.png (42.8 KB)
  • Screenshot - 08.03.2015 - 21:00:12.png (22.4 KB)

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Well,

Things that come to mind: aside from the fact that you have sufficient RAM, everything else about your hardware is like a crash course in what not to do for FreeNAS. That board has a P67 chipset (near the bottom of the list of what we'd recommend), and it has a Realtek 8111E LAN port--which is what I assume you're using--and that *IS* at the bottom of the list of onboard NICs we suggest for FreeNAS, because of their poor performance. But I'd still expect things to 'work'.

Let's go ahead and say, with respect to your network throughput issues, "sorry, sir, but you're on your own there; you are using precisely the right NIC that experience shows will have poor or spotty performance with FreeNAS".

However, the type of hanging that you're talking about is totally out of proportion to anything reasonable. I run FreeNAS on a $45 G3220 CPU, and everything is blink-of-eye.

Things I might check (rough sketches below):

* Run full, long SMART tests on every drive in the pool. If the drives have problems, everything may get slow.
* Make sure there's no corruption of your system files. There is a "verify install" button somewhere (I think it's in System -> Boot or System -> Upgrade).
* Rule out network problems: a bad cable, or wrong Ethernet auto-negotiate/duplex settings.
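Rough sketches of what I mean (device names and the client IP are placeholders; adjust for your setup):
Code:
# start a long SMART self-test on each pool disk, then check the results once it finishes
smartctl -t long /dev/ada0
smartctl -a /dev/ada0

# raw network throughput, leaving disks and ZFS out of the picture
iperf -s                      # on the FreeNAS box
iperf -c 10.10.101.101 -t 30  # on the Linux client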

I hope this is helpful. The hanging up, just when moseying around the GUI, is completely abnormal and should not be happening. Something is wrong, and I'd start with hardware and network issues before I started jacking around with FreeNAS's configuration.
 

Junicast

Patron
Joined
Mar 6, 2015
Messages
206
I'm actually using two Intel 82574L gigabit adapters. Forgot to mention that, sorry :smile:
They are connected to my switch with a LAG (LACP) and VLAN tagging.
I'd be willing to exchange my mainboard for a better one; could you point me to a hardware compatibility list, please? I didn't know the P67 was that bad :smile:

I will make sure the network is not the cause.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I'm actually using two Intel 82574L gigabit adapters. Forgot to mention that, sorry :)
They are connected to my switch with a LAG (LACP) and VLAN tagging.
I'd be willing to exchange my mainboard for a better one; could you point me to a hardware compatibility list, please? I didn't know the P67 was that bad :)

I will make sure the network is not the cause.

It's not that P67 is bad. It just doesn't support ECC RAM, which is strongly recommended.

The hardware recommendations sticky has pretty much all you need to know. If you decide to buy a Supermicro X10 motherboard, I have a guide that explains the differences between the models (check the X10 FAQ). Links are in my signature.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Remove the lagg configuration and see if things get better.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Peaks of 115 MB/s that hold for at most 10 seconds.

Hey there "overflowing transaction group" - so nice to see you again. /sarcasm

This issue crops up when your disks can't keep up with your network ingest rate. Which is odd, because even though RAIDZ is slow on random I/O it's normally able to suck down sequential traffic without much trouble, and four data disks (I'm assuming you're RAIDZ1 here) should be able to sustain more than 115MB/s sequential write. That said, any kind of random I/O or seeks there will screw that right up.

Have you benchmarked the pool directly from an SSH command line to see what the sequential throughput is? Don't use /dev/zero as a source if you're using compression though.
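Something along these lines, for example (the path is a placeholder for your pool; /dev/random as the source because of the compression caveat):
Code:
dd if=/dev/random of=/mnt/yourpool/tmp.rnd bs=1M count=2k   # ~2 GiB sequential write
rm /mnt/yourpool/tmp.rnd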
 

Junicast

Patron
Joined
Mar 6, 2015
Messages
206
Yes, RAIDZ1.
I ran:
Code:
dd if=/dev/random of=/mnt/volume1/storage1 bs=1024 count=500000   # ~55 MB/s
dd if=/dev/zero of=/mnt/volume1/storage1 bs=1024 count=500000     # ~165 MB/s

/mnt/volume1/storage1 is not using any compression, but the pool is GELI-encrypted.

That's kind of weird to me. On a volume with lz4 enabled I also only get around 170 MB/s.

Do my settings suck, or my testing methods?
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
/dev/zero is a bad test because ZFS won't actually bother writing long sections of zeroes.

/dev/random is a decent test as long as your CPU isn't a bottleneck.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
With compression=off ZFS should write the zeroes as-is. Anything other than "off" will nuke it, though.

You're also only writing 1024*500000 bytes, or about 488 MB, so it fits entirely into a single transaction group and you won't see the stall. Try writing four times that much and see if it drops dead.
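For example, same parameters as your test, just four times the count (roughly 2 GB, which should span several transaction groups):
Code:
dd if=/dev/random of=/mnt/volume1/storage1 bs=1024 count=2000000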

55MB/s from /dev/random on an i5 with AES-NI seems slow to me, so encryption throughput may be an issue. Try piping /dev/random to /dev/null and see what you get.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
/dev/random is totally useless as a benchmark for anyone on the planet who uses FreeBSD. /dev/random is single-threaded, and even the fastest single-threaded CPUs can't break 150MB/sec.

If you want to do a throughput test, you have to turn off compression and use /dev/zero. PERIOD. That is, and has been for a very long time (16 months?), the *only* way. Why do you think I advocated against compression so much when 9.2.0 was in beta and lz4 became the default?
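In other words, something like this (the dataset name is just an example; "zfs inherit" puts the compression setting back afterwards):
Code:
zfs get compression tank/test        # note the current setting first
zfs set compression=off tank/test
dd if=/dev/zero of=/mnt/tank/test/tmp.zero bs=2048k count=50k
rm /mnt/tank/test/tmp.zero
zfs inherit compression tank/test    # restore the inherited setting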
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
/dev/random is totally useless as a benchmark for anyone on the planet who uses FreeBSD. /dev/random is single-threaded, and even the fastest single-threaded CPUs can't break 150MB/sec.

His i5 should still be able to barf out more than 55MB/s worth of /dev/random. I've got an X5650 here and it pulled about 80MB/s, and judging from dmesg | grep aes I don't even have AES-NI enabled.

@Peter Brille can you confirm that AES-NI is enabled in your BIOS?
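A quick way to check from the shell: if AES-NI is enabled, the CPU feature flags in the boot messages should list AESNI (and aesni0 should attach):
Code:
grep -i aesni /var/run/dmesg.boot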
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Why do you think that AES-NI has anything to do with /dev/random? It doesn't.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
"His i5 should still be able to barf out more than 55MB/s worth of /dev/random" you can easily test this with dd with if=/dev/random of=/dev/null ;)

And I agree with cyberjock, the standard R/W tests used here are:
Code:
dd if=/dev/zero of=/mnt/tank/tmp.zero bs=2048k count=50k #disable shares and compression for the test!!!
dd if=/mnt/tank/tmp.zero of=/dev/null bs=2048k count=50k #disable shares and compression for the test!!!
 

Junicast

Patron
Joined
Mar 6, 2015
Messages
206
dd from /dev/zero (the write test) gives 338 MB/s.
dd to /dev/null (the read test) gives me 364 MB/s.

Code:
% dmesg | grep aes
aesni0: <AES-CBC,AES-XTS> on motherboard
aesni0: <AES-CBC,AES-XTS> on motherboard
aesni0: <AES-CBC,AES-XTS> on motherboard

This makes more sense to me, so it must be somewhere in the network, I guess. I will figure it out; thank you @all.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
"His i5 should still be able to barf out more than 55MB/s worth of /dev/random" you can easily test this with dd with if=/dev/random of=/dev/null ;)

Really? I did all this testing back before lz4 was made the default. I even tried this at iX with VERY expensive CPUs and never saw 150MB/sec on any of them.

Here's my Xeon E3-1230v2 with the system idle...
Code:
~# dd if=/dev/random of=/dev/null bs=1M count=10k
10240+0 records in
10240+0 records out
10737418240 bytes transferred in 106.427522 secs (100889488 bytes/sec)


It can also vary heavily based on how much entropy can be harvested. If there's no entropy, the /dev/random device can literally end up stalled for long periods of time. I've seen Ivy Bridge Xeons average just 40MB/sec because there was nothing to harvest entropy from.

Then there's the nasty CPU usage that comes with /dev/random: typically one core is maxed out. If you're trying to benchmark ZFS on a system with two cores, you *need* the workload to not put unnecessary strain on the CPU. Even with 4c/8t I'll get artificially low values if I max out a core and then run ZFS benchmarks.
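You can watch it happen yourself: kick off the dd in the background and look at the per-core load (FreeBSD top; -P shows per-CPU stats):
Code:
dd if=/dev/random of=/dev/null bs=1M count=4k &
top -P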

I stand by my argument that /dev/random sucks, that it can't perform adequately to give good throughput, and that it will likely never be a good tool for benchmarking zpools. Especially since zpools are getting faster more quickly than /dev/random is. ;)

This discussion, here in the FreeNAS forums, was all done and over in late 2013 or early 2014. /dev/random is a poor choice for benchmarking. In fact, one of the old stickies from 2011 (one of the very first stickies in the forum) even said that /dev/random was terrible and to not use it. This has been standard knowledge for a very long time.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Note that I'm not the author of the sentence between the double quotes, I just quoted HoneyBadger (yeah I know, I should use the quote tags, it's laziness I guess) ;)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Note that I'm not the author of the sentence between the double quotes, I just quoted HoneyBadger (yeah I know, I should use the quote tags, it's laziness I guess) ;)

Lazy bum! Here.. have another beer ya bum! ;)
 