Low Disk Performance

Status
Not open for further replies.

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
I've finally decided to benchmark my current setup after getting everything in place, and I've been noticing relatively slow write speeds to my mirrored pool. I just ran a dd test on an uncompressed dataset and the speeds are pretty weak for 4 pairs connected to a PCI SAS controller. Granted, most of the drives are 5400 RPM (2x 1 TB WD Red 2.5", 3x 3 TB WD Green) and the other 3 are 7200 RPM 4 TB HGST Deskstar NAS drives, but that shouldn't affect performance that much. Transferring large files (a gigabyte to a few gigabytes) from one pool to another (inside a jail) is even worse, hovering around 20-30 MB/sec, and SMB transfers hover around 50 MB/sec. As can be seen below, read speeds are pretty good at 370 MB/sec, but write speeds are only about 15% of that.


Multimedia Write
Code:
[root@freenas] /mnt/Multimedia/Test# dd if=/dev/random of=test.img bs=2048k count=50k
51200+0 records in
51200+0 records out
107.3741824 GB transferred in 37.12 minutes (48.211989 MB/sec)
[root@freenas] /mnt/Multimedia/Test#


Multimedia Read
Code:
[root@freenas] /mnt/Multimedia/Test# dd if=test.img of=/dev/null bs=2048k
51200+0 records in
51200+0 records out
107.3741824 GB transferred in 4.84 minutes (370.076911 MB/sec)



For comparison's sake I did the same test on my striped pool of 2x 74 GB WD Raptors (10K RPM, attached to the Intel SATA controller on the motherboard) and on a single Toshiba drive that I have, and it doesn't look like the problem is with the pool or the SAS controller, but maybe with FreeNAS itself (I can't think of what else could be the issue). Shouldn't write speeds be vastly different, rather than nearly identical, between a pool of 4 mirrored pairs, a stripe of two 10K drives, and a single drive, all connected to different controllers?

Downloads Write

Code:
[root@freenas] /mnt/Downloads/Nzbget# dd if=/dev/random of=test.img bs=2048k count=25k
25600+0 records in
25600+0 records out
53.6870912 GB transferred in 18.53 minutes (48.296524 MB/sec)


Downloads Read

Code:
[root@freenas] /mnt/Downloads/Nzbget# dd if=test.img of=/dev/null bs=2048k
25600+0 records in
25600+0 records out
53.6870912 GB transferred in 6.97 minutes (128.438269 MB/sec)


Stuff Write

Code:
dd if=/dev/random of=test.img bs=2048k count=25k
25600+0 records in
25600+0 records out
53.6870912 GB transferred in 18.52 minutes (48.320039 MB/sec)


Stuff Read

Code:
[root@freenas] /mnt/Stuff/Downloads# dd if=test.img of=/dev/null bs=2048k
25600+0 records in
25600+0 records out
53.6870912 GB transferred in 13.42 minutes (66.663005 MB/sec)
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
The fact that you're seeing the same speeds across different drive configurations almost assuredly means you have a bottleneck somewhere in the system.

Have you already transferred data to the pool(s)? If you can wipe them out without much difficulty, we can rule out hardware by running some tests from a Linux LiveCD (I say wipe out the pools, because that way you know it isn't something related to BSD or ZFS that's creating problems). If you can't wipe them out, you can at least try from a fresh FreeNAS install.
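
As a concrete sketch of what I mean (run from the Linux live environment; sdX is a placeholder for each disk, and the write test is destructive, which is why the pools need to be wiped first):

Code:
# sequential read straight off the raw device, bypassing ZFS/FreeNAS entirely
dd if=/dev/sdX of=/dev/null bs=1M count=8192 iflag=direct
# sequential write to the raw device -- DESTRUCTIVE, only on a disk you've already wiped
dd if=/dev/zero of=/dev/sdX bs=1M count=8192 oflag=direct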

Usually, when we see these problems, it's one of two things: hardware problems, or people following outdated guides and "tweaking" their system unnecessarily.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
/cocks head

So hold on. Your "Multimedia" pool is composed of these disks?

(2x 1 TB WD Red 2.5", 3x 3 TB WD Green) and the other 3 are 7200 RPM 4 TB HGST Deskstar NAS drives

Assuming they're in mirror vdevs per your signature, I'm betting that ZFS is just having a hell of a time splitting I/O across those highly disparate vdevs. (And I really hope you wdidle'd the Greens.)

I'd also dump the SMART data (if you post it, please put it in an attachment) and check to see if any of them are reporting SMART errors.
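
For example, something like this per disk (da0 is just a placeholder; repeat for each device and attach the output rather than pasting it inline):

Code:
# replace da0 with each of your devices (da0..daN or ada0..adaN)
smartctl -a /dev/da0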
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
The fact that you're seeing the same speeds across different drive configurations almost assuredly means you have a bottleneck somewhere in the system.

Have you already transferred data to the pool(s)? If you can wipe them out without much difficulty, we can rule out hardware by running some tests from a Linux LiveCD (I say wipe out the pools, because that way you know it isn't something related to BSD or ZFS that's creating problems). If you can't wipe them out, you can at least try from a fresh FreeNAS install.

Usually, when we see these problems, it's one of two things: hardware problems, or people following outdated guides and "tweaking" their system unnecessarily.

I can't touch the Multimedia pool since it has about 6.5 TB of data on it. I already wiped it once when it was RAIDZ, and I had to discard about 3 TB because I had no place to put it; I'm only now finally re-acquiring stuff (that was when I first noticed the slowness). Downloads can easily be wiped since it's just temporary storage for NZBget to download, join, and extract files (the finished file gets copied to Multimedia) and it has pretty much nothing on it. Storage could also easily be wiped since it only has 3 jails on it, which could easily be relocated to Multimedia, plus a bunch of seeding torrents.

I can load up an Arch ISO via IPMI; what kind of tests would you suggest? At first I was thinking it might be the backplane, but then I remembered that the 2 WD Reds are directly connected to the SAS controller, so that can't be the issue. Also, the read speeds are fine, so IMO it would be odd for it to be faulty in only one direction and not the other. I don't think I have reinstalled completely since I first got my NAS in February; I'm on the 9.3 nightly train and I update a few times a month. Maybe I should switch back to the stable train to see if that resolves anything (going to have to wait a few days, though).


/cocks head

So hold on. Your "Multimedia" pool is composed of these disks?



Assuming they're in mirror vdevs per your signature, I'm betting that ZFS is just having a hell of a time splitting I/O across those highly disparate vdevs. (And I really hope you wdidle'd the Greens.)

I'd also dump the SMART data (if you post it, please put it in an attachment) and check to see if any of them are reporting SMART errors.

Yeah, that's how they're set up. I plan on getting rid of the pair that's one Green and one HGST since I know that can't be good (it also has a bad sector), but money has been kinda tight; I should be able to replace it by the end of December. I plan on replacing them all with 4 TB HGST drives, but I don't have the grand or so required to do so haha. I did run wdidle on them a while ago, back when I was using these drives in software RAID in my Linux desktop before I bought the NAS, so that's not the issue. Your thought about splitting I/O would make sense if it were just the Multimedia pool, but the odd thing is that this is the max for all the drives, including the 2 Raptors and the single Toshiba drive, so that kinda rules it out in my opinion. The transfer rate is limited by something, but the question is what; to me it seems like something software related rather than hardware related.


Here are the SMART results

Stuff (Single Drive): Toshiba

Downloads (2xRaptors): Drive 1 Drive 2

Multimedia (Mirrored Pool): Red 1 Red 2

Green 1 Green 2 Green 3 (Bad Sector)

HGST 1 HGST 2 HGST 3
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
I'd just do a dd test. Also, have you tried using different block sizes? I'm wondering if that's part of the problem.

My thoughts on testing with Linux are to get away from three possible issues: FreeNAS config, BSD/BSD drivers, and ZFS. If you see similar slow speeds, then your problem is most likely your hardware. If you see significantly faster speeds, your problem is likely a configuration issue somewhere along the way.
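
For example, something along these lines (same uncompressed dataset as your earlier tests; the filenames are just placeholders and the counts are picked so each run writes ~8 GiB):

Code:
dd if=/dev/random of=test_128k.img bs=128k count=64k
dd if=/dev/random of=test_1m.img bs=1m count=8k
dd if=/dev/random of=test_4m.img bs=4m count=2k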
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
Yea that's what I figured. I'm at work now and port 9000 is blocked so I can't access IPMI Java interface so it's going to have to wait a few hours, although I can continue to test with different block sizes to see if anything changes.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Yeah, that's how they're set up. I plan on getting rid of the pair that's one Green and one HGST since I know that can't be good (it also has a bad sector), but money has been kinda tight; I should be able to replace it by the end of December. I plan on replacing them all with 4 TB HGST drives, but I don't have the grand or so required to do so haha. I did run wdidle on them a while ago, back when I was using these drives in software RAID in my Linux desktop before I bought the NAS, so that's not the issue. Your thought about splitting I/O would make sense if it were just the Multimedia pool, but the odd thing is that this is the max for all the drives, including the 2 Raptors and the single Toshiba drive, so that kinda rules it out in my opinion. The transfer rate is limited by something, but the question is what; to me it seems like something software related rather than hardware related.

Just noticed that you're pulling from /dev/random - have you tried disabling compression and pulling from /dev/zero to make sure that "generating the random data" isn't your bottleneck here?
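
Something along these lines, roughly (assuming the dataset is Multimedia/Test based on your paths; set compression back to whatever it was when you're done):

Code:
zfs get compression Multimedia/Test
zfs set compression=off Multimedia/Test
dd if=/dev/zero of=/mnt/Multimedia/Test/zero.img bs=2048k count=25k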

That wouldn't explain why inter-jail copying is slow; but that might be Ye Olde Oversized Transaction Group coming to bite you - in fact, with it being a local async copy, that might be what's happening here. 32GB of RAM / 8 = a 4GB txg to flush every 5 seconds, and none of those pools are going to sustain 800MB/s of ingest.
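
If you want to see what the throttle is actually set to on your build, the sysctl names changed between the old and new ZFS write throttle, so check for both (read-only; nothing is being tuned here):

Code:
sysctl vfs.zfs.txg.timeout
sysctl -a | grep -E 'vfs.zfs.(write_limit|dirty_data)'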

Here are the SMART results

(trimmed)

I don't know what's going on there but all of those links are going to .vbs and .rb files and I'm not touching that with a barge pole.
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
Just noticed that you're pulling from /dev/random - have you tried disabling compression and pulling from /dev/zero to make sure that "generating the random data" isn't your bottleneck here?

That wouldn't explain why inter-jail copying is slow; but that might be Ye Olde Oversized Transaction Group coming to bite you - in fact, with it being a local async copy, that might be what's happening here. 32GB of RAM / 8 = a 4GB txg to flush every 5 seconds, and none of those pools are going to sustain 800MB/s of ingest.

Haven't tried that yet. I was trying to use /dev/zero before but it was giving me an error (which seemed to be related to a flag in the end), so I went with /dev/random; it's entirely possible that that's the case. I'll try it again with /dev/zero. I changed the block size as Nick suggested above and that didn't change anything.


I don't know what's going on there but all of those links are going to .vbs and .rb files and I'm not touching that with a barge pole.

Understandable, but if you're on Linux, VBS files wouldn't do anything hahaha. It must've thought the files were VBS and Ruby files since it did syntax highlighting; hastebin is pretty cool. They're safe: I ran wget to grab the file and then cat'd it, and here's the output to prove it's nothing malicious :)

Code:
[root@freenas] /mnt/Stuff/Downloads# wget http://hastebin.com/vitamisete.rb
converted 'http://hastebin.com/vitamisete.rb' (US-ASCII) -> 'http://hastebin.com/vitamisete.rb' (UTF-8)
--2015-11-24 15:38:45--  http://hastebin.com/vitamisete.rb
Resolving hastebin.com (hastebin.com)... 107.20.190.74
Connecting to hastebin.com (hastebin.com)|107.20.190.74|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1956 (1.9K) [text/html]
Saving to: 'vitamisete.rb'

vitamisete.rb  100%[========================================================================>]  1.91K  --.-KB/s  in 0s   

2015-11-24 15:38:46 (89.6 MB/s) - 'vitamisete.rb' saved [1956/1956]

[root@freenas] /mnt/Stuff/Downloads# cat vitamisete.rb
<html>

   <head>

     <title>hastebin</title>

     <link rel="stylesheet" type="text/css" href="solarized_dark.css"/>
     <link rel="stylesheet" type="text/css" href="application.css"/>

     <script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
     <script type="text/javascript" src="highlight.min.js"></script>
     <script type="text/javascript" src="application.min.js"></script>

     <meta name="robots" content="noindex,nofollow"/>

     <script type="text/javascript">
       var app = null;
       // Handle pops
       var handlePop = function(evt) {
         var path = evt.target.location.pathname;
         if (path === '/') { app.newDocument(true); }
         else { app.loadDocument(path.substring(1, path.length)); }
       };
       // Set up the pop state to handle loads, skipping the first load
       // to make chrome behave like others:
       // http://code.google.com/p/chromium/issues/detail?id=63040
       setTimeout(function() {
         window.onpopstate = function(evt) {
           try { handlePop(evt); } catch(err) { /* not loaded yet */ }
         };
       }, 1000);
       // Construct app and load initial path
       $(function() {
         app = new haste('hastebin', { twitter: true });
         handlePop({ target: window });
       });
     </script>

   </head>

   <body>
     <ul id="messages"></ul>

     <div id="key">
      <div id="pointer" style="display:none;"></div>
       <div id="box1">
         <a href="/about.md" class="logo"></a>
       </div>
       <div id="box2">
         <div class="save function"></div>
         <div class="new function"></div>
         <div class="duplicate function"></div>
         <div class="raw function"></div>
         <div class="twitter function"></div>
       </div>
       <div id="box3" style="display:none;">
         <div class="label"></div>
         <div class="shortcut"></div>
       </div>
     </div>

     <div id="linenos"></div>
     <pre id="box" style="display:none;" tabindex="0"><code></code></pre>
     <textarea spellcheck="false" style="display:none;"></textarea>

   </body>

</html>
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
New results using /dev/zero: at least on the Multimedia pool, the bottleneck was indeed /dev/random!

Code:
[root@freenas] /mnt/Stuff/Downloads# dd if=/dev/zero of=test.img bs=2048k count=25k
25600+0 records in
25600+0 records out
53687091200 bytes transferred in 1142.688639 secs (46.983132 MB/sec)


Code:
[root@freenas] /mnt/Multimedia/Test# dd if=/dev/zero of=test.img bs=2048k count=25k
25600+0 records in
25600+0 records out
53687091200 bytes transferred in 335.333502 secs (160.100589 MB/sec)


Rsync shows that copying the test file from /mnt/Multimedia/Test to /mnt/Downloads/Nzbget (both are uncompressed datasets) transfers at anywhere between 25 MB/sec and 220 MB/sec; I haven't had a chance yet to do it inside a jail.

edit: From inside my UsenetApps jail (which lives on my Storage pool, a single disk) I transferred test.img from the Downloads pool to the Multimedia pool (both uncompressed datasets) and everything looks fine so far. Maybe compression is the problem? Going to test that next.

Code:
 [bran@UsenetApps /usenet]$ rsync -a --progress nzbget/downloads/test.img /mnt/test/
sending incremental file list
test.img
53,687,091,200 100%  111.30MB/s    0:07:40 (xfr#1, to-chk=0/1)



Looks fine to me. I don't know why I get such slow speeds sometimes.
Code:
 [bran@UsenetApps /usenet]$ rsync -a --progress nzbget/downloads/test.img /mnt/tv/
sending incremental file list
test.img
 53,687,091,200 100%  111.32MB/s    0:07:39 (xfr#1, to-chk=0/1)
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
Just found a prime example of the crappy speeds. I'm trying to copy 400 GB or so from one dataset to another on the same disk (the Storage pool) and it's maxing out at a whopping 14 MB/sec!

Code:
[root@freenas] /mnt/Stuff/Jails/UsenetApps/usr/local/etc/transmission/home/Downloads# rsync -a --progress --remove-source-files * /mnt/Stuff/Downloads
sending incremental file list
  1,465,879,631 100%   14.43MB/s    0:01:36 (xfr#1, to-chk=958/960)
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
You're simply asking way too much from that one spindle. Newer rsync limits block size to 128KB, so you're asking that one disk to go "read this 128KB, write that 128KB, read this, write that" over and over.

If you're going from a single disk back to the same single disk, no amount of wizardry will stop the fact that performance will be poor there.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Thanks, I knew it would be slow, but didn't think it would be that slow.

I'm impressed that it's pulling ~14MB/s at all. In terms of IOPS, you're doing a read and write per 128KB rsync'd, so that's ~230 IOPS, about twice what you'd expect from a single spindle. I imagine that batching up the writes is helping it out and it's actually being a very bursty load on the disk itself (read, write lands in RAM, read, write lands in RAM, read, OhCrapGottaWrite'EmAll, read has to wait ... okay, read now)
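
Roughly where the ~230 comes from:

Code:
# 14.43 MB/s at 128 KB per rsync block:
#   14.43 MB/s / 0.128 MB   ≈ 113 blocks per second
#   113 reads + 113 writes  ≈ 225-230 IOPS hitting a single spindle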
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
Thanks for the info. I actually saw it max out around 20-25 MB/sec a few times. Would it be better to store them in a dataset in my Multimedia pool?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Thanks for the info. I actually saw it max out around 20-25 MB/sec a few times. Would it be better to store them in a dataset in my Multimedia pool?

Definitely. The workload on the single spindle will shift much closer to "sequential read", and your "Multimedia" pool can absorb txg writes much better, as shown by your dd testing (~160MB/s).
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
Definitely. The workload on the single spindle will shift much closer to "sequential read", and your "Multimedia" pool can absorb txg writes much better, as shown by your dd testing (~160MB/s).
Great, thanks. I was always under the impression that doubling the writes (since it's a mirrored pool) would be worse than a single write, but I didn't take into account multiple disks versus one disk.

Sent from my Nexus 6 using Tapatalk
 

brando56894

Wizard
Joined
Feb 15, 2014
Messages
1,537
Interesting results now that I've cleaned up my pool a bit. I got 3 more 4 TB HGST HDDs (2 were DOA; waiting to get them back) and swapped out the 3 TB Green for the working 4 TB, so now it's an equal mirror, along with adding an Intel S3700 SSD as a ZIL to the Multimedia pool, and now the pool seems to be faster than my single Samsung 850 EVO (which holds my jails)! This may be because I have too many drives on one controller (they seem to be bottlenecked on the ASRock C2750D4I), so further testing may need to be done. These write tests were done on uncompressed datasets.

Code:
[root@freenas] /mnt/Multimedia/Downloads# dd if=/dev/zero of=test.img bs=2048k count=25k
50 GB transferred in 176.194304 secs (304.704  MB/sec)

[root@freenas] /mnt/Jails/test# dd if=/dev/zero of=test.img bs=2048k count=25k
50 GB transferred in 363.439161 secs (147.72 MB/sec)


Sent from my Pixel C using Tapatalk
 