Intermittent dips in write speed over SMB


looney

Dabbler
Joined
Jan 30, 2017
Messages
16
Hi,

I have just created a new storage server with the following basic specifications:
CPU: 2x E5-2650v1
RAM: 192GB (24x8GB ECC DDR3 1600)
NIC: Intel X520DA-2
HBA: LSI SAS 9200-8e
JBOD: SC847 E16-RJBOD1
HDD: currently 27x 4TB ST4000DM000
FreeNAS: 9.10.2-U1

zpool (will be adding another raidz2 of 9 drives soon):
Code:
~ zpool status -v
  pool: FloppyD
state: ONLINE
  scan: scrub repaired 0 in 18h10m with 0 errors on Sun Jan 29 15:15:16 2017
config:

		NAME											STATE	 READ WRITE CKSUM
		FloppyD										 ONLINE	   0	 0	 0
		  raidz2-0									  ONLINE	   0	 0	 0
			gptid/93108990-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/97d47490-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9a98d29c-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/94808bdc-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9c2e2cf7-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/936061b7-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9f4a75ab-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a6c61557-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a1d94dd6-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
		  raidz2-1									  ONLINE	   0	 0	 0
			gptid/91e97661-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a39d4540-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/908760af-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a11c8a81-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9ee2ec62-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a618af0c-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a3b56e8f-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a5bca058-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a72cba93-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
		  raidz2-2									  ONLINE	   0	 0	 0
			gptid/a517fa6f-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a63ff419-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/95c7c648-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9e76827b-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9fa4abb1-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a2031236-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a4c14ba4-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a7c82eac-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a58aa746-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: none requested
config:

		NAME		STATE	 READ WRITE CKSUM
		freenas-boot  ONLINE	   0	 0	 0
		  da36p2	ONLINE	   0	 0	 0


So I have a dataset on that pool using lz4 compression, and I'm getting very promising results from dd:

Code:
~ dd if=/dev/zero of=testfile bs=1024k count=1000000
1000000+0 records in
1000000+0 records out
1048576000000 bytes transferred in 357.351477 secs (2934298772 bytes/sec)


If my math is correct, that means it just wrote roughly a terabyte of data at around 2.93 GB/s, or about 23.5 Gbps.
I can live with that :D
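
For anyone checking my arithmetic, here is a quick sanity check against the dd numbers above (plain bc, values copied from the dd output):

Code:
# Convert the dd result (bytes and seconds from the output above) into GB/s and Gbps
echo "1048576000000 / 357.351477 / 1000000000" | bc -l      # ~2.93 GB/s
echo "1048576000000 * 8 / 357.351477 / 1000000000" | bc -l  # ~23.5 Gbps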

But of course I want real-life tests, so I fired up a machine with RAID 1 SSDs that should be able to sustain 600+ MB/s.
The machine is hooked up to the FreeNAS box via a 10G NIC, and iperf between the two shows about 8 Gbps.
After the dd and iperf results I was comfortable enough to do the first real test, so I started writing a 100GB test file to the FreeNAS over SMB.
It started at around 300 MB/s, but after writing about 10GB it dropped to 20 MB/s and soon after fell to 0, stayed there for about 30 seconds, then went back up to 300, down to 0, up to 300, down to 0, and so on.
This does not happen when writing from the SSD test system to an SSD in my workstation over the same network.

Does anyone have any idea what could be causing this?
It looks like something can't keep up, but whatever it is doesn't reveal itself during a 1TB dd or a 10-minute iperf run. I also tried other sources that are known to output a steady 200+ MB/s, and they all dipped straight down after a few seconds.

I am new to ZFS but I have done some research. For starters I enabled autotune, which helped in the iperf department but not with the irregular transfer speeds.
Would I be correct in saying that SMB writes are async, and that a ZIL/SLOG drive therefore would not solve this issue?
There is probably something obvious I need to be reminded of, but I can't seem to find the answer on the forum.
I have looked around, but most posts with a comparable issue are about 20+TB arrays running on 16GB of RAM or less, and as far as I know I'm well within the recommended RAM requirements. I know more is better, but 2GB of RAM per TB should still be fine, should it not?
Besides, a 1TB dd benchmark should blow past my RAM at some point anyway.
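
(For what it's worth, this is how I'd check how the dataset currently handles sync requests; FloppyD is just my pool used as the example target.)

Code:
# Show whether writes honour sync requests ("standard"), force them ("always"),
# or ignore them ("disabled")
zfs get sync FloppyD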
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I have no idea what is causing this; could it be the JBOD enclosure?

But here is a test for you: create a dataset with compression turned off and try the transfer again. Also run the dd test again with compression turned off.
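
A minimal sketch of what that could look like from the shell (the dataset name is just an example, and the GUI works equally well):

Code:
# Create a test dataset with compression disabled (name is arbitrary)
zfs create -o compression=off FloppyD/nocomp
# Re-run the dd test against it (FreeNAS mounts pools under /mnt)
dd if=/dev/zero of=/mnt/FloppyD/nocomp/testfile bs=1024k count=300000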
 

looney

Dabbler
Joined
Jan 30, 2017
Messages
16
I have no idea what is causing this; could it be the JBOD enclosure?

But here is a test for you: create a dataset with compression turned off and try the transfer again. Also run the dd test again with compression turned off.
If it were the JBOD, would it not also show up during the dd tests?

Here are some tests I managed to run during my lunch break, with and without compression.

With compression:
Code:
~ dd if=/dev/zero of=testfile1 bs=1024k count=300000
300000+0 records in
300000+0 records out
314572800000 bytes transferred in 106.474770 secs (2954435119 bytes/sec)

2954 MB/s


Without compression:
Code:
~ dd if=/dev/zero of=testfile1 bs=1024k count=300000
300000+0 records in
300000+0 records out
314572800000 bytes transferred in 271.121671 secs (1160264316 bytes/sec)

1160 MB/s


I guess zeros are easy to compress :p
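
(A follow-up idea, not something I've run yet: writing data that can't be compressed avoids the /dev/zero skew entirely, at the cost of being limited by how fast the box can generate random bytes.)

Code:
# Incompressible source instead of /dev/zero; on FreeBSD /dev/random does not
# block after boot, but it is CPU-bound, so treat the result as a rough lower bound
dd if=/dev/random of=testfile_rand bs=1024k count=10000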


Still, during these dd benchmarks there was no grinding to a halt; watching the file grow, the size increased at an even rate throughout the test. I did notice an increase in CPU utilization during the uncompressed benchmark.


With SMB transfers I am still getting the same massive drops that I had with compression enabled.
PS: during SMB testing I test with both a large 100GB zip file containing actual data and a 100GB zero-filled file.
Both give the same results.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Since dd is just an internal test of raw disk throughput, it really has no bearing on the SMB issue; I only mentioned the compression because you need an uncompressed dataset to obtain true results.

I was actually hoping SMB would improve a little bit without compression, but that was grasping at straws. I know I've read about this type of issue before here in the forums, but I don't recall what the cause was. What does the Reports -> Memory tab look like during the test? (A screen capture would be fine.) You could also run top in an SSH session to see what is happening (another screenshot or two). The perplexing thing is that you say you can transfer data to another workstation without issue using the SMB/CIFS protocol, so it sounds like the issue is on the FreeNAS side of things.

Also, do you have Tunables (aka autotune) turned on? If not, turn it on. If it's already on, another screenshot of the tunables that were generated by autotune might be helpful.
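
(If screenshots are awkward, something along these lines from an SSH session captures roughly the same information; the arcstats sysctl name is as found on FreeNAS 9.10 / FreeBSD 10.)

Code:
# Live per-thread CPU view while the transfer is running
top -SH
# Current ARC size in bytes
sysctl kstat.zfs.misc.arcstats.size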
 

looney

Dabbler
Joined
Jan 30, 2017
Messages
16
True; the only reason I mentioned the dd was in relation to the JBOD remark, as the dd tests suggest the JBOD itself should be fine.

Here are some screenshots of the CPU and memory reporting during testing; all tests were to an uncompressed dataset.
The first test, started around 17:52, was a 500GB dd. The second test (started at 18:02, took about 15 minutes) was a 100GB rar copied over SMB, grinding to a halt all the time. The third test, started at 18:20, was a 100GB zero-filled file over SMB, which copied just fine at 335-350 MB/s without grinding to a halt a single time.
[Screenshots: FreeNAS CPU and memory reporting graphs during the tests]

Here you can see the client-side NIC throughput during an SMB copy of the test rar file, so those are roughly 3.8-gigabyte chunks.
[Screenshot: client-side NIC throughput during the SMB copy]



Here are the current tunables:
[Screenshot: current autotune tunables]


So, some weird stuff going on here: when copying files to the pool at a slower speed, no more than 90 MB/s, the transfers don't halt, but if I start another transfer from a second client at the same slow speed, both will halt and resume just like the rar test.
This is one of the reasons I suspect FreeNAS is the issue, as it is not client-specific.

PS: I added 9 more drives yesterday, so these tests were made on the following pool:
Code:
		NAME											STATE	 READ WRITE CKSUM
		FloppyD										 ONLINE	   0	 0	 0
		  raidz2-0									  ONLINE	   0	 0	 0
			gptid/93108990-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/97d47490-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9a98d29c-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/94808bdc-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9c2e2cf7-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/936061b7-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9f4a75ab-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a6c61557-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a1d94dd6-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
		  raidz2-1									  ONLINE	   0	 0	 0
			gptid/91e97661-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a39d4540-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/908760af-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a11c8a81-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9ee2ec62-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a618af0c-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a3b56e8f-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a5bca058-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a72cba93-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
		  raidz2-2									  ONLINE	   0	 0	 0
			gptid/a517fa6f-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a63ff419-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/95c7c648-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9e76827b-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/9fa4abb1-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a2031236-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a4c14ba4-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a7c82eac-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
			gptid/a58aa746-e48a-11e6-b1e3-0025907e82ce  ONLINE	   0	 0	 0
		  raidz2-3									  ONLINE	   0	 0	 0
			gptid/eb182455-e802-11e6-839a-0025907e82ce  ONLINE	   0	 0	 0
			gptid/ec8d32b0-e802-11e6-839a-0025907e82ce  ONLINE	   0	 0	 0
			gptid/edc1526f-e802-11e6-839a-0025907e82ce  ONLINE	   0	 0	 0
			gptid/eef20607-e802-11e6-839a-0025907e82ce  ONLINE	   0	 0	 0
			gptid/f002450f-e802-11e6-839a-0025907e82ce  ONLINE	   0	 0	 0
			gptid/f123b618-e802-11e6-839a-0025907e82ce  ONLINE	   0	 0	 0
			gptid/f270ac82-e802-11e6-839a-0025907e82ce  ONLINE	   0	 0	 0
			gptid/f3a227a9-e802-11e6-839a-0025907e82ce  ONLINE	   0	 0	 0
			gptid/f4ba3d44-e802-11e6-839a-0025907e82ce  ONLINE	   0	 0	 0

 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Have you connected the FreeNAS box and your data computer directly together and tried the test, basically taking all of the hardware in the middle out of the equation?

Also, have you read any threads about tuning for 10Gb NICs? Here is one to look at, and I'm certain there are a few more. Right now I think the problem may be fixable with a few tunable tweaks.
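
(For reference, these are the sorts of sysctls those 10GbE threads usually discuss; the values below are only illustrative examples, not recommendations for this particular system.)

Code:
# Larger socket buffers and TCP windows are the usual starting point (example values only)
sysctl kern.ipc.maxsockbuf=16777216
sysctl net.inet.tcp.sendbuf_max=16777216
sysctl net.inet.tcp.recvbuf_max=16777216
sysctl net.inet.tcp.sendspace=262144
sysctl net.inet.tcp.recvspace=262144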
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Have you configured LACP or a lagg? I don't think I have any suggestions, but that information might be helpful. I vote for simplifying your network setup: no switch and a single 10-gig connection between client and server.
 

looney

Dabbler
Joined
Jan 30, 2017
Messages
16
Sorry for the late response,

I have also tested with direct connections, taking my Arista switch out of the equation, but this did not solve the problem.
I have no lagg or LACP configured, and I have gone over the tunables once more; they seem to match the tunables mentioned in most topics.

I have also done a remote test with a Linux VM using dd on the client side, writing to an uncompressed dataset (once over NFS and once over SMB), and that seemed to have no issues.
I'm still in the process of data migration and I'm using the current test client to do that (not while running tests).
When the data migration is done I can install Linux bare-metal on that server to test throughput from its SSD array to the FreeNAS SMB and NFS shares. At the moment my other servers only run small Linux VMs that won't be able to push enough data to make a good test.
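
(The client-side test was essentially just a dd against the mounted share; the mount point below is a placeholder for wherever the share is mounted on the client.)

Code:
# Write test from the Linux client to the mounted NFS/SMB share
dd if=/dev/zero of=/mnt/freenas/testfile bs=1M count=100000 status=progress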

But it does seem like Windows is the problem here, surprise surprise. If that's the case, I can live with it, as I should only have one Windows client in my environment anyway.

Of course, suggestions to fix it are still welcome, but once I have confirmed it's a Windows thing, I personally no longer have an issue.

Some info on the client if that's helpful:
Supermicro 6027TR-DTRF
128GB 1866MHz DDR3 ECC
2x E5-2660V2
2x 850 Pro 512GB (RAID 1)
Dual-port Intel 82599 NIC
Windows Server 2012 R2

The non-direct tests were done over an Arista 7124SX.
 

looney

Dabbler
Joined
Jan 30, 2017
Messages
16
OK, so as it turns out, Windows wasn't the problem after all.
Testing from bare-metal Ubuntu on the test client I am still getting massive dips/halts in write speed: it goes up to 1 GB/s (8000 Mbps) and then back down to 10 KB/s or so.
It does this with NFS, SMB and iSCSI alike; I haven't tested other protocols.

I honestly have no idea at this point and it is driving me nuts. So far I haven't needed stable write speeds, but in a week I will need to write 500-1000 Mbps non-stop for 72 hours. I can still quickly build a hardware RAID system, but I was hoping to use this one.

Latest zpool config; I have added a mirrored pair of 3610s as a SLOG:
Code:
  pool: FloppyD
state: ONLINE
  scan: scrub repaired 0 in 18h10m with 0 errors on Sun Jan 29 15:15:16 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        FloppyD                                         ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/93108990-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/97d47490-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/9a98d29c-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/94808bdc-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/9c2e2cf7-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/936061b7-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/9f4a75ab-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a6c61557-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a1d94dd6-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/91e97661-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a39d4540-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/908760af-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a11c8a81-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/9ee2ec62-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a618af0c-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a3b56e8f-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a5bca058-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a72cba93-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
          raidz2-2                                      ONLINE       0     0     0
            gptid/a517fa6f-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a63ff419-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/95c7c648-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/9e76827b-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/9fa4abb1-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a2031236-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a4c14ba4-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a7c82eac-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
            gptid/a58aa746-e48a-11e6-b1e3-0025907e82ce  ONLINE       0     0     0
          raidz2-3                                      ONLINE       0     0     0
            gptid/eb182455-e802-11e6-839a-0025907e82ce  ONLINE       0     0     0
            gptid/ec8d32b0-e802-11e6-839a-0025907e82ce  ONLINE       0     0     0
            gptid/edc1526f-e802-11e6-839a-0025907e82ce  ONLINE       0     0     0
            gptid/eef20607-e802-11e6-839a-0025907e82ce  ONLINE       0     0     0
            gptid/f002450f-e802-11e6-839a-0025907e82ce  ONLINE       0     0     0
            gptid/f123b618-e802-11e6-839a-0025907e82ce  ONLINE       0     0     0
            gptid/f270ac82-e802-11e6-839a-0025907e82ce  ONLINE       0     0     0
            gptid/f3a227a9-e802-11e6-839a-0025907e82ce  ONLINE       0     0     0
            gptid/f4ba3d44-e802-11e6-839a-0025907e82ce  ONLINE       0     0     0
          raidz2-5                                      ONLINE       0     0     0
            gptid/6aabafbe-f056-11e6-a043-0025907e82ce  ONLINE       0     0     0
            gptid/6b784ec3-f056-11e6-a043-0025907e82ce  ONLINE       0     0     0
            gptid/6c4eda10-f056-11e6-a043-0025907e82ce  ONLINE       0     0     0
            gptid/6d1e48c6-f056-11e6-a043-0025907e82ce  ONLINE       0     0     0
            gptid/6de63bf2-f056-11e6-a043-0025907e82ce  ONLINE       0     0     0
            gptid/6eb83a2d-f056-11e6-a043-0025907e82ce  ONLINE       0     0     0
            gptid/6f85f888-f056-11e6-a043-0025907e82ce  ONLINE       0     0     0
            gptid/705afb37-f056-11e6-a043-0025907e82ce  ONLINE       0     0     0
            gptid/71f29dae-f056-11e6-a043-0025907e82ce  ONLINE       0     0     0
        logs
          mirror-4                                      ONLINE       0     0     0
            gptid/ac395a9a-f04f-11e6-a043-0025907e82ce  ONLINE       0     0     0
            gptid/acd503c6-f04f-11e6-a043-0025907e82ce  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da37p2    ONLINE       0     0     0

errors: No known data errors

 

bigphil

Patron
Joined
Jan 30, 2014
Messages
486
Paste the output of the following command: "sysctl -a | grep dev.ix.0.%desc"

I have a feeling this is down to a known issue in the ixgbe driver that ships with FreeNAS 9.10, if the output from the above command shows 3.1.13-k. Information is in this thread. Apparently this is fixed in the 3.1.14 driver, but FreeNAS 9.10 doesn't currently have it. The latest nightly might, as they've moved to FreeBSD 11. So you might try rebuilding with the latest FreeNAS 9.10 nightly or 10, just to prove whether that is it or not. Also, have you tried disabling all of your tunables, rebooting, and seeing if it's better?
 

looney

Dabbler
Joined
Jan 30, 2017
Messages
16
Paste the output of the following command: "sysctl -a | grep dev.ix.0.%desc"

I have a feeling this is down to a known issue in the ixgbe driver that ships with FreeNAS 9.10, if the output from the above command shows 3.1.13-k. Information is in this thread. Apparently this is fixed in the 3.1.14 driver, but FreeNAS 9.10 doesn't currently have it. The latest nightly might, as they've moved to FreeBSD 11. So you might try rebuilding with the latest FreeNAS 9.10 nightly or 10, just to prove whether that is it or not. Also, have you tried disabling all of your tunables, rebooting, and seeing if it's better?
The NIC is indeed running 3.1.13-k. I will look at the topic you linked, though I should point out that this problem also happens on the 1Gb NICs.
I will remove all tunables now and reboot the system.
-----
Done: same issue, dropping to 0 with no tunables, on an NFS write over a 1Gbit NIC.


I have been monitoring iostat and I noticed a spike in write bandwidth every time right before the drop to 0.
Thought this might be helpful.

Code:
zpool iostat FloppyD 1
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----

FloppyD     68.0T  94.5T      0    411      0  2.42M
FloppyD     68.0T  94.5T      0    605      0  13.8M
FloppyD     68.0T  94.5T      0    248      0  1.46M
FloppyD     68.0T  94.5T      0    391      0  2.32M
FloppyD     68.0T  94.5T      0    436      0  2.64M
FloppyD     68.0T  94.5T      0    445      0  2.65M
FloppyD     68.0T  94.5T      0    411      0  2.41M
FloppyD     68.0T  94.5T      0    419      0  2.49M
FloppyD     68.0T  94.5T      0    238      0  1.47M
FloppyD     68.0T  94.5T      0    428      0  2.52M
FloppyD     68.0T  94.5T      0    613      0  14.0M
FloppyD     68.0T  94.5T      0    455      0  2.68M
FloppyD     68.0T  94.5T      0    535      0  42.8M
FloppyD     68.0T  94.5T      0      0      0      0
FloppyD     68.0T  94.5T      0      0      0      0
FloppyD     68.0T  94.5T      0    605      0  13.0M
FloppyD     68.0T  94.5T      0    419      0  2.51M
FloppyD     68.0T  94.5T      0    430      0  2.63M
FloppyD     68.0T  94.5T      0    399      0  2.35M
FloppyD     68.0T  94.5T      0    446      0  2.64M
FloppyD     68.0T  94.5T      0    217      0  1.31M
FloppyD     68.0T  94.5T      0    608      0  12.7M
FloppyD     68.0T  94.5T      0    401      0  2.28M
FloppyD     68.0T  94.5T      0    442      0  2.60M
FloppyD     68.0T  94.5T      0    225      0  1.32M
FloppyD     68.0T  94.5T      0    419      0  2.45M
FloppyD     68.0T  94.5T      0    813      0  14.1M
FloppyD     68.0T  94.5T      0    425      0  2.40M
FloppyD     68.0T  94.5T      0    456      0  2.74M
FloppyD     68.0T  94.5T      0    446      0  2.63M
FloppyD     68.0T  94.5T      0    464      0  2.73M
FloppyD     68.0T  94.5T      0    225      0  1.30M
FloppyD     68.0T  94.5T      0    444      0  28.9M
FloppyD     68.0T  94.5T      0      0      0      0
FloppyD     68.0T  94.5T      0    219      0  1.26M
FloppyD     68.0T  94.5T      0    456      0  2.71M
FloppyD     68.0T  94.5T      0    439      0  2.56M
FloppyD     68.0T  94.5T      0    465      0  2.75M
FloppyD     68.0T  94.5T      0    236      0  1.39M
FloppyD     68.0T  94.5T      0    437      0  2.58M
FloppyD     68.0T  94.5T      0    415      0  2.45M
FloppyD     68.0T  94.5T      0    637      0  14.4M
FloppyD     68.0T  94.5T      0    427      0  2.51M
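
(Next time I can also watch the per-vdev and per-disk view, to see whether a single vdev or drive is stalling the whole pool; gstat is just the stock FreeBSD tool for the per-disk side.)

Code:
# Per-vdev breakdown of the same numbers, once per second
zpool iostat -v FloppyD 1
# Per-disk busy% and latency, physical providers only
gstat -p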
 

Sakuru

Guru
Joined
Nov 20, 2015
Messages
527
It looks like you enabled autotune. I've heard that's actually for systems that don't have enough RAM. I highly recommend turning it off, deleting all the tunables it created, and rebooting.
 

looney

Dabbler
Joined
Jan 30, 2017
Messages
16
It looks like you enabled autotune. I've heard that's actually for systems that don't have enough RAM. I highly recommend turning it off, deleting all the tunables it created, and rebooting.
I deleted all tunables and rebooted yesterday; no difference.

I should also point out that the zpool was originally created in FreeNAS 10. Could that be the issue?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I should also point out that the zpool was originally created in FreeNAS 10. Could that be the issue?
It should have nothing to do with it at all. That said, I would never have used FreeNAS 10 as my starting point for anything except a test platform, and if I were going to hold real data I'd have wiped it all and started from scratch. Again, I don't see how this could be the problem; I guess you could test it out if you have a lot of time on your hands. I suspect it's a NIC driver issue at this point.
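
(If you do want to rule it out quickly, the pool's feature flags and health are easy enough to eyeball from the shell; this is just a sanity check, not a fix.)

Code:
# List the feature flags the pool was created with under FreeNAS 10
zpool get all FloppyD | grep feature@
# Confirm the pool reports no warnings about unsupported features
zpool status FloppyD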
 

looney

Dabbler
Joined
Jan 30, 2017
Messages
16
But if it's a NIC driver issue, then why does it also happen on a completely different on-board 1Gbit NIC?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
But if it's a NIC driver issue, then why does it also happen on a completely different on-board 1Gbit NIC?
I must have missed that statement.

Have you tried using something else besides the JBOD enclosure? Possibly a test with a single hard drive connected directly to the motherboard as your storage, then check how things behave with some sample data. The idea is to rule out parts of your hardware. If things work perfectly without the JBOD enclosure and the HBA, next install the HBA and try one drive connected to it. You get the idea.
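
(A rough sketch of that single-drive test from the shell, assuming the spare drive shows up as da40 and has nothing on it; throwaway CLI pools like this are for testing only.)

Code:
# Build a throwaway single-disk pool on the directly attached drive
zpool create testsingle da40
zfs create -o compression=off testsingle/bench
# Local write test against it (default mountpoint is /testsingle/bench)
dd if=/dev/zero of=/testsingle/bench/testfile bs=1024k count=100000
# Clean up afterwards
zpool destroy testsingle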
 

bigphil

Patron
Joined
Jan 30, 2014
Messages
486
check "netstat -m" during a one of the write tests that cause the dips and see if there are any denied/delayed mbuf requests.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
If ZFS has two transaction groups pending, it will freeze until the first is written. Your transaction group size should be sufficient to buffer about 5 seconds of full-speed network traffic.

Then there's the fact that your disk speed is almost equal to your network speed, so if the network outstrips the disks, writes will eventually stall while waiting for the disks to flush.

Also, how is the JBOD connected? Just wondering if your drives are bottlenecked to 12 Gbps because of the wiring...
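
(The knobs that govern this on FreeNAS 9.10 / FreeBSD 10 are the ZFS dirty-data and txg sysctls; checking their current values is harmless and gives a feel for how much the pool is allowed to buffer.)

Code:
# How much dirty (not yet written) data a txg may accumulate
sysctl vfs.zfs.dirty_data_max
sysctl vfs.zfs.dirty_data_max_percent
# How often a txg is forced out regardless of size (seconds)
sysctl vfs.zfs.txg.timeout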
 

looney

Dabbler
Joined
Jan 30, 2017
Messages
16
From now on I'm only testing on a 1Gbit NIC; I can't do a lot of testing right now though.

The JBOD backplanes have one uplink each, as they are the EL1 model, so only one uplink and one cascade link AFAIK.

In a week's time I can start testing again. I will then post the netstat results and test against a single drive, taking the HBA and JBOD out of the equation.
I would find it odd if the JBOD wiring/backplane were the culprit here; it never let me down when I was running hardware RAID, so it should be able to handle these throughput rates.
With hardware RAID I was easily writing at 4 Gbit/s, so the JBOD should have no trouble keeping up with 1 Gbit/s.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I only request you test without the JBOD enclosure to rule out the hardware as the issue.
 