Help me explain this performance please!

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
Hi all,

I wanted to throw together a quick FreeNAS system to use as iSCSI shared storage for my ESXi hosts and to run Plex/SABnzbd/Sonarr off it. Build details are below (I know it's unsupported hardware; I'll be using it until I can get enough cash together for a decent build!):
- CPU: i5-3570K
- RAM: 20 GB
- Motherboard: ASRock Z77 Extreme4
- SLOG: 120 GB SSD
- Pool: 4x 3 TB drives in mirrored vdevs, plugged directly into the SATA ports on the motherboard

If you need any more build info, please let me know!

Test results when ssh'd into the box:

root@NAS:~ # dd if=/dev/zero of=/mnt/storage/testfile2 bs=1048576
^C4133245+0 records in
4133244+0 records out
4334020460544 bytes transferred in 860.767725 secs (5035063859 bytes/sec)
root@NAS:~ # dd of=/dev/zero if=/mnt/storage/testfile bs=4M count=1000000
100000+0 records in
100000+0 records out
419430400000 bytes transferred in 48.478730 secs (8651843727 bytes/sec)
root@NAS:~ # dd of=/dev/zero if=/mnt/storage/testfile2 bs=104857
^C11159633+0 records in
11159633+0 records out
1170165637481 bytes transferred in 407.339319 secs (2872704848 bytes/sec)
root@NAS:~ # dd if=/dev/random of=/mnt/storage/test.file bs=1M count=8192
8192+0 records in
8192+0 records out
8589934592 bytes transferred in 52.716621 secs (162945471 bytes/sec)
root@NAS:~ #

From what I gather, the last one is random writes, and that result is pretty bad, right? I haven't tested random reads yet. Can anyone tell me how bad (or not) this performance is and what I can do to improve it?

Cheers
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
The kind of SLOG device makes a great deal of difference. SATA SSDs don't have the guts for this.
Look at the links in my signature; there are one or two down there that talk about the SLOG and how to test its performance.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
So the way you are testing isn't testing anything. First, you need to disable compression on the dataset; that is why your write speeds look so fast. Second, when doing a read test you need to use /dev/null for your 'of'. Third, don't use /dev/random: it is bottlenecked by your CPU, and it isn't testing random writes anyway, it's still streaming writes.

Lastly, that SLOG is not going to work unless it has power loss protection. You are better off not using it and setting sync=disabled on the dataset.
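
Something along these lines would give more honest numbers (a rough sketch; I'm assuming the pool/dataset is the "storage" one from your paths, and the file size is just picked to be a bit bigger than your 20 GB of RAM):

# /dev/zero compresses to almost nothing, so turn compression off first
zfs set compression=off storage

# streaming write: ~30 GB, comfortably bigger than RAM so ARC can't hide the disks
dd if=/dev/zero of=/mnt/storage/ddtest bs=1M count=30000

# streaming read: read the same file back and throw it away
dd if=/mnt/storage/ddtest of=/dev/null bs=1M

# clean up and put compression back (lz4 is the FreeNAS default)
rm /mnt/storage/ddtest
zfs set compression=lz4 storage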
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
The /dev/random device is a CPU-generated source of random numbers. This is most often used for cryptographic functions, and oddly enough creating a good source of random numbers is actually quite difficult. So the BSD kernel keeps a pool of random numbers, and periodically salts its entropy pool from random events that happen, like network interrupts, keystroke timings, etc... But in order to ensure the highest quality randomness for encryption use, this particular device blocks when the pool entropy gets depleted. On Linux systems, there's a /dev/urandom device that does not block, but provides lower quality randomness. That device exists on FreeNAS, but it's simply a symlink to /dev/random. The net result is that you inadvertently ran a self-limiting CPU performance test, not a disk I/O test.
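
You can check this yourself on the FreeNAS box; exact output will vary, but you should see the symlink:

ls -l /dev/urandom    # shows urandom -> random on FreeBSD/FreeNAS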

More info here: FreeBSD - RANDOM(4)
 

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
The kind of SLOG device makes a great deal of difference. SATA SSDs don't have the guts for this.
Look at the links in my signature; there are one or two down there that talk about the SLOG and how to test its performance.

Thanks for the response, I will have a look at the links and see what results I can generate :)
 

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
So the way you are testing isn't testing anything. First, you need to disable compression on the dataset; that is why your write speeds look so fast. Second, when doing a read test you need to use /dev/null for your 'of'. Third, don't use /dev/random: it is bottlenecked by your CPU, and it isn't testing random writes anyway, it's still streaming writes.

Lastly, that SLOG is not going to work unless it has power loss protection. You are better off not using it and setting sync=disabled on the dataset.

Ah bugger... OK, I will turn off compression, change the syntax of the commands, and see what sort of results I can attain.

I was under the impression that disabling sync is a no-no if you value your data?
 

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
The /dev/random device is a CPU-generated source of random numbers. This is most often used for cryptographic functions, and oddly enough creating a good source of random numbers is actually quite difficult. So the BSD kernel keeps a pool of random numbers, and periodically salts its entropy pool from random events that happen, like network interrupts, keystroke timings, etc... But in order to ensure the highest quality randomness for encryption use, this particular device blocks when the pool entropy gets depleted. On Linux systems, there's a /dev/urandom device that does not block, but provides lower quality randomness. That device exists on FreeNAS, but it's simply a symlink to /dev/random. The net result is that you inadvertently ran a self-limiting CPU performance test, not a disk I/O test.

More info here: FreeBSD - RANDOM(4)

Cheers, I will adjust the testing and repost the results. What sort of results should I expect for this kind of setup, and are there any best practices or tweaks I can apply, aside from disabling sync, to improve performance?
 

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
I ended up buying an R510 with two Xeons and 12 bays; it came with 32 GB of RAM, which I upgraded to 56 GB. My plan is to have an NFS pool with sync disabled for the jails running Plex, Sonarr, and SABnzbd (I've read that I should avoid the plugins because of how updates are handled), and an iSCSI pool for my VMware environment.

The recommended SLOG device is something like AUD$1800, and the runner-up, the Optane 900p, is about $600, so once I can afford it I may pull the trigger and pick one up. Could I use a Samsung 850 Pro 1 TB SSD in the meantime, or would it do more harm than good? I also have the option of striping two or three of them if that would help at all.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
My plan is to have an NFS pool with sync disabled

Why? Those O_SYNC writes are part of the NFS specification for a reason.

You need to decide what your write rate needs to be, and adopt a strategy of configuring the pool to meet that requirement. A given disk can only turn a certain number of read or write IOPS. RAIDZ pools behave very differently from mirror pools, etc... Mirrors in general can issue I/O in parallel, which can give you a performance multiplier, while RAIDZ vdevs generally have the write performance of a single component disk. I have a similar config (see build #3 in my sig): I place the ESXi VM OS images in an iSCSI zvol via 10GbE, and only use NFS for shared-access items and/or shared read-only items like ISO images.
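
As a rough sketch of that split (pool name "tank", the zvol size and block size are just placeholders, not recommendations for your exact setup):

# zvol to back an iSCSI extent for the VM images; leave sync at its default
zfs create -o volblocksize=16K -V 500G tank/vmstore

# ordinary dataset, shared over NFS, for ISOs and other shared/read-mostly data
zfs create tank/shared

The iSCSI extent then gets pointed at the zvol from the block sharing (iSCSI) section of the FreeNAS GUI.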
 

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
Hmm, OK, I think I may have mixed two things together. What I read was that to get better performance when using NFS you should disable sync, but that was for virtualization (at least I think that's what I read), not media etc.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Hmm, OK, I think I may have mixed two things together. What I read was that to get better performance when using NFS you should disable sync, but that was for virtualization (at least I think that's what I read), not media etc.

Turning off sync for NFS will improve performance, but it also puts your data at risk. With ESXi and VMFS over iSCSI, the hypervisor has some knowledge and can adjust as needed. Use iSCSI / VMFS for the critical stuff, and NFS for the data that needs to be shared with non-ESXi hosts.
 

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
Use iSCSI / VMFS for the critical stuff, and NFS for the data that needs to be shared with non-ESXi hosts
Yup, that's the plan. The NFS share will quite literally only be media files and jails for sab etc...

Any thoughts on me using the 850s, either a single one or striped, as a SLOG, or would that not be worth the time?
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Any thoughts on me using the 850s, either a single one or striped, as a SLOG, or would that not be worth the time?

It will perform the function, but place your data at risk. It does not have power loss protection, so it can lose in-flight writes in the event of a power failure or unintended reset. I tried it with a little cheap Patriot SSD as well; it's an addictive performance bump, and it took some self-discipline to bring myself to remove it.

But... I have to point out... I haven't seen you mention anything about the networking you're using. If you're looking at turning off O_SYNC writes and trying to hop up performance with a SLOG, but only using 1GbE networking, I have some bad news... your 4-disk array will easily swamp a 1GbE network. I only noticed the SLOG in the pool because I have a 10GbE peer-to-peer (SFP+, no hub...) network to my ESXi box.
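
Rough numbers, just to put that in perspective (assuming something like 150 MB/s of sequential throughput per 3 TB drive, which is typical for 7200 RPM disks): two mirrored vdevs can stream on the order of 2 x 150 = 300 MB/s, while 1 Gbps is only 125 MB/s raw, call it 110-115 MB/s of real payload after protocol overhead. The pool outruns the wire by a factor of two or more before a SLOG even enters the picture.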
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I tried it with a little cheap Patriot SSD as well; it's an addictive performance bump, and it took some self-discipline to bring myself to remove it.
Ideally, if the device properly honors the request to "flush cached data to disk," it shouldn't provide a performance bump unless it's truly able to fulfill that request. It's truly insidious if it's essentially NOOPing the flush and claiming "oh yeah, your data is safe."

@Shankage - I like your idea of using NFS for the data repository for Plex/Sonarr/etc., because then you can somewhat safely set sync=disabled there, with the understanding that it's only your media files at risk. A separate dataset for metadata might be useful, though; I'm not sure how well Plex would handle a corrupted DB/index.
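
Something like this is what I have in mind (pool and dataset names are just examples):

# media dataset: only re-downloadable files live here, so sync=disabled is an acceptable trade
zfs set sync=disabled tank/media

# Plex metadata / app databases get their own dataset with sync left at the default
zfs create -o sync=standard tank/appdata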

For your SLOG, @rvassar is on the money with regard to 1Gbps networking being a bottleneck for any kind of modern NVMe SLOG. Even the tiny consumer-grade Optane 32GB is enough to give you full-speed sync writes (although endurance isn't the greatest), and that would hopefully be around the AUD$100 mark in conjunction with an M.2 NVMe to PCIe adapter card.
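
When the card does arrive, attaching it is a one-liner (pool name "tank" and device name nvd0 are assumptions; check what the NVMe namespace actually shows up as first):

# list NVMe devices, then add the Optane as a log vdev
nvmecontrol devlist
zpool add tank log nvd0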
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Ideally, if the device properly honors the request to "flush cached data to disk," it shouldn't provide a performance bump unless it's truly able to fulfill that request. It's truly insidious if it's essentially NOOPing the flush and claiming "oh yeah, your data is safe."

I think a lot of the problem is that the manufacturers don't offer any kind of certification that consumer-grade SSDs handle this under all circumstances. They may handle an unexpected reset, but not a power failure, etc... So they get swept up into the "not supported" bucket. I suppose it's actually worse to state "No" and then deliberately mishandle the cache flush as a marketing stunt.

Even the tiny consumer-grade Optane 32GB is enough to give you full-speed sync writes (although endurance isn't the greatest) and that would hopefully be around the AUD$100 mark in conjunction with an M.2 NVMe to PCI adapter card.

@HoneyBadger - Which one were you referring to? I just went looking at those. The little 16 GB Optane M10 is going for all of $35 on Amazon here in the US. I just checked the specs and Intel only rates it at 145 MB/s, which isn't going to be much use on a 10GbE network. But the deal killer is "Enhanced Power Loss Data Protection: No".

Intel 16Gb M.2

Intel 32Gb M.2
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I think a lot of the problem is that the manufacturers don't offer any kind of certification that consumer-grade SSDs handle this under all circumstances. They may handle an unexpected reset, but not a power failure, etc... So they get swept up into the "not supported" bucket. I suppose it's actually worse to state "No" and then deliberately mishandle the cache flush as a marketing stunt.

Yep. I wouldn't use a device with that kind of behavior in any scenario where I care about data integrity. I'd rather it honor it correctly and just be extremely slow when doing it.

@HoneyBadger - Which one were you referring to? I just went looking at those. The little 16 GB Optane M10 is going for all of $35 on Amazon here in the US. I just checked the specs and Intel only rates it at 145 MB/s, which isn't going to be much use on a 10GbE network. But the deal killer is "Enhanced Power Loss Data Protection: No".

Intel 16Gb M.2

Intel 32Gb M.2

I'm assuming the OP has a 1GbE network, which only requires about 115 MB/s of throughput (give or take), so that covers the speeds. Benchmarking from the SLOG thread shows that the 32GB one is able to handle about 200 MB/s at 4KB recordsize (ashift=12), and the additional capacity will let it wear-level better. Regarding the lack of "Enhanced PLP": that's also missing on the Optane 280GB, but the nature of the Optane devices is that writes always go straight to the storage media; they have no volatile RAM area at all. As I understand Intel's statement, the P4800X gets some extra assurances, but in essence all writes to any Optane device are "power-fail safe."
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Could I use a Samsung 850 Pro 1 TB SSD in the meantime, or would it do more harm than good? I also have the option of striping two or three of them if that would help at all.
It isn't so much that it would do 'harm' as that it would limit performance. SATA is a bottleneck.
I would point you at this very good video. It's about L2ARC, but if you watch it, the person doing the testing shows the speed limitation of SATA SSDs before moving on to testing with Optane:
https://www.youtube.com/watch?v=oDbGj4YJXDw&t
 

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
Thanks for all the replies. After it was pointed out, I realised that I would be wasting money, as I am running gigabit. Question though: if I created the pool from drives lying around, how much impact would there be from mixing 5400 RPM drives with 7200s, and 3 Gb/s SATA with 6 Gb/s SATA?

Going forward, after I have migrated from my original setup (where I found VM performance a little sluggish when clicking around, navigating, etc.), what else could I do to resolve this?

I should also note that the previous FreeNAS box wasn't on all the time, so I guess it didn't have time to warm the cache, whereas I intend to leave this R510 on 24/7.
 

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
This is a bit off topic, but if I were to install ESXi and virtualise FreeNAS, I know I can present the pools to the local ESXi host. Is it possible for other ESXi hosts to access these pools as well?

I had a feeling that as long as the network was accessible from the other hosts and they were allowed to reach the targets, it would be possible... is this a big no-no?

The only reason I ask is that I would like to have a domain controller and vCenter running all the time to avoid the long boot-up times, and I don't want to be running another box 24/7 as well.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
This is a bit off topic, but if I were to install ESXi and virtualise FreeNAS, I know I can present the pools to the local ESXi host. Is it possible for other ESXi hosts to access these pools as well?

It can be done, but... It's complicated.

I'll point you at this resource, which is quite old at this point... Do not run FreeNAS in production under ESXi.

I believe there is another resource covering this topic, but I can't seem to find it at the moment. You're not asking about ground others haven't trekked across before, but... "There be Dragons..."
 

Shankage

Explorer
Joined
Jun 21, 2017
Messages
79
I have definitely virtualised FreeNAS before; I get the ESXi configuration, strict resource settings, and HBA passthrough. I was just trying to figure out whether there are any issues with presenting this storage to other ESXi hosts?

The only thing I can think of is maybe some extra latency or delay, as requests would go from the host requesting data to the switch, then to the ESXi host with FreeNAS on it, through the ESXi networking layer into the VM, and then the response comes back the same way.
 