SOLVED SLOG Write Speeds

Status
Not open for further replies.

Donny Davis

Contributor
Joined
Jul 31, 2015
Messages
139
I am hoping someone can help me understand how to extract the absolute best performance possible from my system. The purpose of the system is to function as my primary storage array for NFS-based home shares. The one I have is ageing and needs to be upgraded. After careful thought and consideration about what the system needs in order to perform at its absolute best for its purpose, I custom built something that does exactly what I need it to do, at light speed. Which brings me to my question.
All tests are performed locally; NFS is not yet in the equation. I have 12 drives in two RAIDZ2 vdevs of six drives each.

Why, when my PCIe NVMe drive is used as a SLOG, does it not write as fast as it does when it functions as a normal drive?

To give an example, here is what the SLOG device looks like when used as a drive in its own pool, which yields performance in line with the rest of the setup (network).
Code:
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=64
fio-3.0
Starting 1 process
test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [w(1)][93.8%][r=0KiB/s,w=417MiB/s][r=0,w=107k IOPS][eta 00m:02s]
test: (groupid=0, jobs=1): err= 0: pid=14342: Sun Sep 23 14:47:04 2018
  write: IOPS=34.5k, BW=135MiB/s (141MB/s)(4096MiB/30413msec)
   bw (  KiB/s): min=54818, max=416429, per=95.60%, avg=131839.60, stdev=80313.94, samples=60
   iops		: min=13704, max=104109, avg=32959.57, stdev=20078.63, samples=60
  cpu		  : usr=3.02%, sys=37.72%, ctx=334582, majf=0, minf=0
  IO depths	: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued rwt: total=0,1048576,0, short=0,0,0, dropped=0,0,0
	 latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=135MiB/s (141MB/s), 135MiB/s-135MiB/s (141MB/s-141MB/s), io=4096MiB (4295MB), run=30413-30413msec


However, when it's placed in the larger pool full of spinning rust, this is what fio has to say:
Code:
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=64
fio-3.0
Starting 1 process
Jobs: 1 (f=1): [w(1)][90.9%][r=0KiB/s,w=209MiB/s][r=0,w=53.5k IOPS][eta 00m:10s]
test: (groupid=0, jobs=1): err= 0: pid=17178: Sun Sep 23 14:59:00 2018
  write: IOPS=10.6k, BW=41.2MiB/s (43.2MB/s)(4096MiB/99310msec)
   bw (  KiB/s): min= 1872, max=488836, per=92.93%, avg=39248.60, stdev=40088.66, samples=198
   iops		: min=  468, max=122209, avg=9811.83, stdev=10022.19, samples=198
  cpu		  : usr=1.07%, sys=9.28%, ctx=829506, majf=0, minf=0
  IO depths	: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
	 submit	: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
	 issued rwt: total=0,1048576,0, short=0,0,0, dropped=0,0,0
	 latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=41.2MiB/s (43.2MB/s), 41.2MiB/s-41.2MiB/s (43.2MB/s-43.2MB/s), io=4096MiB (4295MB), run=99310-99310msec
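
The command line isn't shown above; something like the following would reproduce a comparable run. The filename and the write-only mix are assumptions (rw=randrw with zero reads issued suggests rwmixwrite=100):
Code:
# Assumed invocation -- rebuilt from the fio output headers above.
# --rwmixwrite=100 is a guess to match the write-only results, and
# --iodepth is effectively 1 with the synchronous psync engine anyway.
fio --name=test --filename=/mnt/tank/test.fio \
    --rw=randrw --rwmixwrite=100 --bs=4k \
    --ioengine=psync --iodepth=64 --size=4096M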

I am pretty sure I am misunderstanding how a SLOG is used, or how to configure things to do what I am looking for.

What I want to do is send all writes to my SLOG, which in turn should give me the performance of the raw device (nearly).

It's entirely possible I have this all wrong, and if that is the case, please help me understand.

Thanks
~Donny
 

Donny Davis

Contributor
Joined
Jul 31, 2015
Messages
139
Ok, I see what is going on.

When I pull my SLOG from the pool and set sync=always, performance drops to nearly nothing (about 100 IOPS at 4K blocks).
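
For anyone who wants to repeat the comparison, the rough sequence is something like this (pool and device names are placeholders, substitute your own):
Code:
# Placeholder pool/device names -- substitute your own.
zpool remove tank nvd0            # pull the log (SLOG) vdev out of the pool
zfs set sync=always tank          # force every write through the in-pool ZIL
fio --name=synctest --filename=/mnt/tank/synctest.fio \
    --rw=randwrite --bs=4k --ioengine=psync --size=1G
zfs inherit sync tank             # restore the default sync behaviour
zpool add tank log nvd0           # and put the SLOG back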

It would seem the SLOG is doing what it is intended to do, and going as fast as it can.

If there are some tunables that can be changed to increase performance, I would be quite happy if someone could point me in the right direction.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Why, when my PCIe NVMe drive is used as a SLOG, does it not write as fast as it does when it functions as a normal drive?
Because the write still has to go to the pool's disks; the NVMe drive is just a log, which is what SLOG (Separate LOG) means. Normal operation without a SLOG has a sync write go to a temporary log area inside the pool, then be written again to permanent storage with the transaction group. By offloading that extra log write onto a separate device, the sync can be acknowledged as soon as the data is safe in the SLOG, before it is committed to the pool disks, so the remote system doesn't need to wait.
I am pretty sure I am misunderstanding how a SLOG is used,
Most people do.
What I want to do is send all writes to my SLOG, which in turn should give me the performance of the raw device (nearly).
It just doesn't work that way.
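
A quick way to see this for yourself is to watch per-vdev I/O while a sync-heavy workload runs; the ZIL writes landing on the log device show up separately from the transaction-group flushes hitting the data vdevs (pool name is a placeholder):
Code:
# Per-vdev I/O statistics, refreshed every second; the 'logs' section
# shows writes to the SLOG while the RAIDZ2 vdevs take the
# transaction-group flushes a few seconds later.
zpool iostat -v tank 1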
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
If there are some tunables that can be changed to increase performance, I would be quite happy if someone could point me in the right direction.
More detail about your hardware is needed.
 

Donny Davis

Contributor
Joined
Jul 31, 2015
Messages
139
I should really update my sig, but I have too many servers to list. (29 to be exact)

The SLOG is an Intel DC P3605 (an Oracle ZFS cache drive).

I updated the post as solved because I went back and read what @jgreco had taken the time to explain to me before. I am clear on what a SLOG does now.

Thanks for jumping on this @Chris Moore @Stux and @jgreco
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I updated the post as solved because I went back and read what @jgreco had taken the time to explain to me before. I am clear on what a SLOG does now.

Well, it's a continual learning process. ZFS is very complicated if you're coming to it from more traditional UNIX filesystems. Expect to have to read things multiple times, over time... I do! :smile:
 

Donny Davis

Contributor
Joined
Jul 31, 2015
Messages
139
29 servers? Nice.

If you're looking for full-throttle performance, you'll want to check the link in my signature where other users have benchmarked their SLOG devices and posted results:

https://forums.freenas.org/index.php?threads/slog-benchmarking-and-finding-the-best-slog.63521/

Right now the fastest "generally available" device is an Optane 900p/P4800X NVMe SSD.


I have two Optane 900p drives right now, and while they are fast, they have no PLP, otherwise I would use them for the SLOG. I cannot afford a DC P4800X, or I would have one.

The DC P3605 yields some decent numbers, and my workload is pretty light on this box. Its only purpose is NFS home shares. No apps, no containers, no VMs, and no jails... just NFS.

I was considering turning off HT, mainly because of this box's single focus and the fact that most of the operations done on it are single-threaded. Even if I can get 10% more, it seems worth it.


As for the 29 servers, I have an OpenStack cloud that I use for various purposes. OpenStack is 16 of the 29, and Ceph is 6. The other 7 are boxes like this one, plus a small virtualization environment to handle my day-to-day VMs, edge router, and other whatnot.

This box has been custom built to fit in one of those short racks, and it's been a journey... I think I have it about all shored up, and everything seems to be running well.
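
Coming back to the benchmark thread linked above: as far as I understand it, the quick sync-write sanity check those results revolve around is FreeBSD's diskinfo in synchronous write mode (see the thread for the exact methodology). The device name is a placeholder, and the write test is destructive, so only point it at a drive with nothing on it:
Code:
# -w runs the (destructive) write benchmark, -S makes the writes
# synchronous, which is what matters for a SLOG.
diskinfo -wS /dev/nvd0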
 