
SLOG benchmarking and finding the best SLOG

Ender117

Patron
Joined
Aug 20, 2018
Messages
219
I am testing the benefit of my P3700 as an SLOG and want to use a memory disk as a reference:
mdconfig -a -t swap -s 15g -u 1
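(For anyone who wants to reproduce the setup: after creating md1 as above, attaching and detaching it as a log device looks roughly like this. "tank" is just a placeholder pool name, and obviously never leave a RAM disk as SLOG on a pool you care about.)
Code:
zpool add tank log /dev/md1    # attach the RAM disk as a dedicated log (SLOG) device
zpool iostat -v tank 1         # watch the log vdev while the sync-write workload runs
zpool remove tank md1          # detach the log device when done
mdconfig -d -u 1               # destroy the memory disk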
Interestingly, it turns out the P3700 outperforms the RAM disk. Out of curiosity, I ran diskinfo -wS on the RAM disk:
Code:
root@freenas:~ # diskinfo -wS /dev/md1
/dev/md1
		512			 # sectorsize
		16106127360	 # mediasize in bytes (15G)
		31457280		# mediasize in sectors
		0			   # stripesize
		0			   # stripeoffset

Synchronous random writes:
		 0.5 kbytes:	 21.7 usec/IO =	 22.5 Mbytes/s
		   1 kbytes:	 18.7 usec/IO =	 52.1 Mbytes/s
		   2 kbytes:	 23.3 usec/IO =	 84.0 Mbytes/s
		   4 kbytes:	 23.3 usec/IO =	167.8 Mbytes/s
		   8 kbytes:	 27.5 usec/IO =	284.5 Mbytes/s
		  16 kbytes:	 33.4 usec/IO =	467.9 Mbytes/s
		  32 kbytes:	 44.3 usec/IO =	705.3 Mbytes/s
		  64 kbytes:	 66.4 usec/IO =	941.4 Mbytes/s
		 128 kbytes:	145.2 usec/IO =	860.9 Mbytes/s
		 256 kbytes:	244.5 usec/IO =   1022.6 Mbytes/s
		 512 kbytes:	422.4 usec/IO =   1183.8 Mbytes/s
		1024 kbytes:	862.6 usec/IO =   1159.2 Mbytes/s
		2048 kbytes:   1691.4 usec/IO =   1182.5 Mbytes/s
		4096 kbytes:   3018.9 usec/IO =   1325.0 Mbytes/s
		8192 kbytes:   5555.3 usec/IO =   1440.1 Mbytes/s



Now, while this should never be used in production, I was expecting the RAM disk to be at least an order of magnitude faster than any SSD. Yet it turns out to be slower than my P3700, and quite a bit slower than the 900p/P4800X results elsewhere in this thread. Any thoughts?

FYI, I am using 1866 MHz RAM, if that matters.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
My guess is it’s not an NVMe disk ;)

But very interesting results; it shows how fast these SSDs are.

The NVMe protocol was designed to minimize protocol overhead, since that overhead was becoming significant with fast storage devices.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I'm not sure how it could be slower than NVMe. Both nvd and md plug into GEOM, so the upper layers are the same. There was some work on a shorter stack for NVMe devices that bypassed GEOM (losing functionality but gaining speed), but that's an alternative to nvme/nvd.
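A quick sanity check that both really do sit behind GEOM, using nothing but the base tools (device names are the ones from this thread), would be something like:
Code:
geom disk list nvd0    # the NVMe namespace exposed by nvd(4) as a DISK provider
geom md list md1       # the memory disk exposed by md(4) as an MD provider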

I guess md needs a hostile takeover of its own...
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Type = swap. Perhaps retest with type = malloc?
 
Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Well, it would only matter under memory pressure, but I guess the ARC might be causing that.
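A quick way to see how much headroom there actually was during the test (stock sysctls, just for a rough picture):
Code:
sysctl kstat.zfs.misc.arcstats.size vfs.zfs.arc_max        # current ARC size vs. its ceiling
sysctl vm.stats.vm.v_free_count vm.stats.vm.v_page_size    # free pages; multiply by the page size for bytes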
 

Ender117

Patron
Joined
Aug 20, 2018
Messages
219
Type = swap. Perhaps retest with type = malloc?
Code:
root@freenas:~ # mdconfig -a -t malloc -s 15g -o reserve
root@freenas:~ # diskinfo -wS /dev/md0
/dev/md0
		512			 # sectorsize
		16106127360	 # mediasize in bytes (15G)
		31457280		# mediasize in sectors
		0			   # stripesize
		0			   # stripeoffset

Synchronous random writes:
		 0.5 kbytes:	 20.5 usec/IO =	 23.9 Mbytes/s
		   1 kbytes:	 18.4 usec/IO =	 53.1 Mbytes/s
		   2 kbytes:	 18.7 usec/IO =	104.2 Mbytes/s
		   4 kbytes:	 23.7 usec/IO =	164.5 Mbytes/s
		   8 kbytes:	 25.7 usec/IO =	303.6 Mbytes/s
		  16 kbytes:	 35.7 usec/IO =	438.2 Mbytes/s
		  32 kbytes:	 41.5 usec/IO =	753.3 Mbytes/s
		  64 kbytes:	 68.7 usec/IO =	910.3 Mbytes/s
		 128 kbytes:	134.0 usec/IO =	933.0 Mbytes/s
		 256 kbytes:	231.2 usec/IO =   1081.5 Mbytes/s
		 512 kbytes:	419.2 usec/IO =   1192.9 Mbytes/s
		1024 kbytes:	920.3 usec/IO =   1086.6 Mbytes/s
		2048 kbytes:   1660.9 usec/IO =   1204.2 Mbytes/s
		4096 kbytes:   3054.5 usec/IO =   1309.6 Mbytes/s
		8192 kbytes:   6196.2 usec/IO =   1291.1 Mbytes/s



Almost the same. Guess there is some serious overhead in the md stack, as @Ericloewe suggested.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Curious what this shows

dd if=/dev/zero of=/dev/md1 bs=1M count=4000
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Curious what this shows

dd if=/dev/zero of=/dev/md1 bs=1M count=4000

FWIW, I get about 3.2 to 3.6 GB/s with a 4-channel 2400 MHz system running on a Xeon E5-1650 v4 (circa 4 GHz boost, IIRC).

Guess it's time to update it to 11.1 so that I can join the fun with diskinfo -wS.

[Attached screenshot: freenas-11.0U4 uptime.png]
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
If you can bring yourself to reboot it after that many days of uptime, that is; it may tell you no.

 

Ender117

Patron
Joined
Aug 20, 2018
Messages
219
Curious what this shows

dd if=/dev/zero of=/dev/md1 bs=1M count=4000
It was close to 700 MB/s. I believe this is related to the maximum I/O size (MAXPHYS) being set to 128K in FreeBSD.
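If that's the cause, chopping the writes up myself shouldn't change the result much, since the kernel would be splitting the 1M requests anyway. A rough check (same total amount of data, purely illustrative):
Code:
dd if=/dev/zero of=/dev/md1 bs=128k count=32000
dd if=/dev/zero of=/dev/md1 bs=1m count=4000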

BTW, on my pfSense installation (i3-7100, a single 8 GB 2400 MHz DIMM) that was ~4.8 GB/s. Either RAM technology has come a long way these past few years, or something in QPI messed this up.

And
Code:
[2.4.4-RELEASE][root@pfSense.localdomain]/root: mdconfig -a -t malloc -s 1g -o reserve
md1
[2.4.4-RELEASE][root@pfSense.localdomain]/root: diskinfo -wS /dev/md1
/dev/md1
		512			 # sectorsize
		1073741824	  # mediasize in bytes (1.0G)
		2097152		 # mediasize in sectors
		0			   # stripesize
		0			   # stripeoffset
		Yes			 # TRIM/UNMAP support
		Unknown		 # Rotation rate in RPM

Synchronous random writes:
		 0.5 kbytes:	  4.0 usec/IO =	121.2 Mbytes/s
		   1 kbytes:	  4.1 usec/IO =	238.9 Mbytes/s
		   2 kbytes:	  4.3 usec/IO =	459.5 Mbytes/s
		   4 kbytes:	  4.5 usec/IO =	872.1 Mbytes/s
		   8 kbytes:	  5.0 usec/IO =   1561.5 Mbytes/s
		  16 kbytes:	  6.2 usec/IO =   2500.2 Mbytes/s
		  32 kbytes:	  8.3 usec/IO =   3760.1 Mbytes/s
		  64 kbytes:	 12.8 usec/IO =   4891.0 Mbytes/s
		 128 kbytes:	 23.0 usec/IO =   5432.2 Mbytes/s
		 256 kbytes:	 40.5 usec/IO =   6171.3 Mbytes/s
		 512 kbytes:	 77.9 usec/IO =   6420.2 Mbytes/s
		1024 kbytes:	178.9 usec/IO =   5589.8 Mbytes/s
		2048 kbytes:	387.2 usec/IO =   5165.2 Mbytes/s
		4096 kbytes:	843.5 usec/IO =   4742.1 Mbytes/s
		8192 kbytes:   1676.5 usec/IO =   4772.0 Mbytes/s



More in line with what I expected.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
something in QPI messed this up
Going with this one. I'm betting it gave you 15G of remote memory and you're having to hop NUMA nodes. Try with a smaller disk to see if you can get lucky and hit local memory.
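Something along these lines (the unit number is only an example, and there's no guarantee where md actually places the allocation):
Code:
mdconfig -a -t malloc -s 512m -o reserve -u 9
diskinfo -wS /dev/md9
mdconfig -d -u 9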

Tried on an older single-socket BSD system and it went beyond 1 GB/s at the 8K block size.
 

Ender117

Patron
Joined
Aug 20, 2018
Messages
219
Going with this one. I'm betting it gave you 15G of remote memory and you're having to hop NUMA nodes. Try with a smaller disk to see if you can get lucky and hit local memory.

Tried on an older single-socket BSD system and it went beyond 1 GB/s at the 8K block size.
Tried as low as 512M and it didn't help. I had almost 60 GB of free RAM at the time. Guess mdconfig really doesn't take dual-socket systems into account.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
It may not. As an ugly hack for testing purposes you could try using cpuset to force your benchmark process (diskinfo/dd) to run there.

https://www.freebsd.org/cgi/man.cgi?query=cpuset&sektion=1&manpath=freebsd-release-ports
 
Ender117

Patron
Joined
Aug 20, 2018
Messages
219
It may not. As an ugly hack for testing purposes you could try using cpuset to force your benchmark process (diskinfo/dd) to run there.

https://www.freebsd.org/cgi/man.cgi?query=cpuset&sektion=1&manpath=freebsd-release-ports
Code:
root@freenas:/mnt/m8 # cpuset -l 0 mdconfig -a -t malloc -s 1g -o reserve
md0
root@freenas:/mnt/m8 # cpuset -l 0 diskinfo -wS /dev/md0
/dev/md0
		512			 # sectorsize
		1073741824	  # mediasize in bytes (1.0G)
		2097152		 # mediasize in sectors
		0			   # stripesize
		0			   # stripeoffset

Synchronous random writes:
		 0.5 kbytes:	 20.2 usec/IO =	 24.2 Mbytes/s
		   1 kbytes:	 20.6 usec/IO =	 47.3 Mbytes/s
		   2 kbytes:	 21.5 usec/IO =	 90.9 Mbytes/s
		   4 kbytes:	 23.0 usec/IO =	169.9 Mbytes/s
		   8 kbytes:	 28.2 usec/IO =	277.0 Mbytes/s
		  16 kbytes:	 36.3 usec/IO =	430.4 Mbytes/s
		  32 kbytes:	 50.3 usec/IO =	621.8 Mbytes/s
		  64 kbytes:	 80.7 usec/IO =	774.6 Mbytes/s
		 128 kbytes:	140.6 usec/IO =	888.8 Mbytes/s
		 256 kbytes:	273.1 usec/IO =	915.4 Mbytes/s
		 512 kbytes:	510.8 usec/IO =	978.8 Mbytes/s
		1024 kbytes:   1020.7 usec/IO =	979.7 Mbytes/s
		2048 kbytes:   1836.3 usec/IO =   1089.2 Mbytes/s
		4096 kbytes:   3499.1 usec/IO =   1143.1 Mbytes/s
		8192 kbytes:   6490.1 usec/IO =   1232.6 Mbytes/s



Sounds interesting, but it didn't work. Or I might be using cpuset wrong.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Or I might be using cpuset wrong

With dual E5-2690 v2s you've got forty threads, and your example is targeting CPU 0.

Try cpuset -l 20 diskinfo -wS /dev/md0 to bump yourself to the second socket (or -l 10 if you've got HT disabled)
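To double-check which CPU IDs belong to which socket before picking a number, the stock tools are enough:
Code:
sysctl kern.sched.topology_spec    # XML dump of the CPU topology (groups map to sockets/cores)
cpuset -g -p $$                    # shows the CPU mask the current shell is allowed to run on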
 

Ender117

Patron
Joined
Aug 20, 2018
Messages
219
With dual E5-2690 v2s you've got forty threads, and your example is targeting CPU 0.

Try cpuset -l 20 diskinfo -wS /dev/md0 to bump yourself to the second socket (or -l 10 if you've got HT disabled)
Good point, though this appears not to be the problem:
Code:
root@freenas:/mnt/m8 # cpuset -l 35 diskinfo -wS /dev/md0
/dev/md0
		512			 # sectorsize
		1073741824	  # mediasize in bytes (1.0G)
		2097152		 # mediasize in sectors
		0			   # stripesize
		0			   # stripeoffset

Synchronous random writes:
		 0.5 kbytes:	 20.4 usec/IO =	 23.9 Mbytes/s
		   1 kbytes:	 16.0 usec/IO =	 61.1 Mbytes/s
		   2 kbytes:	 21.9 usec/IO =	 89.3 Mbytes/s
		   4 kbytes:	 23.6 usec/IO =	165.2 Mbytes/s
		   8 kbytes:	 28.5 usec/IO =	274.6 Mbytes/s
		  16 kbytes:	 36.9 usec/IO =	423.9 Mbytes/s
		  32 kbytes:	 49.8 usec/IO =	627.7 Mbytes/s
		  64 kbytes:	 78.3 usec/IO =	798.0 Mbytes/s
		 128 kbytes:	137.5 usec/IO =	909.1 Mbytes/s
		 256 kbytes:	263.7 usec/IO =	948.1 Mbytes/s
		 512 kbytes:	490.7 usec/IO =   1018.9 Mbytes/s
		1024 kbytes:	990.8 usec/IO =   1009.3 Mbytes/s
		2048 kbytes:   1804.0 usec/IO =   1108.7 Mbytes/s
		4096 kbytes:   3276.0 usec/IO =   1221.0 Mbytes/s
		8192 kbytes:   6939.9 usec/IO =   1152.8 Mbytes/s

 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
dd if=/dev/shrug of=/this/post

Perhaps the md device isn't pinned to any specific pages of memory and is just merrily floating around between sockets, giving you a mix of local/remote access no matter where you're running.
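If the kernel on that box was built with NUMA support it should at least report the memory domains; I'm not certain what FreeNAS 11.1 exposes, so treat these sysctls as a maybe:
Code:
sysctl vm.ndomains     # number of memory domains the kernel sees
sysctl vm.phys_segs    # physical memory segments, each tagged with its domain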

If you really want to use RAM for your SLOG, get some NVDIMMs in your life. ;)
 

Ender117

Patron
Joined
Aug 20, 2018
Messages
219
dd if=/dev/shrug of=/this/post

Perhaps the md device isn't pinned to any specific pages of memory and is just merrily floating around between sockets, giving you a mix of local/remote access no matter where you're running.

Could be. I gave up on this idea and moved on to testing the performance of the pool as a whole: https://forums.freenas.org/index.php?threads/performance-of-sync-writes.70470/ Perhaps you can offer some insights?
If you really want to use RAM for your SLOG, get some NVDIMMs in your life. ;)
Great idea. I'm open to donations; anyone? ;)
 

kspare

Guru
Joined
Feb 19, 2015
Messages
508
Awesome thread! I'm in the midst of upgrading our NVMe devices. I just upgraded our 3 storage servers to 40Gb NICs and want to take advantage of that. I have some 900p Optane 280GB PCIe NVMe drives, but I'm unsure how to format them for 4K vs 512. I've read you can't over-provision them?

I'll get some numbers for an over-provisioned Intel 750 (400GB down to 100GB) formatted for 4K this morning.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I have some 900p Optane 280GB PCIe NVMe drives, but I'm unsure how to format them for 4K vs 512.

Optane 900p is considered a "consumer" drive and has a fixed 512-byte sector size. It's still screaming fast, though.
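You can confirm what the namespace advertises straight from nvmecontrol (the controller/namespace name here is just an example):
Code:
nvmecontrol identify nvme0ns1    # look at "Number of LBA Formats" / "Current LBA Format"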

I've read you can't over-provision them?

Also seems to be correct; even with the P4800X, users reported that attempting to use the isdct tool to set MaximumLBA fails.
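For reference, the kind of invocation that reportedly fails (Intel's isdct tool; the drive index and percentage here are only illustrative):
Code:
isdct show -intelssd                  # list Intel SSDs and their index numbers
isdct set -intelssd 0 MaximumLBA=25%  # attempt to over-provision; reportedly rejected on Optane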

I'll get some numbers for an over-provisioned Intel 750 (400GB down to 100GB) formatted for 4K this morning.

I expect it to be quite close to the P3700 (although perhaps better, as IIRC the ones benchmarked so far have been 512e rather than 4Kn). Looking forward to seeing those numbers!
 