SLOG Underprovisioning


HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Splitting this off from Steven Sedory's thread about Hyper-V performance so as not to hijack it further with this tangent.

jgreco said:
I'm convinced that underprovisioning the SLOG devices is the way to go, simply because you're *guaranteeing* that the controller has a much larger bucket of free pages to work with. I suggested this years ago

https://bugs.freenas.org/issues/2365

but no one's interested in proving or disproving the theory.

I've got a spare box, some extra SSDs, and I'll be posting some findings in this thread as I can get them.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Copied over from the other thread:

---

I'm not sure if I'm doing it right, but so far the iozone testing hasn't shown any appreciable difference between an undersized partition and a full-sized one during "general use."

I also set it up as an SLOG device against a pool and saw some interesting results after copying a huge volume of data. Excuse the mangled graph; the data wouldn't all fit on one.

[Graph: SLOG write throughput over time - a long steady run at ~40MB/s followed by a sharp drop]


After a long, steady run at ~40MB/s it takes a hard nosedive. I'm guessing the sustained writes were too much for it, and the writes managed to outpace TRIM. I cut it off after observing that for a bit.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Basically there are three potential options for configuring the SLOG device.

1. Assign the full size of the drive to SLOG via GUI. (This is the current situation.)

2. Manually create a smaller partition via CLI commands and attach it.

3. Use the ATA SET MAX ADDRESS command to limit the visible size of the drive to a smaller size, then assign the "full size" via GUI.

In a world where TRIM is infallible and instant, these options should all be the same in terms of both performance and wear leveling. The working theory under investigation here is that "one of these things is not like the other ones."
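
For concreteness, options 2 and 3 might look something like this from a FreeBSD shell. This is a rough sketch only - the device name, sizes, and pool name are examples, and the HPA sector count assumes 512-byte sectors:

Code:
# Option 2: hand-build a 16GB partition and attach it as SLOG
gpart create -s gpt ada1
gpart add -t freebsd-zfs -a 4k -s 16G ada1
zpool add testvol log ada1p1

# Option 3: clamp the drive's visible size to 16GB via the HPA
# (ATA SET MAX ADDRESS), then assign the "full" device in the GUI
camcontrol hpa ada1 -s 33554432 -y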

Currently using the following command:

Code:
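# -r 4k records, -s 48G file, -i 0 write/rewrite, -i 2 random read/write,
# -o opens the file O_SYNC so every write is synchronous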
iozone -r 4k -s 48G -i 0 -i 2 -o -f /mnt/testvol/iozone.file > /mnt/testvol/iozone-test-results.txt


RAM in the test system is only 24GB, so the 48GB test file should be more than big enough to blow out any attempt at caching I/O. We're not interested in read tests anyhow; they just kind of happen as a side effect of -i 2.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Correct so far. My theory is that a SLOG device that is underprovisioned by a large amount may sustain the burst traffic rate for a longer period, or (ideally) maybe even indefinitely. If so, that's a win.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I hate to break it to you, but if iozone was only giving you 40MB/sec, either you are doing something wrong, or that SSD is not designed to act as an SLOG. When I used to run SLOG benchmarks, I got over 100MB/sec on old SATA-II SSDs.

So what iozone command did you use and what SSD are you using?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
cyberjock said:
I hate to break it to you, but if iozone was only giving you 40MB/sec, either you are doing something wrong, or that SSD is not designed to act as an SLOG. When I used to run SLOG benchmarks, I got over 100MB/sec on old SATA-II SSDs.

Whatever are you talking about? That's almost bang on for the speed I get out of an Intel 320 SSD on an HP Microserver N36L acting as SLOG, sucking it down via NFS from several hypervisors at once. Throughput to the SLOG is highly dependent on a ton of variables, including the speed of the host platform, the size of the SSD, the level of pressure (simultaneous access, etc.) on the pool, bla bla...
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
jgreco said:
That's almost bang on for the speed I get out of an Intel 320 SSD

By no coincidence, that's the current device under load - the 80GB model, to be exact. And while it might not have been "designed for use as an SLOG," it was the de facto "cheap SLOG device" for some time until it was supplanted by newer options. Intel? Check. Power-safe? Check. Decent enough performance? Check.

The iozone command is above but I'll repeat it here:

Code:
iozone -r 4k -s 48G -i 0 -i 2 -o -f /mnt/testvol/iozone.file > /mnt/testvol/iozone-test-results.txt


Right now this is just local "taste testing" - when I get more time I'm going to simulate remote load via NFS and iSCSI if the SSD hasn't melted.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I was able to use "camcontrol" from within FreeNAS to set a host protected area on my 200GB Intel S3700 SSD on my freenas1 (specs below). It now appears as a 16GB device.
https://forums.freenas.org/index.ph...nd-transaction-group.35470/page-2#post-216327

So would it help if I ran an iozone command modified for my system to see if anything happens after 16GB and again after 200GB?

Specs for those of you on mobile devices:
freenas1 Specs: 9.3-RELEASE| Xeon E5-2637V3 | 128GB DDR4 2133MHz ECC RDIMM | X10SRH-CLN4F | On-board LSI 3008 (Flashed v8-IT) | 12x 4TB WD-Re SAS (striped mirrors) | 2x Intel S3700 (SLOG and L2ARC) | SuperMicro SuperStorage Server 5028R-E1CR12L
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Nicely done @depasseg ...

I'd definitely be interested in seeing the results from a modern SSD. The problem is I don't know if a contemporary ZFS system with TRIM support can be emulated to a reasonable level; I hate looking at all the damn switches in iozone! :smile: I know that for systems that don't support TRIM, proper underprovisioning is either not-harmful or a win, but creating a decent test that simulates a TRIM-supported system - I think that might require some fancier test jig.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Thanks!

Yeah, I was a little confused about that too. I'm tempted to run with -a (all tests instead of the two -i tests) and -s 300G. I'm open to suggestions, though.
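
Something like this, maybe (untested; the paths are just borrowed from the earlier commands):

Code:
iozone -a -s 300G -o -f /mnt/testvol/iozone.file > /mnt/testvol/iozone-hpa-results.txt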
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Nice find @depasseg - didn't realize that camcontrol would expose the necessary functionality. No way around the drive relocking itself (the HPA command can only be issued once per power cycle), but at least it's all within FreeNAS.

For the tests, I understand that -i 0 is required to establish the initial file, and -i 2 is the random read/write test. But wouldn't SLOG write patterns be more sequential in nature? Yes, they're still small-block and single queue depth, but if it's just "accept block, write to SLOG," wouldn't that be sequential? In which case you could emulate it with just a standard dd command (assuming you disable compression on the target dataset) with bs=4k ... but that seems too simple, so it's probably incorrect.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
SLOG writes are supposed to be sequential, yes. I've been thinking that to emulate the TRIM aspect it might be useful to build a TRIM-enabled UFS filesystem on the SSD and then do a little shell scripting to simulate SLOG-like behaviour. I could probably step up and write the shell scripting, and walk anyone interested through the necessary steps.
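
Something along these lines, as a rough sketch only (device name, mountpoint, and sizes are all placeholders, and it's untested):

Code:
# Build a TRIM-enabled UFS filesystem on the SSD
# (-E erases/TRIMs the whole device first, -t enables TRIM on delete)
newfs -E -t /dev/ada1p1
mount -o sync /dev/ada1p1 /mnt/slogsim

# Crude SLOG-alike: small sequential sync writes into rotating "txg"
# files, deleting old ones behind the write head so UFS issues TRIM
# for the freed blocks
i=0
while true; do
    dd if=/dev/zero of=/mnt/slogsim/txg.$i bs=4k count=4096 2>/dev/null
    rm -f /mnt/slogsim/txg.$((i - 4))
    i=$((i + 1))
done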
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It is also hard to sustain a consistent level of saturation/pressure on the device.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
jgreco said:
It is also hard to sustain a consistent level of saturation/pressure on the device.

Hence my harebrained thought of "compression=off, sync=always, dd if=/dev/zero of=/mnt/testvol/bigfile bs=4k count=ReallyHighNumber"

Looking at the disk utilization graphs, it generates roughly the same performance and throughput numbers as the iozone "write" test.
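
Spelled out, that would be something like this (dataset name as in the earlier tests; the count is arbitrary, just big enough to keep it running for a while):

Code:
# compression=off so the zeroes are actually written out;
# sync=always forces every dd write through the ZIL/SLOG
zfs set compression=off testvol
zfs set sync=always testvol
# ~100GB of 4k synchronous writes; count is arbitrary
dd if=/dev/zero of=/mnt/testvol/bigfile bs=4k count=25000000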
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
But what does it do to simulate TRIM?

If you attach the SSD as an SLOG you don't need to simulate TRIM; it's active.

Code:
[root@zfstest] /mnt/testvol# sysctl -a | grep kstat.zfs.misc.zio_trim
kstat.zfs.misc.zio_trim.bytes: 457011159040
kstat.zfs.misc.zio_trim.success: 4737134
kstat.zfs.misc.zio_trim.unsupported: 667
kstat.zfs.misc.zio_trim.failed: 0
[root@zfstest] /mnt/testvol#


I've already TRIMmed ~425GB of this poor abused 80GB drive, so it's seen 5 drive writes since it was put into use a day and a half ago.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So I think someone should write a BSD program for setting the max SATA address so we can add it to FreeNAS. :D

Any takers?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112