Hardware RAID controller as a SLOG device?


Vidis

Dabbler
Joined
Jan 25, 2017
Messages
21
Hi again!

I’m back on track with implementing FreeNAS as my primary storage for my lab environment.
The previous thread, where we discussed my setup and the possible solutions I wanted to implement, is here: https://forums.freenas.org/index.ph...c-h710p-mini-raid-settings.49931/#post-344102
For my FreeNAS server I will have a Dell PowerEdge R720 with a backplane that supports 16 x 2.5” disks.
The main workload that will be running against this NAS is my two ESXi servers. Hence the plan is to use either NFS or iSCSI to connect to the datastores.
I have bought a new “LSI SAS 9207-8i SATA/SAS 6Gb/s PCI-E Host Bus Adapter” that I will connect to the backplane.
The plan is to run with 14 x 600 GB SAS 10,000 RPM drives that I will configure as mirrored vdevs.
Since most of the writes will need to be sync=always, I will need a SLOG to keep the ZIL off the data disks.
This is my lab environment, so I don’t want to spend too much money right from the start. Hence, I would like to try out the following configuration.
Since my server already has a battery-backed PERC H710 RAID controller (512 MB cache) in it, I would like to use 2 x 146 GB SAS 15,000 RPM drives in a RAID 1 as the SLOG device.
I know that this is not a popular thing to do but I would at least want to try this out and see if this gives me a performance boost or not.
Worst case is that it doesn’t give me a performance boost and I will later need to buy a proper SSD to use as the SLOG device.
But I wanted to check in with you guys: do you think I should even try this out, or just skip the SLOG device and keep the ZIL in its default location on the pool?
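(For rough context on what that layout yields, here is a quick back-of-the-envelope sketch; the 50% figure is only the commonly cited guideline for keeping iSCSI/block storage on ZFS from fragmenting, not a hard limit.)

```python
# Rough capacity math for 14 x 600 GB drives laid out as two-way mirrors.
DRIVES = 14
DRIVE_GB = 600

vdevs = DRIVES // 2                  # 7 mirror vdevs
usable_tb = vdevs * DRIVE_GB / 1000  # space before any headroom
block_tb = usable_tb * 0.5           # rule of thumb: keep iSCSI zvols under ~50% full

print(f"{vdevs} mirror vdevs, ~{usable_tb:.1f} TB usable, "
      f"~{block_tb:.1f} TB comfortably usable for iSCSI datastores")
```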
 

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
Using any kind of hardware RAID device prevents the drives from being properly managed by FreeNAS and hides important health information (SMART) from the OS. You are advised not to use any hardware RAID controllers.
 

Vidis

Dabbler
Joined
Jan 25, 2017
Messages
21
I do understand that, but since this will only be used as a SLOG device, and OpenManage will alert me if there is anything wrong with the disks, should this not be a viable "cheap" option for me to use? I want the caching feature of the H710 combined with the speed of the RAID 1 disks attached to it, and in my mind that should be an OK solution. If this is still a no, and I don't want to buy an enterprise SSD to use as a SLOG, would option 1 or 2 be best?

1. Run my 2 x 146 GB SAS 15,000 RPM disks as a mirrored vdev on my LSI HBA and use that as the SLOG (no caching functionality)?

2. Skip the SLOG and hope that the performance will be enough for my lab environment?

Keep in mind that I'm currently running on a REALLY old PowerEdge 2950 with 6 x 300 GB 7,200 RPM disks in a RAID 5 (shared out as iSCSI through Windows 2012 R2). So I basically just need to beat that performance to be happy as a clam ;):D
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
If you try this, please report back. I'd guess performance improvements with a spinning-disk SLOG would be marginal if not actually negative.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
IIRC, @jgreco did something in this vein. He used a damned good RAID card with a large write cache and battery backup - the spinning rust was just there so that the RAID controller would play along.

This depends on cooperation by the RAID controller, though. The moment the SLOG has to wait for disk, performance is going to tank. I doubt it'd be worse than having no SLOG at all (unless the pool is really fast or the SLOG drives are really slow), but it'll be painful.
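(To make the "cache fills up" failure mode concrete, here is a crude illustrative model; the cache size, drain rate, and latency figures are assumptions for the sake of the sketch, not measured PERC H710 numbers.)

```python
# Rough model of a battery-backed write cache in front of slow disks acting as a SLOG.
CACHE_BYTES = 512 * 1024**2      # assumed controller write cache
CACHE_ACK_US = 50                # assumed latency to ack into cache (microseconds)
DISK_ACK_US = 4000               # assumed latency once writes must wait on disk
DRAIN_MBPS = 150                 # assumed rate the cache can flush to the RAID 1 pair

def slog_latency_us(ingest_mbps: float, seconds: float) -> float:
    """Approximate ack latency during a sustained burst of sync writes."""
    backlog = (ingest_mbps - DRAIN_MBPS) * seconds * 1024**2  # bytes queued in cache
    if backlog <= CACHE_BYTES:
        return CACHE_ACK_US       # cache absorbs the burst: RAM-like latency
    return DISK_ACK_US            # cache is full: every ack now waits on spinning rust

for rate in (50, 150, 400):
    print(f"{rate} MB/s sustained for 30 s -> ~{slog_latency_us(rate, 30):.0f} µs per sync write")
```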
 

Vidis

Dabbler
Joined
Jan 25, 2017
Messages
21
Ok. Point taken. Let me run this solution by you then.
What if I buy two simple consumer-grade SSDs and connect them to the H710 RAID controller? Something like a KingFast 32 GB 2.5" SATA 3.0 SSD.
Then I would have the cache + battery backup from the RAID controller in case of power loss, and also SSD speed for the SLOG device.
Or will the sheer volume of writes trash these types of SSDs right away, even if it's just a small lab environment?
 

Vidis

Dabbler
Joined
Jan 25, 2017
Messages
21
Also, how big an SSD would be needed for a SLOG in this setup?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
What if I buy two simple consumer-grade SSDs
Same problem, just slightly attenuated. SLOG devices need to keep latency to a minimum.

Also, how big an SSD would be needed for a SLOG in this setup?
Small. Capacity isn't the driving force behind the choice; throughput, latency and consistency are.
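(To put a rough number on "small": the SLOG only has to hold the last few transaction groups' worth of sync writes before they are committed to the pool from RAM, so a back-of-the-envelope bound, assuming roughly 5-second transaction groups and a network-limited ingest rate, looks like this.)

```python
# Rough SLOG sizing: the log only needs to hold a few transaction groups'
# worth of sync writes; assumptions below are illustrative, not exact defaults.
TXG_SECONDS = 5          # assumed transaction group interval
TXGS_IN_FLIGHT = 3       # generous safety factor

def slog_size_gib(link_gbps: float) -> float:
    ingest_bytes_per_s = link_gbps * 1e9 / 8      # network-limited worst case
    return ingest_bytes_per_s * TXG_SECONDS * TXGS_IN_FLIGHT / 1024**3

for link in (1, 10):
    print(f"{link} GbE worst case -> ~{slog_size_gib(link):.1f} GiB of SLOG actually used")
# Roughly 1.7 GiB at 1 GbE and 17.5 GiB at 10 GbE; anything bigger is wasted on a SLOG.
```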
 

Vidis

Dabbler
Joined
Jan 25, 2017
Messages
21
Is it better, then, to skip the SLOG to start with and see what performance I get out of the system?
And if I then feel that the performance is too slow, invest in a proper SLOG device?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Well, this is a lab. Do you really care about a few seconds of data loss if the server crashes or loses power?

You could just disable sync writes.
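(For reference, sync behavior is a per-dataset/zvol ZFS property. Below is a minimal sketch of flipping it from a script, using a hypothetical zvol name tank/vmstore; on FreeNAS you would normally just change it in the GUI or from a root shell.)

```python
import subprocess

# Hypothetical pool/zvol name used for illustration only.
ZVOL = "tank/vmstore"

def set_sync(mode: str) -> None:
    """mode is one of the real values of the ZFS 'sync' property: standard, always, disabled."""
    assert mode in ("standard", "always", "disabled")
    subprocess.run(["zfs", "set", f"sync={mode}", ZVOL], check=True)
    subprocess.run(["zfs", "get", "sync", ZVOL], check=True)  # confirm the change

set_sync("disabled")   # trade a few seconds of crash-loss risk for much faster writes
```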
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
Is it better, then, to skip the SLOG to start with and see what performance I get out of the system?
And if I then feel that the performance is too slow, invest in a proper SLOG device?
Yes. If performance is adequate without a SLOG, then the problem is solved without extra hardware. If not, a SLOG is easy to add.
 

Vidis

Dabbler
Joined
Jan 25, 2017
Messages
21
Well, this is a lab. Do you really care about a few seconds of data loss if the server crashes or loses power?

You could just disable sync writes.

I'm not really worried about losing data, but what will happen to my running VMs if the FreeNAS server goes down in the middle of a write? Will the whole VM then crash and not be usable anymore? If there is a high risk that my VMs will be corrupted, then I would like to limit this risk as much as possible. If there is only a minimal risk that my VMs will get corrupted in this scenario, then I'm fine with this.
I really don't want to have to rebuild my VMs from backup time and time again, if you catch my drift :)
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
But what will happen to my running VMs if the FreeNAS server goes down in the middle of a write? Will the whole VM then crash and not be usable anymore?
Depends on what the VM is doing. Same as on physical hardware with aggressive caching. It's probably at least somewhat likely. What's much less likely is the FreeNAS server going down unexpectedly.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
If your storage loses writes, you want your VMs to reset. Otherwise, rebuild or revert to a prior ZFS snapshot.

The battery RAID idea sounds interesting, but you still want to test what happens with a storage-only crash.
 

Vidis

Dabbler
Joined
Jan 25, 2017
Messages
21
If your storage loses writes, you want your VMs to reset. Otherwise, rebuild or revert to a prior ZFS snapshot.

The battery RAID idea sounds interesting, but you still want to test what happens with a storage-only crash.

Thank you! I also think it's an interesting approach but this idea is not well liked :D
I haven't decided yet if I will go down this path or not.
I will keep you posted with my findings.

What would be a good test for this? I know, based on reading other forum posts, that CrystalDiskMark is not a popular way of benchmarking things :confused:
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Performance testing is up to you; but testing a crash of the storage is also very important before you rely on it.
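(One tool-agnostic way to probe what the VMs will actually feel is to time small O_SYNC writes against a file on the datastore. This is only a rough latency probe with a placeholder path; running fio against the pool, or from the ESXi side over NFS/iSCSI, is a more realistic test.)

```python
import os, time

# Placeholder path: a file on the NFS/iSCSI-backed datastore under test.
TEST_FILE = "/mnt/datastore/synctest.bin"
BLOCK = os.urandom(4096)          # 4 KiB blocks, similar to many VM sync writes
COUNT = 1000

fd = os.open(TEST_FILE, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
latencies = []
try:
    for _ in range(COUNT):
        start = time.perf_counter()
        os.write(fd, BLOCK)        # O_SYNC: each write must be stable before returning
        latencies.append(time.perf_counter() - start)
finally:
    os.close(fd)

latencies.sort()
print(f"median: {latencies[COUNT // 2] * 1e6:.0f} µs, "
      f"p99: {latencies[int(COUNT * 0.99)] * 1e6:.0f} µs")
```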
 

Vidis

Dabbler
Joined
Jan 25, 2017
Messages
21
Same problem, just slightly attenuated. SLOG devices need to keep latency to a minimum.

Can you please elaborate a bit more on this statement?
I can't really see how this can be a "bad" combo. I have the cache from the RAID controller that takes the primary hit and then writes to my SSD on the back end. Even if this is a slower consumer-grade SSD (SATA 3), it will still be waaaaay faster than spinning disks.
What would the complete end-to-end process for a write with a SLOG be?
As I see it, it should be something like the following.

1. The ESXi servers connect to the FreeNAS datastore over iSCSI.
2. The write hits the H710 write cache, which sends an ACK. (ESXi is happy with this and continues with its other workloads.)
3. The cache writes the data down to the SSD.
4. The SSD hands the data over to the primary datastore (mirrored HDD vdev) for safekeeping.

Is that process basically correct?
If so, then I can't really see how even a consumer-grade SSD can be "bad" in this case.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
it will still be waaaaay faster than spinning disks
Not as fast as you'd expect, with this kind of workload. Consumer SSDs, even higher-end models, almost always behave poorly in scenarios with constant streams of writes.
Is that process basically correct?
Yeah, if you can somehow guarantee that the acknowledgement is always sent in step 2. That's an exercise for the reader.
 

Vidis

Dabbler
Joined
Jan 25, 2017
Messages
21
I think I will start out with my mirrored vdevs, put the iSCSI datastore for my VMs in the sync=disabled state, and hope that the power doesn't go out too often. It has happened a few times in the past, though.
I just need to verify that my backups are working properly :cool:
I'm currently backing up my VMs through Veeam. Since there are snapshot capabilities in FreeNAS now, would you continue to back up the VMs through Veeam via the vCenter server, or would you do this in a different way?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Using any kind of hardware RAID device prevents the drives from being properly managed by FreeNAS and hides important health information (SMART) from the OS. You are advised not to use any hardware RAID controllers.

This is definitely a concern for pool data drives.

However, for a SLOG device, loss of the SLOG drive just results in ZFS falling back to the in-pool ZIL. As long as the system survives the loss of the RAID-based SLOG without doing something stupid like panicking, all you'll really see is the sudden slam-into-the-wall of the pool losing write performance. Because this is an unusual configuration, this could be a serious caveat.
 