Performance using M.2 SLOG through C246 chipset

GeekGoneOld

Dabbler
Joined
Aug 1, 2015
Messages
48
I'm upgrading my FreeNAS systems to new TrueNAS CORE systems.

Existing Machine 1
Platform: Very old AMD processor, 12GB, M1015, Intel PCI 1GbE x2 (one for the network, one as a dedicated link from the VM host to the datastore)
Purpose: Mostly a VM datastore over NFS (4 VMs on a separate machine over 1GbE) and Plex in a jail (1 concurrent user, no 4K). Also an SMB file server.
Config: mirrored SATA SSD boot, 6x3T WD Red Pro as Z2, 2xS3500 120G as mirror slog

Existing Machine 2
Platform: Old Core 2 Duo, 16GB, Intel PCI 1GbE x2
Purpose: Backup of Machine 1 and of one other remote machine.
Config: mirrored SATA SSD boot, 4x3T WD Red Pro as 2 mirror vdevs

This has served me very well for a long time (8 years?) but it is time for higher performance as follows:

1. Move Plex to a VM on the VM machine.
This will increase traffic over the dedicated datastore Ethernet link from the VM host, but it makes more sense since Plex doesn't support FreeBSD as well as other OSes and my VM machine is very powerful.

2. Change machine 1 for higher performance, even at the expense of possible data loss (remember, it is backed up).
I can afford to have downtime if I lose my pool though it would be a nuisance. To wit, I propose changing the pool to be 6x3T configured as 3 mirrored vdevs. Reasonably redundant but not as good as Z2.
Upgrade the datastore link to a 10GbE direct connection (no switch). (Both target pool layouts are sketched just after point 3 below.)

3. Change machine 2 for higher data security at the expense of performance.
Given that I prioritized machine 1 for high performance, I must compensate by making machine 2 prioritize data security. To wit, I propose changing the pool to be 6x3T configured as Z2.
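
For illustration only, a minimal command-line sketch of the two target layouts (pool and device names are placeholders; in practice I would build the pools through the TrueNAS GUI):

# Machine 1: three mirrored vdevs (striped mirrors) -- device names are examples only
zpool create tank mirror da0 da1 mirror da2 da3 mirror da4 da5
# Machine 2: a single 6-disk RAIDZ2 vdev
zpool create backup raidz2 da0 da1 da2 da3 da4 da5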

So I am currently waiting on a new X11SCH-F with an E-2246G processor (yes, overkill) and 64GB for machine 1. I will use an on-board Ethernet port and fill one PCIe slot with an Intel X520 10GbE card. Even though this motherboard has 10 SATA ports, they go through the chipset, which is limited to 4 lanes, so I will fill the other PCIe slot with the M1015 for higher performance.

Now my questions:

1. Machine 1 certainly needs a SLOG for performance. I can use the S3500 mirror, but it wouldn't come close to keeping up with 10GbE. My thought is to use Optane P4801X M.2 drives, but the M.2 slot goes through the chipset. The only other thing on the chipset would be the 1GbE port. Will this (going through the chipset) ruin SLOG performance? What other config might be better?

2. Have I missed anything obvious? Am I way off base?

3. To do this, I will have to recreate my pools (since they are totally different configs). I've been scheduling snapshot replication successfully, but it has been years since I last had to mount a snapshot and recover a file. I don't even know where the snapshots are stored. It must be a hidden folder. There is also a remnant of my years-old restore: the /mnt directory has a subdirectory named after the main machine's pool (I think there are no files in it). So I ask: can someone give me a link to a GOOD backup/restore procedure (I can only find links to bad procedures) and explain why there is a remnant in my /mnt directory (I HATE orphaned stuff)?

I'm all ears and appreciative of any guidance!
Keith
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
6x3T WD Red Pro as Z2, 2xS3500 120G as mirror slog
This is indeed not a good arrangement if the purpose of the SLOG is to deliver IOPS: your Z2 restricts the IOPS throughput, ultimately choking performance after just a few seconds of high IO.

I propose changing the pool to be 6x3T configured as 3 mirrored vdevs. Reasonably redundant but not as good as Z2.
This is a good proposal to improve the IOPS situation above, but, as you say, it changes the risk profile with respect to losing 2 disks simultaneously.

I don't even know where the snapshots are stored.
On the disk where they were taken... they are a map of all the blocks that are being preserved from changes until the snapshot is destroyed/released, they don't contain the actual data (although they are the way to access the data).

You can see them if you use zfs list -t snapshot

If you have it enabled for your pools/datasets (zfs set snapdir=visible pool/dataset if not), you will also see a .zfs directory in the root of each dataset that has snapshots, and you can browse down that tree to get a view of the content.
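
For example, assuming a dataset named tank/data (a placeholder name), the sequence looks like this:

zfs list -t snapshot                # show every snapshot on the system
zfs set snapdir=visible tank/data   # make the .zfs directory visible, if it isn't already
ls /mnt/tank/data/.zfs/snapshot/    # each snapshot appears here as a read-only, browsable folder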

a GOOD backup/restore procedure
0. Think about stopping all jails/VMs and shares before starting (optional, but obviously you risk missing added data if things are still happening while your backup runs)

1. Backup
a. Create a recursive snapshot of the dataset/pool (zfs snapshot -r pool@snapshotName)
b. Send that snapshot to the backup target (zfs send -R pool@snapshotName | zfs recv backupPool/backupTargetDataset ... optionally if the backup pool is in another machine, change the second part to | ssh root@server zfs recv backupPool/backupTargetDataset )
c. Optionally check your backup location for valid content

2. Make your pool changes (careful not to break the pool holding your backup).

3. Restore
a. Send the snapshot back to the new pool (zfs send -R backupPool/backupTargetDataset@snapshotName | zfs recv -F newPool ... optionally if the backup pool is in another machine, change the first part to ssh root@server zfs send -R backupPool/backupTargetDataset@snapshotName |)
b. Optionally check your data

4. Start your Jails/VMs and shares again
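
Putting the whole thing together as one end-to-end sketch (the pool, dataset, snapshot and host names are placeholders to adapt to your setup):

zfs snapshot -r tank@migrate                                                   # 1a. recursive snapshot of the source pool
zfs send -R tank@migrate | ssh root@backupserver zfs recv backup/tank-copy     # 1b. replicate it to the backup machine
# ... step 2: destroy and rebuild the pool with the new layout ...
ssh root@backupserver zfs send -R backup/tank-copy@migrate | zfs recv -F tank  # 3a. restore onto the new pool
zfs destroy -r tank@migrate                                                    # optional cleanup once everything checks out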

1. Machine 1 certainly needs a SLOG for performance. I can use the S3500 mirror, but it wouldn't come close to keeping up with 10GbE. My thought is to use Optane P4801X M.2 drives, but the M.2 slot goes through the chipset. The only other thing on the chipset would be the 1GbE port. Will this (going through the chipset) ruin SLOG performance? What other config might be better?
I wouldn't be worried about that. The bigger concern is the actual performance of the pool, which will likely be exposed under any kind of sustained load (where the SLOG can't absorb the peak for more than 5 seconds and the pool ends up forcing IO to stop as a result).
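
If you want to see where that limit sits before committing, one rough way to test it is a sustained synchronous-write run with fio (the target directory below is a placeholder and must already exist on the pool; a 60-second run is long enough to blow past the roughly 5 seconds of peak the SLOG can absorb):

pkg install fio
fio --name=sustained-sync --directory=/mnt/tank/fio-test \
    --ioengine=posixaio --rw=write --bs=128k --size=4g \
    --sync=1 --iodepth=1 --runtime=60 --time_based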

explain why there is a remnant in my /mnt directory
I suspect you probably have a dataset on that server that is (or once was) mounted there. You can check zfs get mountpoint to see if it's still there.
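
Something like this would show it (the dataset and pool names are whatever you used at the time):

zfs get mountpoint            # lists the mountpoint property for every dataset on the system
zfs list -o name,mountpoint   # a more compact view of the same information
ls /mnt                       # any directory here with no matching dataset is just a leftover folder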
 

GeekGoneOld

Dabbler
Joined
Aug 1, 2015
Messages
48
@sretalla Thanks very much for your reply.

This is indeed not a good arrangement if the purpose of the SLOG is to deliver IOPS: your Z2 restricts the IOPS throughput, ultimately choking performance after just a few seconds of high IO.
This is a good proposal to improve the IOPS situation above, but, as you say, it changes the risk profile with respect to losing 2 disks simultaneously.
Nice to hear you confirm my expectation. Since I am unwilling to compromise on permanent loss of most data BUT am totally willing to compromise on temporary loss of data and on losing a day's work, I felt this was a great solution for me. My data is almost as well protected as it was before, but there is a (very) slightly higher probability of losing today's data (if a particular 2-disk failure hits machine 1). I'm very OK with that, especially if my performance increase is noticeable (which I find very likely).

[snapshots are kept] On the disk where they were taken... they are a map of all the blocks that are being preserved from changes until the snapshot is destroyed/released, they don't contain the actual data (although they are the way to access the data).
I phrased the question very poorly. I'm aware of the principle of snapshots, but I actually meant where they live on the backup machine. Either way, your detailed answer covered my question perfectly.

1. Backup
[...]
3. Restore
Very straightforward. Many thanks for this critical part. Must I clone the snapshot to put the snapshot in a dataset on the new pool?

I wouldn't be worried about that. [m.2 handled by chipset not by processor]
Yeah, I kinda thought that, but some things in an architecture can seem minor and bite you when you least expect it. Because I'm not pushing much else through the chipset, I thought it might not be noticeable. I will try it.

I suspect you probably have a dataset on that server that is (or once was) mounted there. You can check zfs get mountpoint to see if it's still there.
Yes, a mountpoint does show up. This backup machine only has replicated snapshots from machine 1. I did a clone to a new dataset a few years ago (to restore a file), then I destroyed the dataset. Does FNC not destroy the mountpoint when you destroy the dataset? Is there any harm in the mountpoint being there? How do I destroy it if I want to? Should I (is it safe)?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
Must I clone the snapshot to put the snapshot in a dataset on the new pool?
No, that's not necessary. The send | recv command is doing that work (although officially it's not a clone; it's an identical block-level copy).

That reminds me: at the end of the whole thing, you can delete the snapshot you used to move the data:

zfs destroy -r pool@snapshotName
and
zfs destroy -r newPool@snapshotName

Does FNC not destroy the mountpoint when you destroy the dataset?
In this case, seems not.

Is there any harm in the mountpoint being there?
No

How do I destroy it if I want to?
Check that it's empty first, then:
rm -r /mnt/whatever

Should I (is it safe)?
If ZFS needs a place to mount, it should normally be able to create it when the time comes, so in that sense, no risk. Having an empty directory sitting there isn't too harmful either.

You can always mkdir /mnt/whatever at any time if you want it back.
 

GeekGoneOld

Dabbler
Joined
Aug 1, 2015
Messages
48
@sretalla Many thanks. I am now confident I have what I need, parts and procedures!
 