Configuration Advice for PMEM

jhartbarger

Dabbler
Joined
Apr 3, 2023
Messages
13
Looking for some advice on how to configure the PMem100 modules along with my NVMe drives for the best performance and reliability. From all my reading about ZFS, pools, vdevs, SLOGs, etc., I have come to the understanding that nothing should be faster, or perform better at low queue depths, for special vdev use than the PMem modules, as they appear to trounce even the P5800X. I currently have twelve 256GB modules in non-interleaved mode (when interleaved I get two 1TB drives).

Any insight or advice is appreciated and thanks for taking the time to respond if you do.

(attached screenshot: truenas.png)
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Those are some extremely fast modules, and you're sporting twelve of them, in addition to the quartet of SN640's. There's a solid amount of incredibly fast storage on tap there.

For PMEM configuration, I'd suggest leaving them in non-interleaved (independent) mode; that way you can split them up into different redundancy groups if desired, rather than just having two logical 1T devices.
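If you ever need to (re)provision them from the shell rather than through the BIOS/iDRAC, it's roughly along these lines. Treat this as a sketch: region names will differ per system, and the goal step needs a reboot to take effect.

ipmctl create -goal PersistentMemoryType=AppDirectNotInterleaved   # set a non-interleaved App Direct goal, then reboot
ndctl create-namespace --mode=fsdax --region=region0               # repeat per region to expose /dev/pmemX devices
ndctl list -N                                                      # confirm the namespaces exist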

What's the intended workload for this storage server, generally speaking? While Optane PMEM DIMMs will handle just about everything you can throw at them, if your workload doesn't demand the IO intensive things like deduplication or sync writes, they might just become a pool of Even More Ridiculously Fast Storage.
 

jhartbarger

Dabbler
Joined
Apr 3, 2023
Messages
13
Those are some extremely fast modules, and you're sporting twelve of them, in addition to the quartet of SN640's. There's a solid amount of incredibly fast storage on tap there.
Yes, 12 of them in this machine along with the 4 SN640's. In addition to those 12 I have 6 more 256GB modules, 20 128GB modules, and 8 of the 512GB ones.

For PMEM configuration, I'd suggest leaving them in non-interleaved (independent) mode; that way you can split them up into different redundancy groups if desired, rather than just having two logical 1T devices.
That's good news, and it's where my thought process had taken me based on all my reading.

What's the intended workload for this storage server, generally speaking? While Optane PMEM DIMMs will handle just about everything you can throw at them, if your workload doesn't demand the IO intensive things like deduplication or sync writes, they might just become a pool of Even More Ridiculously Fast Storage.
Well, in all honesty this R740 was built to replace my dinosaur R710, and while building it I wanted to play with the PMem modules. Considering the price I got them at (less than $1000 for everything listed above), all I had to do was dig up some CPUs that had PMem support.

With that said, it will be used as my new local NAS, host my containers along with a few local VMs, and maybe serve a few LUNs via iSCSI to my ESXi cluster. I have considered configuring deduplication to see how it performs. The system has 768GB of RAM and dual Xeon Gold 5218T 16C/32T CPUs as well, so I am confident it can handle it.
 

jhartbarger

Dabbler
Joined
Apr 3, 2023
Messages
13
The thing I cannot figure out is why only the first 4 PMem modules will show their serial numbers and the last 8 don't.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Well, in all honesty this R740 was built to replace my dinosaur R710, and while building it I wanted to play with the PMem modules. Considering the price I got them at (less than $1000 for everything listed above), all I had to do was dig up some CPUs that had PMem support.

With that said, it will be used as my new local NAS, host my containers along with a few local VMs, and maybe serve a few LUNs via iSCSI to my ESXi cluster. I have considered configuring deduplication to see how it performs. The system has 768GB of RAM and dual Xeon Gold 5218T 16C/32T CPUs as well, so I am confident it can handle it.
There's no kill like overkill. For your iSCSI LUNs, setting sync=always on those and attaching a pair of the PMEM devices as an SLOG will likely help the latency, as they should be speedier in small-block writes than the SN640s.
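Roughly, from the shell it's something like this (pool and zvol names here are just placeholders, and the UI can do the same thing):

zfs set sync=always tank/iscsi-zvol                 # force synchronous writes on the iSCSI extent's zvol
zpool add tank log mirror /dev/pmem0 /dev/pmem1     # mirrored SLOG on two of the PMEM devices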

For deduplication, the PMEMs will also do spectacularly as dedicated dedup devices - just bear in mind that if you use a RAIDZ for your main data vdev, you'll be unable to remove dedup vdevs after you add them. You can grow the vdev by replacing its devices with larger ones (eg: swapping a 256G DIMM for a 512G one), but you can never totally remove the vdev without wiping the data pool and starting over.
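For reference, adding one looks roughly like this (again, names are placeholders), and the removal step only works while the pool has no RAIDZ vdevs at all:

zpool add tank dedup mirror /dev/pmem2 /dev/pmem3   # mirrored dedup vdev
zpool status tank                                   # note the vdev's name, e.g. mirror-2
zpool remove tank mirror-2                          # only possible on an all-mirror/stripe pool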

The thing I cannot figure out is why only the first 4 PMem modules will show their serial numbers and the last 8 don't.
Do they show up in the shell if you run a regular smartctl -a /dev/pmem4 against them?
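If smartctl comes up empty, ndctl or ipmctl (assuming they're present on your install) can usually read the per-DIMM details instead:

ndctl list -D -i        # enumerate the NVDIMMs, including idle ones
ipmctl show -dimm       # capacity, health, and firmware per DIMM
ipmctl show -a -dimm    # all attributes, which should include the serial numbers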
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Whatever you do will be very interesting to follow. But these modules can only be used with 1st/2nd gen. Xeon Scalable, so if you do use them for data storage you'll never be able to move your pool to another class of hardware.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
I'm just drooling at the hardware
 

jhartbarger

Dabbler
Joined
Apr 3, 2023
Messages
13
There's no kill like overkill. For your iSCSI LUNs, setting sync=always on those and attaching a pair of the PMEM devices as an SLOG will likely help the latency, as they should be speedier in small-block writes than the SN640s.
I had the thought that that was the optimal solution; thanks for the confirmation.

For deduplication, the PMEMs will also do spectacularly as dedicated dedup devices - just bear in mind that if you use a RAIDZ for your main data vdev, you'll be unable to remove dedup vdevs after you add them. You can grow the vdev by replacing its devices with larger ones (eg: swapping a 256G DIMM for a 512G one), but you can never totally remove the vdev without wiping the data pool and starting over.

Well, my next mission was to determine whether dRAID might be a better option here. Thoughts?

Do they show up in the shell if you run a regular smartctl -a /dev/pmem4 against them?

There does not appear to be a valid device type for smartctl for pmem
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Well, my next mission was to determine whether dRAID might be a better option here. Thoughts?
You need to list your drives and explain your storage requirements.
So far I do not see the 20+ drives which could make dRAID a possible option.
 

jhartbarger

Dabbler
Joined
Apr 3, 2023
Messages
13
Whatever you do will be very interesting to follow. But these modules can only be used with 1st/2nd gen. Xeon Scalable, so if you do use them for data storage you'll never be able to move your pool to another class of hardware.

Thanks, I also thought about that (actually 1st/2nd/3rd gen Xeon), but this system is so overbuilt that the chances of moving it for a very, very long time are slim. I should also point out that I will be backing up/replicating to another TrueNAS installation.
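The plan is recurring recursive snapshots pushed to the second box, which under the hood boils down to roughly this (dataset and host names made up; in practice I'll just use the built-in replication tasks):

zfs snapshot -r tank@auto-2023-04-03
zfs send -R tank@auto-2023-04-03 | ssh backup-nas zfs recv -F backup/tank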
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
3rd gen. can only use PMem200, not PMem100. (No wonder the technology was a hard sell…)
I agree that the system should serve you for quite some time. The problematic scenario would be a motherboard failure, where you might need to move the (hopefully healthy) pool to a new system.
 

jhartbarger

Dabbler
Joined
Apr 3, 2023
Messages
13
3rd gen. can only use PMem200, not PMem100. (No wonder the technology was a hard sell…)
I agree that the system should serve you for quite some time. The problematic scenario would be a motherboard failure, where you might need to move the (hopefully healthy) pool to a new system.

There is actually a PMem300 as well, and I agree on the hardware failure scenario, but like I said above it will be replicated/backed up. It's a Dell R740, and I already have 2 spare barebone R740 motherboards and spare PMem100 modules.
 

jhartbarger

Dabbler
Joined
Apr 3, 2023
Messages
13
You need to list your drives and explain your storage requirements.
So far I do not see the 20+ drives which could make dRAID a possible option.

For this scenario dRAID wouldn't be an option, but let's say, for the sake of discussion, we remove the 4 SN640's and replace them with 10-20 1.92TB SATA SSDs; something like the layout sketched below is what I had in mind.
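Purely hypothetical, with 20 disks: double parity, 7 data disks per redundancy group, and 2 distributed spares (the sdX names are placeholders; a real pool would use /dev/disk/by-id paths):

zpool create bigtank draid2:7d:20c:2s /dev/sd{a..t}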
 