SLOG: M.2 or HHHL?

dror

Dabbler
Joined
Feb 18, 2019
Messages
43
Hey friends,
I am going to purchase a device for SLOG use and I am considering the following two options:
SSDPED1K375GA01 INTEL Optane SSD DC P4800X 375GB 1/2 Height PCIe x4 3D XPoint Single Pack
OR
SSDPEL1K200GA01 INTEL Optane SSD DC P4801X 200GB M.2 110MM PCIe x4 3D XPoint Single Pack

If I buy the M.2 version I will also need to purchase an adapter, the Supermicro AOC-SLG3-2M2-O.
For latency and write IOPS the Intel Optane SSD DC P4800X 375GB is the better choice, but with the second option (M.2) I can expand in the future and add another M.2 drive. What would you do? Are the differences critical for SLOG, and will performance suffer if I choose the adapter with M.2?
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
Have you thought carefully about your workload, to the point where you're certain QD1-8 write latency matters that much? Would a larger SAS or NVMe SSD with power loss protection be sufficient at the same or a lower price?

If future expansion is a big concern, I'd go with the M.2 option. Heck, there are 4x4 NVMe M.2 riser adapters you could potentially look at, so instead of being able to expand by just one more device, you could expand by three more natively, plus one extra if you have a 2280 slot on the board itself.
 

dror

Dabbler
Joined
Feb 18, 2019
Messages
43
Have you thought carefully about your workload, to the point where you're certain QD1-8 write latency matters that much? Would a larger SAS or NVMe SSD with power loss protection be sufficient at the same or a lower price?

If future expansion is a big concern, I'd go with the M.2 option. Heck, there are 4x4 NVMe M.2 riser adapters you could potentially look at, so instead of being able to expand by just one more device, you could expand by three more natively, plus one extra if you have a 2280 slot on the board itself.

I am running iSCSI to Xen with VMs, databases, etc.
From what I have read and understood, the most important things for a SLOG are these three:
1. Response time / latency
2. IOPS
3. Power loss protection

The question is whether the latency difference between the disks is critical for SLOG use.
The SSDPED1K375GA01 (P4800X) has 550,000 IOPS read and 500,000 IOPS write, at 10µs.
The SSDPEL1K200GA01 (P4801X) has 550,000 IOPS read and 400,000 IOPS write, at 10µs for read and 12µs for write.
What do you think? Which should I choose?
I suspect the Supermicro AOC-SLG3-2M2-O increases response time compared to connecting directly as with the HHHL card. Do you think that is true?
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
I am running iSCSI to Xen with VMs, databases, etc.
From what I have read and understood, the most important things for a SLOG are these three:
1. Response time / latency
2. IOPS
3. Power loss protection

The question is whether the latency difference between the disks is critical for SLOG use.
The SSDPED1K375GA01 (P4800X) has 550,000 IOPS read and 500,000 IOPS write, at 10µs.
The SSDPEL1K200GA01 (P4801X) has 550,000 IOPS read and 400,000 IOPS write, at 10µs for read and 12µs for write.
What do you think? Which should I choose?
I suspect the Supermicro AOC-SLG3-2M2-O increases response time compared to connecting directly as with the HHHL card. Do you think that is true?
I imagine it will increase response times by a microsecond or 2, but I doubt it's dramatic.

If that's your workload, more power to you.
 

dror

Dabbler
Joined
Feb 18, 2019
Messages
43
I imagine it will increase response times by a microsecond or 2, but I doubt it's dramatic.

If that's your workload, more power to you.
So which option do you think is the right one for me?
I'm really torn, and afraid of buying the wrong thing, because these are very expensive disks.
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
So which option do you think is the right one for me?
I'm really torn, and afraid of buying the wrong thing, because these are very expensive disks.
If you think the disks are too expensive for what they are, NVMe disks in general already have great IOPS. Not quite the ultra low QD1-8 latency that Optane enjoys, but still very low compared to SAS/SATA. Maybe look at some 256GB M.2 drives with power loss protection?

SLOG devices traditionally don't have to be very large, and Optane gen 2 is going to be hitting shelves sometime in the next 3 years with larger capacities at lower prices. If it were me and my money, I'd go with a Crucial M.2 and see how it goes, because you can always use it as a boot drive for a later machine if it doesn't perform well enough as SLOG, and it's not going to be much money lost.
 

dror

Dabbler
Joined
Feb 18, 2019
Messages
43
If you think the disks are too expensive for what they are, NVMe disks in general already have great IOPS. Not quite the ultra low QD1-8 latency that Optane enjoys, but still very low compared to SAS/SATA. Maybe look at some 256GB M.2 drives with power loss protection?

SLOG devices traditionally don't have to be very large, and Optane gen 2 is going to be hitting shelves sometime in the next 3 years with larger capacities at lower prices. If it were me and my money, I'd go with a Crucial M.2 and see how it goes, because you can always use it as a boot drive for a later machine if it doesn't perform well enough as SLOG, and it's not going to be much money lost.

Thanks for your response!
I do have the budget for the disks I mentioned; my concern is choosing the right disk in all respects, especially with the future in view.
Both disks have power loss protection.
So I think I'll go with the M.2 and the adapter. What do you think?
Thanks.
 

amp88

Explorer
Joined
May 23, 2019
Messages
56
I am going to purchase a device for SLOG use and I am considering the following 2 options:
SSDPED1K375GA01 INTEL Optane SSD DC P4800X 375GB 1/2 Height PCIe x4 3D XPoint Single Pack
OR
SSDPEL1K200GA01 INTEL Optane SSD DC P4801X 200GB M.2 110MM PCIe x4 3D Xpoint Single Pack
Either of those options should work well for SLOG use. There's a thread with various SLOG device benchmarks, and both of those options appear multiple times in that thread with very good results. I would personally go for the M.2 option, as it provides more expandability options in the future. However, something you seem not to have considered is that with your workload ("I am running iSCSI to Xen with VMs and databases etc."), a failure of your SLOG device at the wrong moment would almost certainly cause some inconsistency in the pool data, by losing the data in the SLOG. I believe you'd be losing up to around 5 seconds' worth of writes, so you have to weigh what that would do to your data. Failure rates of Optane devices are low, but when you're working with VMs and database applications, losing even a few megabytes could result in significant inconsistency.
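If that failure window worries you, the usual mitigation is to mirror the SLOG. A minimal sketch of what that looks like on the command line, assuming a pool named tank and two hypothetical NVMe device nodes (the pool and device names are placeholders; yours will differ):

```shell
# Attach both Optane devices as a mirrored log vdev.
# "tank", /dev/nvd0 and /dev/nvd1 are placeholder names.
zpool add tank log mirror /dev/nvd0 /dev/nvd1

# Verify the log vdev shows up as a mirror and is ONLINE.
zpool status tank
```

With a mirrored log vdev, a single SLOG failure just degrades the mirror instead of exposing the pool to the crash-window data loss described above.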

If it were me and my money, I'd go with a Crucial M.2 and see how it goes, because you can always use it as a boot drive for a later machine if it doesn't perform well enough as SLOG, and it's not going to be much money lost.
You shouldn't recommend devices without PLP (Power Loss Protection) for SLOG use, as they're inherently unsafe in the case of a sudden power loss or system crash/panic.
 

dror

Dabbler
Joined
Feb 18, 2019
Messages
43
Either of those options should work well for SLOG use. There's a thread with various SLOG device benchmarks, and each of those two options appear multiple times in the thread with very good benchmark results. I would personally go for the M.2 option, as it provides more expandability options in the future. However, something you seem not to have considered is that with your workload ("I am running ISCSI to XEN with VMs and databases and etc"), if you have a failure of your SLOG device then you're almost certainly going to have some problems/inconsistency of the pool data, by losing the data in the SLOG. I believe you'd be losing up to around 5 seconds worth of writes, so you have to weigh the impact of what that would do to your data. Failures of Optane devices are low, but when you're working with VMs and database applications, losing even a few megabytes could result in significant inconsistency.


You shouldn't recommend devices without PLP (Power Loss Protection) for SLOG use, as they're inherently unsafe in the case of a sudden power loss or system crash/panic.

Hey,
Thanks for your response.
Yes, I thought about mirroring the SLOG, but I read that this is no longer necessary and was fixed in FreeNAS: when the disk fails, the writes fall back to going through RAM. Is that right, or do I still need to mirror them?
For virtualization I understand the block sizes most often used are 4K, 8K and 64K. I looked at the SLOG benchmark thread and saw that the SSDPED1K375GA01 is better. What do you think?
I can't find benchmark results for the Intel Optane P4801X 375GB; it should be faster. Have you seen such a benchmark?

Sorry for my bad English.
 

amp88

Explorer
Joined
May 23, 2019
Messages
56
Yes, I thought about mirroring the SLOG, but I read that this is no longer necessary and was fixed in FreeNAS: when the disk fails, the writes fall back to going through RAM. Is that right, or do I still need to mirror them?
I'm willing to be corrected, but my understanding is that if your SLOG device fails, FreeNAS will revert to writing its ZIL on your pool (as it would do if you'd never assigned a SLOG). However, data which had been written to the SLOG just before it failed would not be written to the pool, which could cause inconsistency between your pool and what your VM clients would expect. If it wrote the ZIL to RAM in the event of a SLOG failure that would be tremendously unsafe.

For virtualization I understand the block sizes most often used are 4K, 8K and 64K. I looked at the SLOG benchmark thread and saw that the SSDPED1K375GA01 is better. What do you think?
I can't find benchmark results for the Intel Optane P4801X 375GB; it should be faster. Have you seen such a benchmark?
You can search within the thread I linked above. Go to the thread, click on the "Search" link at the top of the forum (next to the links for your user profile, Inbox, and Alerts) then type in the part name you want to search for (e.g. "P4801X") and when you hit Search you should find what you're looking for.
 

Attachments

  • search_in_thread.png (12.9 KB)

dror

Dabbler
Joined
Feb 18, 2019
Messages
43
I'm willing to be corrected, but my understanding is that if your SLOG device fails, FreeNAS will revert to writing its ZIL on your pool (as it would do if you'd never assigned a SLOG). However, data which had been written to the SLOG just before it failed would not be written to the pool, which could cause inconsistency between your pool and what your VM clients would expect. If it wrote the ZIL to RAM in the event of a SLOG failure that would be tremendously unsafe.

You're right, I need to buy one more M.2 stick to be safe.
That is going to be very expensive. Do you have a good, less expensive alternative?
It must be an M.2 stick.

You can search within the thread I linked above. Go to the thread, click on the "Search" link at the top of the forum (next to the links for your user profile, Inbox, and Alerts) then type in the part name you want to search for (e.g. "P4801X") and when you hit Search you should find what you're looking for.

I searched and found no results, neither in the thread nor on Google. Do you know where I can find them?

Another question: I also want to upgrade my disks. My FreeNAS system:
System - Supermicro SYS-1028U-E1CR4+
Controller - AOC-S3008L-L8e (IT)
Disks - 6 x Samsung SM863A, 1.9TB, SATA 6Gb/s, V-NAND V48, 2.5"

Because my system supports up to 10 SAS SSD 12Gbps disks, I want to upgrade them to the Samsung PM1643 SAS enterprise SSD (1.92TB, internal 2.5", SAS 12Gb/s, TLC).
What do you think? Are there perhaps better alternatives?

Thanks!
 

amp88

Explorer
Joined
May 23, 2019
Messages
56
You're right, I need to buy one more M.2 stick to be safe.
That is going to be very expensive. Do you have a good, less expensive alternative?
It must be an M.2 stick.



I searched and found no results, neither in the thread nor on Google. Do you know where I can find them?

Another question: I also want to upgrade my disks. My FreeNAS system:
System - Supermicro SYS-1028U-E1CR4+
Controller - AOC-S3008L-L8e (IT)
Disks - 6 x Samsung SM863A, 1.9TB, SATA 6Gb/s, V-NAND V48, 2.5"

Because my system supports up to 10 SAS SSD 12Gbps disks, I want to upgrade them to the Samsung PM1643 SAS enterprise SSD (1.92TB, internal 2.5", SAS 12Gb/s, TLC).
What do you think? Are there perhaps better alternatives?

Thanks!
Here's a benchmark of the 200GB P4801X (which, according to Intel's specs, provides about 70% of the peak write performance of the 375GB version). At lower write sizes (e.g. 8KB and below) it's massively faster than 'traditional' NAND flash M.2 NVMe drives (about 10-15x, an order of magnitude). The Optane drives are very expensive, but for the moment it doesn't seem there's any non-exotic solution (e.g. battery-backed DRAM) which can compete with them; if you need that level of performance, the Optane drives are your only option. You could go with 'traditional' NAND flash devices instead to save money, but at the cost of performance. You'll need to decide if that's worth it for your application. Serve The Home did some testing in 2017 you may want to take a look at. It's a bit out of date now (missing a few new releases), but still contains pertinent information.

In terms of the storage SSDs, those Samsung PM1643 drives will probably work well for you, and your controller will be capable of utilising the increased 12Gbps the drives provide compared to the SM863A at 6Gbps. I don't know if there are any better options out there though.
 

dror

Dabbler
Joined
Feb 18, 2019
Messages
43
Here's a benchmark of the 200GB P4801X (which, according to Intel's specs provides about 70% of the peak write performance of the 375GB version). At lower write sizes (e.g. 8KB and below) it's massively faster than 'traditional' NAND flash M.2 NVME drives (about 10-15x/an order of magnitude). The Optane drives are very expensive, but for the moment it doesn't seem there's any non-exotic solution (e.g. battery-backed DRAM) which can compete with them; if you need that level of performance, the Optane drives are your only option. You could go with 'traditional' NAND flash devices instead to save money, but at the cost of performance. You'll need to decide if that's worth it for your application. Serve The Home did some testing in 2017 you may want to take a look at. It's a bit out of date now (missing a few new releases), but still contains pertinent information.

In terms of the storage SSDs, those Samsung PM1643 drives will probably work well for you, and your controller will be capable of utilising the increased 12Gbps the drives provide compared to the SM863A at 6Gbps. I don't know if there are any better options out there though.

Thanks, man, you're the best!
It really helps me a lot!
After all your help and the research I did, I have narrowed it down to two NVMe M.2 drives that fit my budget:
the 905P 380GB, which is faster but not an enterprise part, and the P4801X 200GB, which is slower but enterprise grade and feels safer. What do you think? Is it worth taking the risk with the 905P 380GB for the speed? Of course I intend to purchase two.
My server won't have an L2ARC because I no longer have a PCIe slot available. Will that hurt my performance? Do I need an L2ARC at all in my situation?

Thanks!
 

amp88

Explorer
Joined
May 23, 2019
Messages
56
Thanks, man, you're the best!
It really helps me a lot!
After all your help and the research I did, I have narrowed it down to two NVMe M.2 drives that fit my budget:
the 905P 380GB, which is faster but not an enterprise part, and the P4801X 200GB, which is slower but enterprise grade and feels safer. What do you think? Is it worth taking the risk with the 905P 380GB for the speed? Of course I intend to purchase two.
My server won't have an L2ARC because I no longer have a PCIe slot available. Will that hurt my performance? Do I need an L2ARC at all in my situation?

Thanks!
If I were going to be putting my own money down and I didn't need to use "Enterprise class" components (e.g. as part of a purchase order, Service Level Agreement or other legal necessity), I'd probably go for the 905p M.2 drives. Often "Enterprise class" components do offer something over the regular consumer or "prosumer" hardware, but the 905p has a 5-year warranty and high write endurance (10 DWPD/~7PB over 5 years), so I wouldn't worry about it in this particular case. You may want to read Serve The Home's review of the 380GB 905p. Some selected quotes:

In servers, the Intel Optane 380GB M.2 device is the highest endurance (~7PBW) option currently available in a M.2 22110 drive. It is also the fastest M.2 logging device for sync writes that occur in server applications. Today that wait is over and we have a retail purchased Intel Optane 905P 380GB NVMe SSD. If you run a ZFS storage array and need a ZIL/ SLOG device in an M.2 slot, this is it. If you have small databases in the few hundred MB to few GB range that need solid write performance, this is the drive you want. ... If you have the budget, and your job depends on it, just get the P4800X. If you are on a tight budget, the lower-end Intel Optane drives are great.
(my emphasis)

The last page of the review also has a section titled "Intel Optane 905P 380GB ZFS ZIL/ SLOG Test", which is of particular interest to this discussion.

As to the question of L2ARC, read caching performance for FreeNAS/ZFS is one of those very complex topics, and it can be practically impossible to give concrete answers without testing using very similar hardware and workloads. My (basic) understanding is that adding system RAM can provide more effective read caching performance than L2ARC, but this may not be possible in your case (depending on whether you have any free DIMM slots). If you can't expand your system RAM's capacity, you might want to consider using one of the SAS ports in the system for an L2ARC drive, but whether or not you see a benefit there will depend on your workload, so it's hard to say.
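For what it's worth, if you do end up trying an L2ARC on one of those SAS ports, the mechanics are simple, and unlike a SLOG it is low-risk to experiment with. A rough sketch, assuming a pool named tank and a hypothetical spare SSD at /dev/da9 (both names are placeholders):

```shell
# Attach the SSD as a cache (L2ARC) vdev. Losing an L2ARC device
# is harmless: it only holds copies of data already on the pool.
zpool add tank cache /dev/da9

# If it turns out not to help your workload, remove it again.
zpool remove tank /dev/da9
```

Whether it helps is workload-dependent, as described above; compare your ARC/L2ARC hit rates before and after rather than guessing.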
 

dror

Dabbler
Joined
Feb 18, 2019
Messages
43
If I was going to be putting my own money down and I didn't need to use "Enterprise class" components (e.g. as part of a purchase order, Service Level Agreement or other legal necessity), I'd probably go for the 905p M.2 drives. Often "Enterprise class" components do offer something over the regular consumer or "prosumer" hardware, but the 905p has the 5 year warranty and the write endurance (10 DWPD/~7PB over 5 years), so I wouldn't worry about it in this particular case. You may want to read Serve The Home's review of the 380GB 905p. Some selected quotes:


(my emphasis)

The last page of the review also has a section titled "Intel Optane 905P 380GB ZFS ZIL/ SLOG Test", which is of particular interest to this discussion.

As to the question of L2ARC, read caching performance for FreeNAS/ZFS is one of those very complex topics, and it can be practically impossible to give concrete answers without testing using very similar hardware and workloads. My (basic) understanding is that adding system RAM can provide more effective read caching performance than L2ARC, but this may not be possible in your case (depending on whether you have any free DIMM slots). If you can't expand your system RAM's capacity, you might want to consider using one of the SAS ports in the system for an L2ARC drive, but whether or not you see a benefit there will depend on your workload, so it's hard to say.

Thank you very much for your help!
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
If an SLOG breaks, absolutely nothing bad will happen to your data, as long as the system is still running.
The SLOG is only ever (!) read during crash recovery, e.g. after an unexpected reboot, power loss, ...

HTH,
Patrick
 

dror

Dabbler
Joined
Feb 18, 2019
Messages
43
If an SLOG breaks absolutely nothing bad will happen to your data. As long as the system is still running.
The SLOG is only ever (!) read at crash recovery, e.g. an unexpected reboot, power loss, ...

HTH,
Patrick

But I set sync=always. Doesn't that mean the SLOG is always in action?
If not, then I don't need to buy two and mirror them, right?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
It is always in action, but the data is never read again, unless there is a crash and ZFS tries to commit as many transactions as possible. It's a transaction log.

The transactions are eventually committed to your data disks, but for that they are read not from the SLOG but simply from RAM, as long as things run smoothly.

See the ZFS primer in the FreeNAS documentation for details.
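To see what policy is actually in effect, the sync behaviour is a per-dataset ZFS property you can inspect and change. A minimal sketch, assuming a hypothetical dataset named tank/vms:

```shell
# Show the current sync policy: standard, always, or disabled.
zfs get sync tank/vms

# Force every write to be logged synchronously via the ZIL/SLOG.
zfs set sync=always tank/vms
```

With sync=always the SLOG is written to constantly, but as noted above it is only ever read back after a crash.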

Edit: someone with more experience in hardware might clarify the chances of your system crashing, when the SLOG device fails. I can imagine simple implementations not quite liking PCIe devices to "go bad" whatever that means. I still remember the days when an IDE hard disk failure would freeze a server. We mirrored them, nonetheless, to recover more quickly.
But again, as long as the system is humming along, the SLOG is never read. Seriously ;)

Patrick
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Depends on your HA requirements ... and if the system as a whole will keep running if you figuratively pull one of the devices while powered on.
 