SSD Pool Questions and Possible Bad SSD

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
Background: I am in the process of performing some upgrades and redoing my system. Currently, I have two ESXi servers in a homelab connecting to my TrueNAS system. The systems are there to help me learn and mess around with, as I work in IT. I have also started using them for a few personal items like a Plex media server and a game server. I have added a few SSDs I had lying around to create an SSD pool for running virtual machines, and I will use my main storage pool as plain file-level storage over SMB for my Plex server.

After adding the three SSDs to their own RAIDZ1 pool, I created an iSCSI LUN and attached it to my ESXi servers. I vMotioned a single VM to the pool just for testing purposes and was getting pretty slow performance. I used gstat -dp to monitor individual disk usage and noticed that a single SSD (ada1) was constantly busy while the others were barely doing anything. This was also the case when running an iozone test on the pool. The SSD in question is a Samsung 850 Pro, and I would expect an 850 Pro to perform better than this. vMotion speeds to the SSD pool were about 80 MB/s or less. These are all older drives, so I am wondering if this one is starting to degrade and is on the failure path.
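For reference, this is roughly how I was watching the disks while the vMotion ran (device names are from my system):

Code:
# -p limits output to physical providers (the disks themselves),
# -d adds delete (BIO_DELETE/TRIM) statistics to the display
gstat -dp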

I vMotioned the VM back to my normal spinning-storage iSCSI pool and speeds were great: 400 MB/s read from the SSD pool and 400 MB/s write to the spinning pool. The bottleneck in that instance was the 10Gb network. I then recreated the SSD pool as a mirror of just the other two drives, without ada1. I performed the vMotion again and both drives were busy and fully utilized. This new vMotion ran at about 200-250 MB/s write to the SSD pool.

I am also guessing performance is going to suffer because the 3 SSDs are all different models.
I have:
ada1 - Samsung 850 Pro
ada2 - Crucial MX300
ada3 - Samsung 850 Evo

My goal for the SSD pool is just generally faster, more responsive OSes for the VMs I am running.

Questions:

1. Is there anything else that could be wrong with the 3-drive SSD pool where a single drive is always busy? Or is it a combination of a probably-failing drive and mixing different models?

2. Any recommendations for SSDs to use in a 3-drive RAIDZ1 pool? I would prefer to stick with standard consumer drives, as I don't have a whole lot of money for this upgrade. Or are standard consumer drives a bad idea no matter what? I don't have a massive workload. I have:
  • 2 domain controllers for testing
  • an online intermediate CA
  • a powered-off offline root CA
  • a Veeam backup server that runs off a local SSD on the ESXi host, and a Veeam proxy running on TrueNAS storage (backs up to a local RAID controller on that host, separate from TrueNAS)
  • a pfSense router
  • an Ubuntu reverse proxy
  • an Ubuntu web server
  • an Ubuntu game hosting server for Minecraft, Valheim, and Zomboid (only about 3-5 friends play on them at max, and they are only used about 2-4 times per week)
  • a RedHat email server for testing (fewer than 2 emails a week)
  • a vCenter
  • an Ubuntu media server running the arrs (main media storage will be on spinning storage; I would like to run the OS on the SSD pool)

3. Any special settings or anything I need to take into consideration for an SSD pool vs a traditional HDD pool?
 

Attachments

  • bad ssd.png (164.7 KB)

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,107
The most obvious problem is that iSCSI does better on mirrors than on a raidz1. Can you add a fourth SSD?
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
The most obvious problem is that iSCSI does better on mirrors than on a raidz1. Can you add a fourth SSD?
I would rather not do mirrors for the SSD pool, as I don't want to give up the storage space. I understand mirrors are best for iSCSI, but wouldn't the drives being SSDs help offset some of that with their extra performance? Also, I don't think I can physically add more drives to my current system, as it is out of connectors on the PSU.

Total space for my SSD needs right now is about 1TB max: about 400GB of actual usage, plus room to grow in the future. The 3 drives I have are all 500GB drives.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,107
If you're out of power connectors but NOT of SATA ports, a simple Y-splitter can solve the issue. Four 500 GB drives as a pair of mirrors give the same 1 TB of raw space (ca. 50% usable for block storage) as your 3-wide raidz1 but twice the IOPS.
Throwing better hardware (SSD vs. HDD) at a configuration which is known to be not optimal (raidz for block storage) and still asking for performance is bound to end up as an exercise in frustration.
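From the shell, that layout would look something like this (hypothetical pool and device names; the TrueNAS GUI pool wizard builds the same layout, just on gptid labels):

Code:
# four 500 GB SSDs as two 2-way mirror vdevs; writes stripe across both
zpool create ssdpool mirror ada1 ada2 mirror ada3 ada4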
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
If you're out of power connectors but NOT of SATA ports, a simple Y-splitter can solve the issue. Four 500 GB drives as a pair of mirrors give the same 1 TB of raw space (ca. 50% usable for block storage) as your 3-wide raidz1 but twice the IOPS.
Throwing better hardware (SSD vs. HDD) at a configuration which is known to be not optimal (raidz for block storage) and still asking for performance is bound to end up as an exercise in frustration.
Alright, I will look into running mirrors. Wouldn't the 10Gb link be the bottleneck before I even reach the IOPS limits of the SSDs for the number of VMs I am running? Most are not actually doing anything at the same time. I could probably get away with 4 SSDs if I had to buy 4 more, but I would like to keep the budget for this upgrade under $250.

In regard to the current setup I tested with:

1. With ada1 always being busy, could that be indicative of a bad drive? Or is something else going on? Is there a way to test whether the drive is bad? SMART doesn't report any errors, but I would expect an 850 Pro to perform better than the other drives. (The commands I've been using to check are below.)

2. Any recommendations for SSDs if I should use all the same model? Or just any SSD from a reputable brand?
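For reference on question 1, this is roughly how I've been checking the suspect drive (assuming it shows up as ada1):

Code:
# overall health, attributes, and the drive's error log
smartctl -a /dev/ada1
# kick off a long self-test, then read the results once it completes
smartctl -t long /dev/ada1
smartctl -l selftest /dev/ada1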
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
@HoneyBadger I am currently testing an SSD pool with iSCSI and changed it from sync=standard to sync=always, and it brought the pool to its knees. Like 70 MB/s sequential write. I will probably have to retest when I get the proper drives for the pool.
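For reference, I was flipping the setting from the shell roughly like this (the zvol name is just an example; the dataset options in the GUI do the same thing):

Code:
# force every write to be synchronous on the iSCSI zvol (example name)
zfs set sync=always ssdpool/iscsi-vm
# revert to honoring whatever the initiator requests
zfs set sync=standard ssdpool/iscsi-vm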

Also, any other thoughts on the other questions in this thread?
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,107
Sync writes are a known performance killer, but 70 MB/s is severe. Is that still with a raidz1? (aggravating factor due to write amplification for parity on small blocks)
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
Sync writes are a known performance killer, but 70 MB/s is severe. Is that still with a raidz1? (aggravating factor due to write amplification for parity on small blocks)
No, that was with the SSD mirror only. Any time I try to put that 850 Pro back into the pool as a raidz1, it hurts performance and becomes the bottleneck while the other drives do basically nothing.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
@HoneyBadger I am currently testing an SSD pool with iSCSI and changed it from sync=standard to sync=always, and it brought the pool to its knees. Like 70 MB/s sequential write. I will probably have to retest when I get the proper drives for the pool.

Also, any other thoughts on the other questions in this thread?

70 MB/s on consumer SSDs with sync writes on wouldn't be unexpected. The default iSCSI ZVOL volblocksize is 16K, meaning that even larger "sequential" writes get broken up into pieces of at most 16K, and none of the SSDs listed have power-loss protection for in-progress writes - the data has to land on the NAND before being acknowledged.
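You can confirm what your zvol is actually using with something like this (zvol name is an example):

Code:
# show the block size and sync behaviour of the backing zvol
zfs get volblocksize,sync ssdpool/iscsi-vm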

Are your SSDs attached to the onboard SATA on the S1200BTL? None of them likely support the "deterministic read zero after TRIM" needed to make TRIM work properly behind the LSI HBA, so they could be struggling with their own internal garbage collection routines getting in the way of new writes as well. The 850 PRO might, but it seems to have other issues - try a camcontrol identify daX against it and look for a "deterministic TRIM" or "zero after TRIM" line that should be "true/yes"
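Something like this, assuming the drive shows up as ada1 on your system:

Code:
# dump the ATA identify data and pull out the TRIM-related lines
camcontrol identify ada1 | grep -i -e trim -e deterministic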

2. Any recommendations for SSDs if I should use all the same model? Or just any SSD from a reputable brand?

Users have had a lot of success with the WD Blue/Red line of SATA SSDs, as well as the Samsung 860 series. Refurbished/lifecycled datacenter drives work well too (Intel DC, Toshiba, HGST) but be aware of them potentially having had hard lives before decommissioning.
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
Are your SSDs attached to the onboard SATA on the S1200BTL?
Yes, all SSDs are attached directly to the onboard SATA on the motherboard. None are attached to the HBA.

The 850 PRO might, but it seems to have other issues - try a camcontrol identify daX against it and look for a "deterministic TRIM" or "zero after TRIM" line that should be "true/yes"
The only thing I found for "TRIM" in the feature list of the 850 pro was "Data Set Management (DSM/TRIM) yes". Full feature list attached.

Code:
Feature                      Support  Enabled   Value           Vendor
read ahead                     yes      yes
write cache                    yes      yes
flush cache                    yes      yes
Native Command Queuing (NCQ)   yes              32 tags
NCQ Priority Information       no
NCQ Non-Data Command           no
NCQ Streaming                  no
Receive & Send FPDMA Queued    yes
NCQ Autosense                  no
SMART                          yes      yes
security                       yes      no
power management               yes      yes
microcode download             yes      yes
advanced power management      no       no
automatic acoustic management  no       no
media status notification      no       no
power-up in Standby            no       no
write-read-verify              yes      no      0/0x0
unload                         no       no
general purpose logging        yes      yes
free-fall                      no       no
sense data reporting           no       no
extended power conditions      no       no
device statistics notification no       no
Data Set Management (DSM/TRIM) yes
DSM - max 512byte blocks       yes              8
DSM - deterministic read       no
Trusted Computing              yes
encrypts all user data         yes
Sanitize                       no
Host Protected Area (HPA)      yes      no      1000215216/1000215216
HPA - Security                 yes      no
Accessible Max Address Config  no


Users have had a lot of success with the WD Blue/Red line of SATA SSDs, as well as the Samsung 860 series. Refurbished/lifecycled datacenter drives work well too (Intel DC, Toshiba, HGST) but be aware of them potentially having had hard lives before decommissioning.
I am hesitant about refurbed drives. I was looking around and also saw 1TB Crucial MX500s that look to be similar in price and performance to the WD Blues. That would allow me to do 2 vdevs of 2-way mirrors to gain a little more performance. By chance, have you or anyone else had a bad experience with these?

Is there anything particular to look out for when trying to find SSDs to put in a pool like this?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
Yes, all SSDs are attached directly to the onboard SATA on the motherboard. None are attached to the HBA.
OK. That should make TRIM behind the LSI HBA a non-issue then, but if the drives came straight from a consumer workload and never got a chance to be hit with a proper secure-erase/TRIM (I think pool autotrim is disabled by default in 12.x and later), the internal firmware could still be confused as to whether or not it needs to care about what currently exists on the NAND.
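You can check (and manually run) TRIM from the shell; pool name is an example:

Code:
# is ZFS issuing TRIM automatically as blocks are freed?
zpool get autotrim ssdpool
# a one-shot manual TRIM of the pool's free space
zpool trim ssdpool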

The only thing I found for "TRIM" in the feature list of the 850 pro was "Data Set Management (DSM/TRIM) yes". Full feature list attached.

Code:
Feature                      Support  Enabled   Value           Vendor
Data Set Management (DSM/TRIM) yes
DSM - max 512byte blocks       yes              8
DSM - deterministic read       no

The line you're looking for is "deterministic read" which should be either "yes" or "zero" depending on the drive (I think) in order for it to TRIM behind an LSI HBA.

I am hesitant about refurbed drives. I was looking around and also saw 1TB Crucial MX500s that look to be similar in price and performance to the WD Blues. That would allow me to do 2 vdevs of 2-way mirrors to gain a little more performance. By chance, have you or anyone else had a bad experience with these?

I believe the MX500s have a known bug with SMART reporting that causes them to always report one pending uncorrectable sector, which then resolves shortly afterwards. There's no fix for it, just a parameter to the SMART test that says "ignore that field" - but it ignores the field entirely, so if the drive ever reports a genuine uncorrectable sector (or more than one) you would never know.
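If you went with them anyway, the workaround is a smartd directive along these lines (a sketch only - path and attribute ID are assumptions, check the smartd.conf man page):

Code:
# /usr/local/etc/smartd.conf (FreeBSD path)
# -C 0 stops warnings about Current_Pending_Sector (attribute 197),
# which also hides any genuine pending sectors on this drive
/dev/ada2 -a -C 0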

Is there anything particular to look out for when trying to find SSDs to put in a pool like this?

Generally speaking, you want to look for "sustained write performance," which review sites sometimes test for. Consumer-grade drives using TLC or QLC NAND can often ingest a burst of writes at "full speed" by using some of their NAND in a pseudo-SLC mode, which they later "fold" into TLC or QLC blocks. If you exhaust that write buffer, or if the drive firmware doesn't consider itself to have enough "free time" to do housekeeping with a slow-but-steady stream of writes coming in, you eventually slam painfully into that threshold. As an example of a bad drive, the Samsung QVO drives can handle 500 MB/s until they run out of buffer space, and then get throttled down to as low as 80 MB/s (depending on the drive capacity) - slower sequential writes than a spinning disk.
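A crude way to see that cliff yourself is to stream more data at the pool than the pseudo-SLC cache can absorb and watch the throughput fall off. A sketch, assuming a throwaway test dataset (turn compression off first, or ZFS will collapse the zeroes and never touch the disks):

Code:
zfs set compression=off ssdpool/test
# stream ~100 GB and watch for the drop partway through
dd if=/dev/zero of=/mnt/ssdpool/test/bigfile bs=1M count=100000 status=progress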

Tom's Hardware has a nice graph from their 870 QVO review that demonstrates this - and also shows the WD Blue, WD Red, and a few other SATA SSDs there (notably not suffering the same effects)

[Image: Tom's Hardware 870 QVO sustained write graph]


Ref: https://www.tomshardware.com/reviews/samsung-870-qvo-sata-ssd/2
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
OK. That should make TRIM behind the LSI HBA a non-issue then, but if the drives came straight from a consumer workload and never got a chance to be hit with a proper secure-erase/TRIM (I think pool autotrim is disabled by default in 12.x and later), the internal firmware could still be confused as to whether or not it needs to care about what currently exists on the NAND.
Currently Auto TRIM is off. Should this option be turned on?
[Screenshot: pool options showing the Auto TRIM checkbox]
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
In your case specifically, I wouldn't.

TRIM was disabled by default in TrueNAS 12.0 due to a change in OpenZFS TRIM behaviour where async or "queued" TRIM is now used - and if I recall correctly, there was a rather large amount of back-and-forth between Samsung and the Linux kernel developers as to whose fault it was that certain Samsung models (it appears to be the 860 and beyond?) would falsely represent themselves as having queued TRIM support, when in reality they would TRIM the wrong LBAs.

I would instead connect the drives to a Windows PC, use the Samsung Magician software to do a full TRIM/secure erase, and then rebuild the pool.
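If you'd rather not pull the drives, camcontrol can issue the ATA secure erase from FreeBSD too - roughly like this (destructive, and written from memory, so double-check the device name and the man page first):

Code:
# set a temporary user password, then issue ATA SECURITY ERASE (wipes the drive!)
camcontrol security ada1 -U user -s temppw -e temppw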
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
I also saw something about write amplification. Would this cause an issue with drive lifespan if I put the SSDs in a pool of 2 vdevs with 2-way mirrors each?
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
I also saw something about write amplification. Would this cause an issue with drive lifespan if I put the SSDs in a pool of 2 vdevs with 2-way mirrors each?
I read into this more, and it seems to be more of a firmware issue with the MX500s. The general consensus is that those drives' firmware still isn't fixed and still causes issues in TrueNAS. So while it may be "fixed" in the sense that the false errors are ignored, the buggy firmware still causes the drive to continually write to itself, which can lead to early failures.
 