Fusion pool metadata vs Plex metadata - Optimal SSD usage

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
I recently purchased two SSDs and I am looking for advice on the best way to use them to increase performance on my TrueNAS system.

I am mainly using the system as a Plex Media Server. I've been researching this for several days and I believe I found 4 good uses for the SSDs but I do not know which option would be best.

Current setup:

Boot: 128GB mSATA SSD

Storage:
mirror0
- 6TB SATA
- 4TB SATA
mirror1
- 5TB SATA
- 5TB SATA
mirror2
- 4TB SATA
- 4TB SATA
spare
- 10TB SAS
special
mirror3
- 16GB M.2 SATA
- 16GB M.2 SATA

The 2 new SSDs are 200GB SAS SSDs (MZ6ER200HAGM)

Option #1:
Upgrade my current Fusion Pool
My M.2 SSDs are older and only 16GB each. If I replaced them with the SAS SSDs, I believe I could configure them to store small files as well as the ZFS metadata.

Option #2:
Create a new SSD only pool, move Plex metadata
As I understand it, I have two sets of metadata: ZFS metadata, which is information about the files in the pool, and Plex metadata, which is things like posters, thumbnails, intro sound bites, etc. Moving the Plex metadata to SSDs should make navigating the Plex app faster (more responsive, snappier?)

Option #3:
Create a new SSD only pool, move entire Plex jail
Instead of only moving Plex metadata, perhaps moving the entire Plex jail would be better. This sounds more complicated than option #2, but others on this forum have done it from what I've read.

Option #4:
Add L2ARC
I am maxed out at 32GB of RAM. I don't know how much benefit I would see from adding an L2ARC, but maybe it would help? Or maybe I could combine this with option #1 above: use the new SSDs for the Fusion Pool upgrade and then use the 16GB M.2s for L2ARC.

I don't feel like this is the best option, but you guys are the experts here so I would love to hear your thoughts. If you have any other ideas not covered above, I would love to hear those too.

Thanks!
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,949
When you added the special mirror - did you carefully select the metadata size on the datasets or just leave it at whatever the default was?
Do you know how much metadata is stored on the special vdev?
Can we see an output from zpool iostat -v "PoolName"?
Also what is the Metadata blocksize set at on the various datasets?
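Both of those can be pulled from a shell session if that's easier; a quick sketch, with "PoolName" standing in for whatever your pool is actually called:

Code:
# Per-vdev capacity view, including how much sits on the special vdev
zpool list -v PoolName

# Current small-block cutoff on every dataset in the pool
zfs get -r special_small_blocks PoolName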

Option 1: Maybe
Option 2: Maybe
Option 3: Maybe - where is the media data and where is the plex metadata stored?
Option 4: Probably not. Use of the SSDs as L2ARC will use up some of your ARC.
 

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
Thank you NugentS for responding, and thank you for the questions, they help point me in the right direction.

  • "did you carefully select the metadata size on the datasets"
    • I attempted to do this by following the instructions on this page:
    • but I didn't understand all of it, so the metadata size is probably not optimal.

  • "Do you know how much metadata is stored on the special vdev?"
    • No, but I would love to know how to check this

  • "Can we see an output from zpool iostat"
    • Yes, here it is:

                                                  capacity     operations     bandwidth
pool                                            alloc   free   read  write   read  write
----------------------------------------------  -----  -----  -----  -----  -----  -----
fishtank                                        6.73T  5.06T      2     27   305K   470K
  mirror                                        1.99T  2.54T      0      1  90.7K  81.6K
    gptid/fe5b6ff5-c2d8-11ea-893c-9c5c8ebfa54f      -      -      0      0  45.4K  40.8K
    gptid/ff1d1bed-c2d8-11ea-893c-9c5c8ebfa54f      -      -      0      0  45.3K  40.8K
  mirror                                        2.21T  1.42T      0      1   113K  56.6K
    gptid/fe54b807-c2d8-11ea-893c-9c5c8ebfa54f      -      -      0      0  57.0K  28.3K
    gptid/fe7aaaad-c2d8-11ea-893c-9c5c8ebfa54f      -      -      0      0  56.1K  28.3K
  mirror                                        2.53T  1.09T      0      1   101K  47.5K
    gptid/ca2f7dfb-fe24-11eb-b0cf-989096a2dc39      -      -      0      0  45.1K  23.7K
    gptid/3b9a0c98-5fdb-11eb-a20f-9c5c8ebfa54f      -      -      0      0  55.6K  23.7K
special                                             -      -      -      -      -      -
  mirror                                        2.12G  12.4G      0     23    675   285K
    gptid/6766fdb6-600f-11eb-a20f-9c5c8ebfa54f      -      -      0     11    348   142K
    gptid/678e5e86-600f-11eb-a20f-9c5c8ebfa54f      -      -      0     11    327   142K
----------------------------------------------  -----  -----  -----  -----  -----  -----


I am not familiar with the iostat command, so after I post this reply I will go read up on it.

  • "What is the Metadata blocksize set at on the various datasets?"
    • I do not know, how do I check this?
 

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
Here is a screenshot of the table that should be easier to read:

[screenshot of the zpool iostat table]
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,949
Your special vdev has about 2.1G allocated, with 12.4G free.

For each dataset, look at its options. At the bottom there is a Metadata (Special) Small Block Size setting which you can change. Be aware that changes are not retrospective, i.e. they only affect new data written to the dataset. When I put my special vdev in I had to "churn" all the data in the pool to get the special vdev populated. If your special vdev runs out of space (block size set too large) then the extra data just gets stored on the main HDDs (i.e. no gain). If the block size is set too small then hardly anything beyond metadata lands on the special vdev.
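The same setting can also be checked and changed from the shell; a minimal sketch, assuming a dataset called fishtank/media (substitute your own dataset names):

Code:
# Show the current cutoff (0 means only metadata goes to the special vdev)
zfs get special_small_blocks fishtank/media

# From now on, send blocks of 64K and smaller to the special vdev
zfs set special_small_blocks=64K fishtank/media

As above, this only applies to newly written data, so existing files still have to be rewritten ("churned") before they move.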

Personally I don't like hot spares (your 10TB drive) as they spin and generate wear and tear for no purpose (until actually used). YMMV of course. I would be using the drive, with another 10TB, as another pool and possibly snapshot all the data on a regular basis from Pool A to Pool B as a backup (although some would say that's not a real backup as it's in the same machine).
 

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
Personally I don't like hot spares (your 10TB drive) as they spin and generate wear and tear for no purpose (until actually used). YMMV of course. I would be using the drive, with another 10TB, as another pool and possibly snapshot all the data on a regular basis from Pool A to Pool B as a backup (although some would say that's not a real backup as it's in the same machine).

I like your suggestion of making a snapshot on the 10TB drive.

When I started out using FreeNAS, the thing that attracted me to it the most was that I could create VDEVs using drives of any size. At the time I was kinda poor, so all my drives were old, used, and/or low quality. I typically use drives until they die, which is why I have always used a hot spare. On several occasions a drive has died, the spare jumped in automatically, and I've never lost my data.

But now that I'm starting to buy higher quality SAS drives, I may want to rethink my approach...
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
The single-drive snapshot would make most sense if the drive were in a removable enclosure so it could be kept as an offline (and possibly off-site) backup.
If you expect the drives to fail one after the other, having a hot spare does make sense.

Some observations from an onlooker who doesn't use Plex:
#1 would take a full rewrite of the pool to populate the upgraded vdev.
L2ARC is useful for repeatedly serving the same files. This probably does not occur for the large media files, and the small metadata is already served from fast SSD storage, so no benefit is to be expected from #4. L2ARC consumes ARC (RAM) space, so 32 GB L2ARC for 32 GB RAM is possibly too small to be useful, while 400 GB would be too large (the usual rule of thumb is that L2ARC should not exceed 5*RAM).

For media storage, raidz2 would be more efficient than mirrors. With your current six drives, you could have a 6-wide raidz2 with 4*4=16 TB of usable space, compared with 4+5+4=13 TB as mirrors, and then increase size by replacing drives with larger ones instead of adding extra vdevs. A hot spare would not be needed if you could replace a failing drive fast enough.
The 10 TB could serve as backup while recreating the pool, though data would be at risk while on non-redundant storage. With a second 10 TB, I would certainly take option #3 and then recreate the media pool as raidz2 with a 10 TB mirror as temporary storage; then either set up a second NAS for backup or replace the older 4 TB drives with the 10 TB.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,949
wot @Etorix says. You don't need massive IOPS with Plex - just capacity
 

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
Thanks for the suggestions, I think I have an idea what to do now:

  • Add both 200GB SSDs to the special VDEV as a 4 way mirror
  • Remove the two 16GB SSDs from the special VDEV (essentially swapping out the smaller, older SSDs for the nicer, newer, larger ones)
  • Change the metadata/special VDEV settings so that more files are stored on the SSDs
I tried to do a "REPLACE" on one of the 16GB SSDs and it did not go well. I OFFLINED the drive, but that didn't help. Then I detached the 16GB SSD, which I then learned I should not have done. Luckily, I was able to extend the special VDEV back onto the 16GB drive I detached, and it resilvered fine.

So I'm back where I started, with two 16GB SSDs as the special VDEV. I tried to extend onto the 200GB SSD, but I got this error message:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 367, in run
    await self.future
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 398, in __run_body
    rv = await self.middleware._call_worker(self.method_name, *self.args, job={'id': self.id})
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1219, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1146, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1120, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
middlewared.service_exception.CallError: [EZFS_BADTARGET] can only attach to mirrors and top-level disks
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,949
It's just a mirrored vdev - you ought to be able to replace a disk

  • Add both 200GB SSDs to the special VDEV as a 4 way mirror - err, I don't think so. You want to replace a smaller disk with a larger one, as if the small disk had gone bad - more like your next point
  • Remove the two 16GB SSDs from the special VDEV (essentially swapping out the smaller, older SSDs for the nicer, newer, larger ones). I agree - this ought to work
  • Change the metadata/special VDEV settings so that more files are stored on the SSDs - yup
Not that I have ever had to do this yet with a special. I did swap a disk out and it was a fairly simple process
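For reference, the command-line version of that swap is one zpool replace per disk (with a resilver in between). A rough sketch only: the gptid comes from your own zpool status output, and the new device name (da8 here) is a placeholder for whichever 200GB SSD you're swapping in:

Code:
# Swap one 16GB member of the special mirror for a 200GB SAS SSD,
# then let the resilver finish before touching the second member
zpool replace fishtank gptid/6766fdb6-600f-11eb-a20f-9c5c8ebfa54f da8
zpool status fishtank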

 
Last edited:

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
I used the GUI and went to Storage/Pools/Pool Status. I selected one of the 'special' 16GB SSDs and set it to OFFLINE. I then tried to REPLACE the drive and I got this error:

Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 277, in replace
    target.replace(newvdev)
  File "libzfs.pyx", line 391, in libzfs.ZFS.__exit__
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 277, in replace
    target.replace(newvdev)
  File "libzfs.pyx", line 2060, in libzfs.ZFSVdev.replace
libzfs.ZFSException: already in replacing/spare config; wait for completion or use 'zpool detach'
 

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
I was able to replace the disk using the command line and following the instructions here:


The REPLACE command only worked when I inserted "-o ashift=9".
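In case it helps anyone later, the command ended up looking roughly like this (a sketch with placeholder device names, not a copy-paste of exactly what I ran):

Code:
# The existing special mirror was created with 512B sectors (ashift=9),
# so the replacement had to be forced to match or zpool refused the 8K-native SSD
zpool replace -o ashift=9 fishtank gptid/<old-16GB-ssd-gptid> da8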
 

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
After replacing both 16GB SSDs, I think the special VDEV autogrew to 186GB:

[screenshot of the pool status showing the special vdev at 186G]


The resilvering process was so much faster on these newer SAS SSDs; I'm really excited to start using them.

Next I need to read up on how to store more small files on them.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,949
You need to "churn" the dataset.
For instance - I have a dataset called Archive. I use the code below to "churn" it into a new dataset called Archive_New and then rename it back again.

I am assuming spare capacity on the pool.

Code:
zfs snap BigPool/SMB/Archive@migrate
zfs send -R BigPool/SMB/Archive@migrate | zfs recv -F BigPool/SMB/Archive_New
zfs snap BigPool/SMB/Archive@migrate2
zfs send -i @migrate BigPool/SMB/Archive@migrate2 | zfs recv -F BigPool/SMB/Archive_New
zfs destroy -rf BigPool/SMB/Archive
zfs rename -f BigPool/SMB/Archive_New BigPool/SMB/Archive


This does:
  1. A snapshot of the dataset
  2. Copies the snapshot to a new dataset called Archive_New. This rewrites all the data to the pool and deals with the metadata appropriately
  3. Makes a second snapshot
  4. Sends the second snapshot (incrementally) to Archive_New
  5. Destroys Archive
  6. Renames Archive_New to Archive
 
Last edited:

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
Thank you for the churning instructions; the last time I did this I just used Midnight Commander.

Do the files need to be moved out of the dataset, or could you just make a second copy (in a new folder in the original dataset) and then delete the original files?

Also, before I churn, I think I need to do some resizing, but I'm not exactly sure on the procedure.


But before I get into that, when I do a ZPOOL STATUS, I get a message I've never seen before:


special
  mirror-3               ONLINE       0     0     0
    gptid/20c4aaab...    ONLINE       0     0     0  block size: 512B configured, 8192B native
    gptid/070b08c8...    ONLINE       0     0     0  block size: 512B configured, 8192B native


Should I change the SSD block size to the native 8192B?

Second, I think I need to set a different size so that more small files are moved to the special vdev.

A month ago I ran across a tutorial on how to set this, but I forgot to bookmark the page. I think the first step of that process was to analyze the pool, and I think this is the command (please correct me if I'm wrong):

[screenshot of the command]
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
The REPLACE command only worked when I inserted "-o ashift=9"
This means your old SSDs used 512-byte sectors, which is not optimal for the new SSDs. Ashift values of 12 or 13 are typical and/or recommended for modern SSDs; yours want ashift=13 (8192-byte blocks).

Since you only use mirrors for now, you should be able to remove the special vdev (copying everything back to the "data" vdevs) and then recreate it with the right ashift.
It is also generally better to use the GUI, as this guarantees that the middleware correctly registers all changes to the pool.
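If you want to double-check what ashift each vdev actually got, one way is to dump the cached pool configuration; a sketch for TrueNAS CORE, assuming the cache file is in its usual location:

Code:
# Print the pool configuration and pull out the ashift recorded for each vdev
zdb -U /data/zfs/zpool.cache -C fishtank | grep ashift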
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,949
You linked to the page I used in the 3rd post of this thread.

Oh and you can potentially remove a special vdev if the pool is mirrored (YMMV)
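If you try it from the shell, it's a single command on an all-mirror pool; a sketch, using the vdev name exactly as it appears in zpool status:

Code:
# Evacuate the special vdev back onto the HDD mirrors and detach it
zpool remove fishtank mirror-3

# Watch the evacuation/removal progress
zpool status fishtank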
 

gobygoby

Dabbler
Joined
Aug 5, 2014
Messages
45
I tried to remove the special VDEV and got this error:

invalid config; all top-level vdevs must have the same sector size and not be raidz.

I'm not using raidz so it is probably a sector size issue.

Ok then, so I guess it's time to rebuild this pool from scratch. I already created a second pool and made a copy of all my files so I shouldn't lose any data.

But before I get started I have a couple questions:

  • How do I format the new SSDs to use a larger sector size?
  • Should I set the SSD sector size to 4K so that all the disks are the same, or 8K because that is the native size?
  • Because my files are almost all media files, should I set the dataset Record Size to 1MiB?
  • What should I set the Metadata (Special) Small Block Size to?

Thank you all for your help so far, I've learned a lot and I really appreciate your time and assistance.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,949
"How do I format the new SSDs to use a larger bit sector size?"
"Should I set the SSD sector size to 4K so that all the disks are the same, or 8K because that is the native size?"
Not sure, I think you can manually set the ashift when creating the vdev. This is not something I can answer with any authority
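That said, if you do it from the command line, the ashift can be forced when the vdev is added; a sketch only (device names are placeholders, and the GUI route @Etorix suggested is generally the safer one on TrueNAS):

Code:
# Re-add the two 200GB SAS SSDs as a special mirror with 8K sectors (ashift=13)
zpool add -o ashift=13 fishtank special mirror da8 da9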

"Because my files are almost all media files, should I set the dataset Record Size to 1MiB"
Probably - you might try different options. I think I used 512K for my movies. Also, each dataset can be set differently

"What should I set the Metadata (Special) Small Block Size to?"
Read the link you posted and work out how much in the way of small files will easily fit in say 50% of the special vdev. Then set the metadata small block size on a dataset and churn. Rinse and repeat till you are happy. This is not a question that we can answer as it will differ depending on your files and setup
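To put a number on "how much in the way of small files", one rough way is to total everything under a candidate cutoff straight from the shell; a sketch, assuming your media sits under /mnt/fishtank/media (adjust the path and try different cutoffs):

Code:
# Rough count and total size of files smaller than a candidate 64K cutoff
find /mnt/fishtank/media -type f -size -64k -exec stat -f %z {} + | \
  awk '{ n++; b += $1 } END { printf "%d files, %.1f GiB\n", n, b/1024/1024/1024 }'

Run it per dataset and compare the total against roughly half of the ~186G special mirror before settling on a Metadata (Special) Small Block Size.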
 