Deduplication special devices or not?

ThisTruenasUser

Dabbler
Joined
Apr 19, 2023
Messages
44
Hi all.
So I have TrueNAS SCALE virtualised in Proxmox.
It has 5 x 4TB drives in RAIDZ1 with 2 x 16GB Intel Optane drives as mirrored dedup devices.

There are 2 x sparse iSCSI drives, each 12TB in size with a 1M block size. They are for storing many games and are formatted with a 1MB cluster size in Windows.

It is working well and as expected. DEDUP value is 2.03.


With this command in the shell:
admin@truenas[~]$ sudo zpool status -Dv

The relevant output: dedup: DDT entries 6761148, size 950B on disk, 306B in core

That works out to about 6GB used on the mirrored dedup drives, plus about 2GB of memory.
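
For anyone wanting to reproduce those two figures, here is a minimal sketch of the arithmetic: it multiplies the entry count by the per-entry sizes from the summary line. The function name and the regular expression are my own illustration, and it assumes the summary line has exactly the format shown above with sizes printed in plain bytes; other OpenZFS versions may print it differently.

```python
import re

# Summary line as printed by `zpool status -D` in the post above.
line = "dedup: DDT entries 6761148, size 950B on disk, 306B in core"

def ddt_footprint(summary: str) -> dict:
    """Estimate the DDT footprint in bytes (hypothetical helper, not a ZFS tool)."""
    m = re.search(r"DDT entries (\d+), size (\d+)B on disk, (\d+)B in core", summary)
    if m is None:
        raise ValueError("not a DDT summary line in the assumed format")
    entries, on_disk, in_core = (int(g) for g in m.groups())
    return {
        "entries": entries,
        "on_disk_bytes": entries * on_disk,   # space used on the dedup vdev
        "in_core_bytes": entries * in_core,   # memory used when the table is cached
    }

fp = ddt_footprint(line)
print(f"on disk: {fp['on_disk_bytes'] / 1e9:.2f} GB")
print(f"in core: {fp['in_core_bytes'] / 1e9:.2f} GB")
```

This is only entries times bytes per entry, so treat it as a rough estimate rather than an exact accounting of what the pool uses.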

It is not exactly going to blow up the machine.

After a year of many updates to those iSCSI 'drives', the dedup usage seems constant. The usage remains steady as I have about 6.5TB of games.

I have been through this already.


So it seems for dedup to play nice, 1 megabyte record and block sizes are the way to go. That is precisely what I am doing.

So, the new setup:
Buy 1 more 4TB or maybe 8TB drive.
Set up RAIDZ2 - 16TB logical storage.
This is for more redundancy.
Create those iSCSI drives - maybe 32TB sparse drives - with 1MB block sizes and dedup on.
Create my CIFS & NFS shares.
Connect to the share devices and copy over from the BACKUPS I WILL DEFINITELY MAKE.
Those Optane drives will be used as mirrored boot drives for Proxmox. Also for more redundancy.
Replace the TrueNAS pool drives when needed, probably with 8TB ones.
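
As a rough check on the 16TB figure, here is a small sketch of the usual RAIDZ capacity estimate: the number of drives minus the parity level, times the smallest drive. The helper name is hypothetical, and it ignores padding, metadata and TB/TiB differences, so real usable space will be somewhat lower.

```python
def raidz_usable_tb(drive_sizes_tb, parity):
    """Rough usable capacity of one RAIDZ vdev, ignoring padding and metadata overhead."""
    if len(drive_sizes_tb) <= parity:
        raise ValueError("need more drives than the parity level")
    # Every member contributes only as much as the smallest drive in the vdev.
    return (len(drive_sizes_tb) - parity) * min(drive_sizes_tb)

current = raidz_usable_tb([4] * 5, parity=1)       # existing 5-wide RAIDZ1
planned = raidz_usable_tb([4] * 6, parity=2)       # planned 6-wide RAIDZ2
mixed = raidz_usable_tb([4] * 5 + [8], parity=2)   # same, but the new drive is 8TB
print(current, planned, mixed)
```

Under these assumptions the extra drive buys redundancy rather than space, and an 8TB sixth drive only adds capacity once the remaining 4TB drives have been replaced as well.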

So deduplication needs to use the memory instead, if no special dedup devices are used?


As I understand it, without special dedup devices, the 6GB of storage on those Optane drives will need an extra 6GB of memory?

Presently when logging in, the system has 16GB: services 3.3GB, ZFS cache 7.8GB, 3.5GB free.
Apparently it is not struggling.
So allocate another 8GB of RAM?
The host machine has 64GB, so not a big issue.
I could allocate more RAM if needed, but doubt it will be necessary for a while.
I could log in & monitor how the memory allocation is going from time to time.

Oh and for those dedup 'haters' suggesting not to use it - please provide actual tested evidence why my setup cannot work!
I am not a noob to TrueNAS & dedup. I also do not claim to be an expert.

Useful and informed information is appreciated.

Thanks
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
It's not a matter of being haters. You can find plenty of threads explaining why you shouldn't use dedup, as well as why you shouldn't use RAIDZ for iSCSI... one of them is your own thread.

Another example I strongly recommend reading given its prime educational content.

Just don't use deduplication: you are basically not benefitting at all from it and are instead creating another potential point of failure in the system using dedup VDEVs.
That being said, if you really want to use it at least wait until fast dedup comes out.

Also, please use [CODE][/CODE] tags when you paste terminal output.
 

ThisTruenasUser

[QUOTE="Davvo"]
It's not a matter of being haters. You can find plenty of threads explaining why you shouldn't use dedup, as well as why you shouldn't use RAIDZ for iSCSI... one of them is your own thread.

Another example I strongly recommend reading given its prime educational content.

Just don't use deduplication: you are basically not benefitting at all from it and are instead creating another potential point of failure in the system using dedup VDEVs.
That being said, if you really want to use it at least wait until fast dedup comes out.

Also, please use [CODE][/CODE] tags when you paste terminal output.
[/QUOTE]
It is all coming back.
I recall some posts on this forum about a year ago.
A multitude of negativity about 'not using deduplication'

Oh - and yet I managed to get it working - and working well.

I am using RAIDZ for iSCSI - it is working well. I will continue to use it.
I expect the recommendation is to waste a lot of cash on unnecessary storage. Not happening.

Now I am using those VDEVs and they are still great. I want to use them for something else.
Those 16GB Optane drives are very cheap and very resilient.
If one fails, it can be replaced for very few $$.

Your comment: 'Just don't use deduplication: you are basically not benefitting at all from it and are instead creating another potential point of failure in the system using dedup VDEVs' is curious.


I use much, much less disk space with deduplication - so the cost of drives is lower.
Combined with this in a Proxmox container: https://lancache.net/, it helps with my game updates.

It saves me a lot of cash ongoing as I just need slow basic internet service.


The post about file deletions is interesting. It is not really an issue for me, it seems.
I use a 1/2 TB NVMe drive and this: https://www.romexsoftware.com/en-us/primo-cache/

I expect with deferred writes, I do not even notice.
Game updates can sometimes be slower than expected. I expect that is due to update clients like Steam, Epic Games and such.
There are some complaints about those in various forums.

Your reply mostly makes no sense to me.



I thank you for letting me know about the fast dedup. I am keen to learn more about that.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
[QUOTE="ThisTruenasUser"]
Oh - and yet I managed to get it working - and working well.
[/QUOTE]
This is probably because your setup is actually not that demanding. iSCSI is a resource hog, which works best on mirrors with small blocks. Dedup is a memory hog. But you're using it to store static game images (no rewrites, no fragmentation on your raidz), and I suppose you're the sole client…
So it works acceptably for you. That still does not make it a setup which we would recommend to others—especially for a meagre dedup factor of 2.

To do without a dedup vdev in your new setup you can either allocate more memory so the DDT can reside in RAM or use a persistent L2ARC for metadata (this can be a single drive). If the server is always on, using RAM is the easiest solution.
 

Davvo

[QUOTE="ThisTruenasUser"]
I use much much less disk space with deduplication - so the cost of drives.

It saves me a lot of cash ongoing as I just need slow basic internet service.
[/QUOTE]
Not having the full DDT output of the zpool status -D command gives us little information about your DDT besides the fact that it looks small.

If you go RAIDZ2 and DDT VDEVs you should add another Optane to turn the two-way mirror into a three-way mirror, in order to match the pool's parity.
If you want to use the optanes for a reliable boot pool instead, read this resource.

[QUOTE="ThisTruenasUser"]
It is all coming back.
I recall some posts on this forum about a year ago.
A multitude of negativity about 'not using deduplication'

Oh - and yet I managed to get it working - and working well.

I am using RAIDZ for iSCSI - it is working well. I will continue to use it.
I expect the recommendation is to waste a lot of cash on unnecessary storage. Not happening.
[/QUOTE]
Your system your choice, just beware of that line of thought: quite a few users got burned believing their system was working fine... until it didn't. Not saying this is necessarily your case.
 

ThisTruenasUser

I am keeping the setup for now & waiting to see how the fast dedup plays out.

I have another old machine, which has TrueNAS SCALE as the OS. It is used to store backups - or at least one of them.
It uses dedup, but no Optane as dedup devices. 1MB record sizes of course.
I recall that when I sync a backup, it can be slow, with varying speeds.

So I went nuts and ordered a few more 16GB Optane drives. I will destroy that 'backup' pool and recreate it using 2 in a mirrored vdev.
They are cheap and useful.


This is the info for the main machine (not the backup one)
Code:
admin@truenas[~]$ sudo zpool status   -D
[sudo] password for admin:
  pool: Main-Pool
 state: ONLINE
  scan: scrub repaired 0B in 07:45:46 with 0 errors on Sun Mar 31 07:45:48 2024
config:

    NAME                                      STATE     READ WRITE CKSUM
    Main-Pool                                 ONLINE       0     0     0
      raidz1-0                                ONLINE       0     0     0
        439d52b5-7141-4607-ba41-e7b814039b4e  ONLINE       0     0     0
        94c2caff-ed31-40b3-bd34-0a0ecee46b62  ONLINE       0     0     0
        897a35d4-f6f9-4370-bd8a-5df84b8670d1  ONLINE       0     0     0
        ce546fb1-c7c3-4ff1-b6da-0de13e53e74b  ONLINE       0     0     0
        f28a79fd-9046-4bad-b4ee-8b506bfffd9b  ONLINE       0     0     0
    dedup   
      mirror-1                                ONLINE       0     0     0
        0f12fd17-752e-494c-a976-e8fb20c207e7  ONLINE       0     0     0
        4389d266-b292-47f9-9d08-1d38b54ba224  ONLINE       0     0     0

errors: No known data errors

 dedup: DDT entries 6759219, size 950B on disk, 307B in core

bucket              allocated                       referenced         
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     432K    432G    333G    333G     432K    432G    333G    333G
     2    5.80M   5.80T   4.60T   4.60T    11.7M   11.7T   9.31T   9.31T
     4     209K    209G    173G    173G     853K    853G    701G    700G
     8    14.6K   14.6G   3.59G   3.63G     145K    145G   34.2G   34.5G
    16    2.04K   2.04G    882M    886M    41.6K   41.6G   17.8G   17.8G
    32      528    528M    238M    239M    20.8K   20.8G   8.99G   9.02G
    64       69     69M   15.0M   15.2M    5.66K   5.66G   1.15G   1.17G
   128       51     51M   6.18M   6.37M    7.72K   7.72G   1.07G   1.10G
   256       13     13M   3.72M   3.76M    4.58K   4.58G   1.24G   1.25G
   512        4      4M     32K   51.1K    3.03K   3.03G   24.3M   38.8M
    1K        1      1M      8K   12.8K    1.04K   1.04G   8.34M   13.3M
 Total    6.45M   6.45T   5.10T   5.10T    13.2M   13.2T   10.4T   10.4T


  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
    The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:08 with 0 errors on Tue Apr  2 03:45:09 2024
config:

    NAME        STATE     READ WRITE CKSUM
    boot-pool   ONLINE       0     0     0
      sdb3      ONLINE       0     0     0

errors: No known data errors

 dedup: no DDT entries
 

Davvo

PLEASE use [CODE][/CODE] tags!!

If my poor eyes do not deceive me, your DDT is saving you around 5.3TB of space: of this, around 4.71TB is data duplicated only once (meaning there are only two copies of it). I would say it's a not insignificant amount given your pool's estimated total size.

Run your numbers but I would say that given you already have the drives it costs you less to continue using a DDT VDEV. At least until fast dedup lands.
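
For those who want to run the numbers themselves, here is a minimal sketch of how the savings can be read off the histogram above: subtract the allocated DSIZE from the referenced DSIZE, once for the Total row and once for the refcnt 2 row. The variable names are mine, and the inputs are the rounded TB values as printed, so the results are approximate.

```python
# Referenced vs allocated DSIZE (in TB), copied from the DDT histogram above.
total_referenced_tb = 10.4
total_allocated_tb = 5.10
refcnt2_referenced_tb = 9.31
refcnt2_allocated_tb = 4.60

# Space saved is what the data would occupy without dedup minus what is actually stored.
total_saved = total_referenced_tb - total_allocated_tb
refcnt2_saved = refcnt2_referenced_tb - refcnt2_allocated_tb

print(f"total saved:    {total_saved:.2f} TB")
print(f"refcnt 2 saved: {refcnt2_saved:.2f} TB")
print(f"dedup ratio:    {total_referenced_tb / total_allocated_tb:.2f}")
```

The ratio computed from these rounded values lands close to the reported dedup value of about 2, which is a reasonable sanity check on the table.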
 

ThisTruenasUser

[QUOTE="Etorix"]
This is probably because your setup is actually not that demanding. iSCSI is a resource hog, which works best on mirrors with small blocks. Dedup is a memory hog. But you're using it to store static game images (no rewrites, no fragmentation on your raidz), and I suppose you're the sole client…
So it works acceptably for you. That still does not make it a setup which we would recommend to others, especially for a meagre dedup factor of 2.

To do without a dedup vdev in your new setup you can either allocate more memory so the DDT can reside in RAM or use a persistent L2ARC for metadata (this can be a single drive). If the server is always on, using RAM is the easiest solution.
[/QUOTE]

Curious.
Firstly - this is for games. Stored on the NAS, and backed up to another machine.

As for the claim that it is a memory hog, I suggest you look at the initial post. It apparently uses 2GB. That is the main benefit of the 1MB block/record sizes.

As for using iSCSI, do you suggest I use Windows shares instead to store games? I sincerely hope not.
Perhaps buy 2 or 3 8TB NVMe drives to store games instead? If you wish to pay for them for me, then I would be grateful.
Using a NAS and buying 2 x 16GB Intel Optane was maybe a cheaper option. I just ordered a few more. They are about $5 US each now, I think.

Good for vdev devices, or boot drives in a home server.
When fast dedup is available and tested, they may not be needed as dedup vdevs.

The caching solution on the gaming machine is this: https://www.romexsoftware.com/en-us/primo-cache/ - a whopping $30 for the licence.
It caches block devices. Typically an SSD caching a hard drive or a slowish block device.
Combined with this: lancache.net & a low spec Windows VM (connected to one of the iSCSI drives) that updates games, all is fast.


The meagre dedup factor of 2 is slightly better than I expected. So about 2 x 6TB of stored games data, I think compressed to 4.5TB in total.
When time allows, I will possibly set up Linux gaming. NFS may be the best option, but still with 1MB block size & dedup turned on.

So I have all my games downloaded, updated very fast and ready to play when I wish. Also backed up regularly to another machine.

Still there is ongoing negativity about why I can't do this.

I am thinking of adding an L2ARC also. Again with the 1MB block sizes, I doubt it will hog memory either.
 