Homelab build: Proxmox/K8s, NFS storage, write-heavy

z-lf

Cadet
Joined
Jan 20, 2023
Messages
6
Hi lovely people,

I am moving my homelab storage to TrueNAS CORE. Not SCALE, because I want to force myself not to use Docker directly on TrueNAS.
The setup is as follows:
  1. 3-node Proxmox cluster.
  2. NFS as the storage backend for VMs and backups.
  3. HA is enabled.
  4. TrueNAS is on an 8-bay machine.
  5. Everything is on a 1GbE network, but TrueNAS has LACP aggregation, so 2x1GbE.
  6. All my services in k8s will use an NFS storage class (see the sketch after this list).
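For reference, one common way to wire that up (not necessarily what I'll end up with) is the kubernetes-sigs nfs-subdir-external-provisioner Helm chart; the server IP and export path below are placeholders, not my actual values:
[CODE]
# Sketch only -- 192.168.1.50 and /mnt/tank/k8s are placeholder values.
helm repo add nfs-subdir-external-provisioner \
    https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-provisioner \
    nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
    --set nfs.server=192.168.1.50 \
    --set nfs.path=/mnt/tank/k8s \
    --set storageClass.defaultClass=true
[/CODE]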
I am into monitoring a lot (learning for work), so many of the services will be things like InfluxDB, Prometheus, Graylog, Security Onion, maybe Splunk or ELK, etc.
None of the data will be critical. It would suck to redo everything, but that's not really a top priority (especially when considering cost).

So here is what I was thinking:
  1. 1 vdev with 3x4TB and a hot spare. This would hold VMs, backups and long-term aggregated data from the monitoring.
  2. 1 vdev of 4x256GB (Samsung EVO 870). This would be for data ingest, with a retention policy adequate to the space available.
I already have the 4x4TB HDDs, but the rest is completely open. A rough sketch of what I mean is below.
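Roughly this (pool names, device names and the raidz1 choice for the 3-disk vdev are placeholders/assumptions on my part, nothing is final):
[CODE]
# Sketch only -- names and layout are placeholders, not decided.
# HDD pool: 3x4TB (raidz1 assumed) plus a hot spare, for VMs/backups/aggregated data.
zpool create tank raidz1 da0 da1 da2 spare da3
# SSD pool: 4x 256GB EVO 870 for ingest (vdev layout still open; shown striped here).
zpool create ingest da4 da5 da6 da7
[/CODE]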

what do you guys think?

thanks a lot.
(and if you are in berlin, beer’s on me :)
 

z-lf

Cadet
Joined
Jan 20, 2023
Messages
6
Oh, and there is an NVMe drive for the OS, so that's not part of the 8 bays.
 

z-lf

Cadet
Joined
Jan 20, 2023
Messages
6
I also forgot the following:
  1. 32GB of RAM.
  2. The Ethernet links will be upgraded to 2x2.5GbE.
  3. There are between 15 and 20 devices that will send metrics/logs (Home Assistant, router, switches, computers, laptops, another NAS).
  4. And around 50 services, but I can tweak the retention/scrape timings to fit the capacity (rough example below).
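Something like this is what I mean by tweaking the timings (values are placeholders, nothing is decided):
[CODE]
# Sketch only -- retention values are placeholders.
# Prometheus: cap how long/how much TSDB data is kept locally.
prometheus --storage.tsdb.retention.time=15d --storage.tsdb.retention.size=50GB
# InfluxDB 1.x: size the retention policy to the ingest pool ("telegraf" db is an example).
influx -execute "CREATE RETENTION POLICY one_month ON telegraf DURATION 30d REPLICATION 1 DEFAULT"
[/CODE]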
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
1. You should be using mirrors for VM storage, preferably SSDs. Mirrors = IOPS.
2. NFS writes are sync writes, which means your performance will be poor. Either set sync=disabled (at a data-loss risk) or get a proper SLOG (quick example below).
3. Be prepared to want more memory.
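For reference, sync behaviour is a per-dataset ZFS property; "tank/vms" below is just a placeholder name:
[CODE]
# Placeholder dataset name -- adjust to your pool layout.
zfs set sync=disabled tank/vms    # fast, but you can lose the last few seconds of writes on power loss
zfs set sync=standard tank/vms    # default; with a proper SLOG this stays both fast and safe
zfs get sync tank/vms             # check the current setting
[/CODE]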
 

z-lf

Cadet
Joined
Jan 20, 2023
Messages
6
@NugentS thanks a lot for the feedback.
Would this setup make more sense?
  1. 4 SSDs in RAID10 (striped mirrors) with the NVMe as SLOG.
  2. A USB-attached SSD for boot.
  3. 4 HDDs in raidz1 + a hot spare.
The RAID10 pool would be for VMs and ingest, the raidz1 pool for backup and archive; roughly the sketch below.
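Something like this (device and pool names are placeholders):
[CODE]
# Sketch only -- device/pool names are placeholders.
# "RAID10" in ZFS terms = striped mirror vdevs, with the NVMe as SLOG.
zpool create fast mirror ada0 ada1 mirror ada2 ada3 log nvd0
# HDD pool for backup/archive (spare handling still open).
zpool create slow raidz1 da0 da1 da2 da3
[/CODE]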

What do you think?
My board is limited to 32GB of RAM; I have maxed it out already. (TerraMaster U8-423)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
You want to support TWO pools, one doing VM block storage, on 32GB? Yeesh.

 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Yeah - he is pushing it a bit

Which NVMe? A SLOG has specific requirements that an EVO 870 (if that's what you are thinking of) will not meet. Think Optane 900p or better (mirrored if this is a commercial device, single if it's just for play).

4 SSDs in mirrors, striped across two vdevs, will be a lot better for block storage than Z1. You will really want a proper SLOG, or run sync=disabled, which is not data-safe.

32GB is likely to cause somewhat non-optimal performance - although the good news is that you are running on 1Gb(ish) networking rather than 10 or 40. Reliability of the 2.5GbE gear - well, you will find that out for yourself (i.e. YMMV).

I personally would prefer 64GB+ - but if 32 is your limit then so be it.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
32GB is likely to cause somewhat non-optimal performance

I don't even know how you can trivialize it like that. With two pools, there will be contention for ARC, and ARC is incredibly important for block storage. You could think of it as "only 16GB of ARC for the block storage" and probably not be that far off base.
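If you want to see what you are actually working with, ARC usage is visible from the CLI on CORE (a quick sketch, assuming a stock FreeBSD-based install):
[CODE]
# TrueNAS CORE is FreeBSD-based, so ARC stats live in sysctl.
sysctl kstat.zfs.misc.arcstats.size   # current ARC size in bytes
sysctl vfs.zfs.arc_max                # configured ARC ceiling (0 = auto-sized)
[/CODE]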

 

z-lf

Cadet
Joined
Jan 20, 2023
Messages
6
@NugentS after your first message I was thinking:
  1. SLOG -> M.2 Optane (maybe not the 905P as it is too expensive, but maybe the 4801X?). I can't mirror it because I only have one NVMe slot.
  2. 4x EVO 870 for the RAID10 (this pool is where the VMs would run).
  3. 4x4TB HDDs in raidz1 (this pool is strictly for backup).
@jgreco I am new to ZFS, so I might have understood this wrong; I thought the rule of thumb for RAM was 1GB of RAM per 1TB of data.
I don't intend this machine to grow past 32TB. (I don't think it will grow above 20.)
Would you say maybe I should stick with the HDDs and make:
6x4TB HDDs in RAID10 (striped mirrors), add a SLOG and an SSD cache (L2ARC), keep utilization between 10% and 50%, and call it a day?

Also, this is for a homelab; it doesn't have to be perfect, I just want to avoid a stupid setup.

Thanks again for the feedback. It's much appreciated!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I thought the rule of thumb for RAM was 1GB of RAM per 1TB of data.

Sure, for basic filesharing duties. However, as explained in the resources I've linked for you, iSCSI or NFS block storage is not "basic filesharing".
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
1. The 4801X would make a fine SLOG - actually better than the 905P. Less capacity, but for a SLOG that doesn't matter, as you won't need much.

2. Samsung make good drives, better than most, and these are not QVO drives, which are basically landfill. They are, however, consumer-grade SSDs, which as a general rule do not perform as well as advertised, because the advertised specs generally leave out random I/O. There is a reason businesses buy enterprise SSDs, and it's not because they love spending money. Having said that, these are for home use, and you won't (for the money) do better. Just don't expect miracles.

3. Backup - err, OK. It's not really a backup if it's on the same machine, but other than that, OK.

Memory. Our resident Grinch is not wrong, but remember where he comes from and what he does for a living (other than moan at us about bad decisions, which he does so much you might think it's his day job). That said, he is right (please, Mr Grinch, don't hit me). Ideally you would be running the working set from RAM (ARC), with the rest read in where required. In your case that won't happen, and you will put a lot of load onto the disks, which are slow in comparison. When you run the "backup" you will tend to evict useful data from ARC (depending on how the ARC actually works - magic), which will then have to be read back in. What you might try, as the speed of backups is rarely important, is "zfs set primarycache=metadata backuppool", which should limit how much ARC the backup pool uses, prioritising the SSD pool. YMMV, and I would benchmark and test for acceptable performance first, before trying that or the even more severe "zfs set primarycache=none backuppool", which I think would make the backup pool run like a dead three-legged dog that's been buried for a year or so.
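Roughly like this ("backuppool" is just a placeholder name for your backup/archive pool):
[CODE]
# Placeholder pool name -- apply to the backup/archive pool only.
zfs set primarycache=metadata backuppool   # keep only this pool's metadata in ARC
zfs get primarycache backuppool            # verify the setting
zfs set primarycache=all backuppool        # revert to the default if it hurts too much
[/CODE]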

Memory is likely to be your major issue here - and, as you say, you can't do anything about it.
 
Last edited:

z-lf

Cadet
Joined
Jan 20, 2023
Messages
6
Thanks a lot, guys. Truly appreciated, and I understand the Grinch-ness, since you get these questions a lot.
I did read some white papers, and I thought I understood how things worked :D. But I totally did not see the RAM issue.

I am going back to the drawing board. The setup I want will be too expensive, unfortunately.
And after this conversation, maybe I don't need a cluster for my homelab:
  1. If I have one machine with all my services and it fails: everything fails.
  2. If TrueNAS fails in my cluster: everything fails.
@NugentS on your Supermicro server, I see you have two pools, one with HDDs and one with SSDs.
Where would you store the data for Graylog/Prometheus/InfluxDB etc.? Not the VM storage, just the data served over NFS?
Would everything go onto the SSDs, and then you archive to the HDDs?
I think maybe I need to go in that direction.

Thank you !
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
As a general principle I store as little as possible on VMs. If a VM needs storage, it stores it not in the VM but on a shared dataset on the NAS. If something really needs low-latency, high-IOPS storage, then I put it on SSDs. (I also have a single-SSD pool for scratch, low-latency data.)
My pools are:
AppPool - storage for Docker data and related data such as config folders, plus server VM zvols. This is a pair of mixed-mode enterprise SSDs.
BigPool - bulk storage, but mirrors for increased IOPS, with a special vdev for small files on certain datasets, L2ARC (kinda redundant) and a SLOG. All bulk data goes here.
ScratchSSD - used for transcoding folders and similar. Single SSD, no redundancy. Who cares if it breaks.
SSDPool - mixed-mode enterprise SSDs for ESXi VMs. Multiple mirrors for IOPS. Exists solely to supply ESXi with storage. Added a SLOG.
[Note: all the SSDs, with the exception of the SLOGs, are second-hand from eBay.] I have had one fail so far - a total failure, the disk is no longer recognised.

My SSDPool is separate from my AppPool because they are different sizes and I didn't think of adding them together as an unbalanced pool. If I had to rebuild, I would probably combine SSDPool and AppPool, stay with mirrors and just have slightly unbalanced data vdevs. More vdevs, more IOPS.

The answer to your question depends on how you use the data, and how much of it there is. A database (which I believe is what you are talking about), if used efficiently, is (or should be) reasonably small, so I would run it on the SSDs together with the container or VM as appropriate. If the DB were large and inefficient, I would store it on the HDDs, in its own dataset via NFS or SMB, tuned for small record sizes, with access from whatever app is using the data. The actual logging data, which is then ingested into the database, would probably go in a dataset on the ScratchSSD to avoid write amplification on the SSDs (a rough sketch of such a dataset is below).
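Something along these lines; the pool/dataset names and the 16K record size are placeholders, and on TrueNAS you would normally create the dataset and the NFS share in the web UI rather than at the shell:
[CODE]
# Placeholder names/values -- TrueNAS usually manages this via the UI.
zfs create -o recordsize=16K -o compression=lz4 tank/metrics
zfs set sharenfs=on tank/metrics       # or create the NFS share in the TrueNAS UI
zfs get recordsize,compression tank/metrics
[/CODE]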
 