BUILD Hardware selection for large warm storage server

ilmarmors

Dabbler
Joined
Dec 27, 2014
Messages
25
I'm looking for the most cost-effective warm storage server with 200TB of capacity initially; 400TB should be enough for the next 2-3 years. A path to expand further would be good to have. All components must be purchased in the EU. Buying used equipment is not really an option here.

Purpose:
  • Storing master files of digitized assets: images, audio, video. Files are mainly 10MB and up.
  • This won't be the primary long-term storage solution; another system is responsible for long-term storage of those assets. That system has a slow turnaround time and can't retrieve individual objects, only sets of objects (similar to Glacier).
  • This server would store an additional copy of the master files so they can be accessed quickly when needed (not frequently). The load is mainly sequential writes when adding new files.
  • This server could have a separate volume (or maybe pool) for smaller access copies, which can be recreated from the master files at any time. These are used more often and are probably not more than 1/10 the size of the master files. There is also the option of hosting access files on a separate server.
  • No deduping needed.
Requirements:
  • Bit rot safe storage. That's why I'm looking at ZFS.
  • 10Gbps or multiple 1Gbps links.
  • Should be able to read and write sequentially at several MBps. Workstations have 1Gbps links, and at most a few of them would generate load at the same time.
  • Write load is sequential. Files, once written, will stay there forever.
  • Reads are mainly on the volume with access files: listing all files in a folder and looking through them. This could be a separate server so the two loads don't mix.
  • Must handle up to 100M files in total.
  • Rack-mountable case.
  • No video transcoding, no jails or other CPU-intensive tasks on the server.
  • Some of the files are compressible, for example uncompressed TIFFs (usually by a factor of 1.1-1.5).
  • Shouldn't be an electricity hog.
  • Easy maintenance and administration
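
As a rough sanity check on the network requirement, here is a minimal Python sketch of the worst case where every active workstation saturates its 1Gbps link. The function names are made up for illustration, and the figures ignore protocol overhead, so treat them as upper bounds rather than measured throughput.

```python
# Rough network-side arithmetic. Assumption: decimal units, no protocol overhead.
MB_S_PER_GBPS = 1000 / 8  # 1 Gbps = 125 MB/s


def aggregate_load_mb_s(workstations: int, link_gbps: float = 1.0) -> float:
    """Upper bound on combined throughput if every workstation saturates its link."""
    return workstations * link_gbps * MB_S_PER_GBPS


def fits_uplink(workstations: int, uplink_gbps: float, link_gbps: float = 1.0) -> bool:
    """True if the combined workstation links do not exceed the server uplink."""
    return workstations * link_gbps <= uplink_gbps


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, "workstations:", aggregate_load_mb_s(n), "MB/s,",
              "fits 10Gbps:", fits_uplink(n, 10.0),
              "fits 2x1Gbps:", fits_uplink(n, 2.0))
```

With a few workstations active at once, a single 10Gbps link has plenty of headroom, while two aggregated 1Gbps links would already be the bottleneck at three workstations.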
I think a single-socket CPU should be fine; such systems cost less and save on operational costs too (less electricity, less heat). Operational cost in W/TB would also be better with high-capacity helium-filled disks. I looked around, and in my opinion the most reasonable hardware selection for the specific case described above currently would be:

SuperMicro SuperStorage Server 5049P-E1CTR36L with integrated X11SPH-nCTF motherboard:
  • Single Socket P (LGA 3647) supports Intel® Xeon® Scalable Processors
  • Up to 1TB 3DS ECC LRDIMM, up to DDR4-2666MHz; 8 DIMM slots
  • 1 PCI-E 3.0 x16, 1 PCI-E 3.0 x8, 1 PCI-E 3.0 x8, 1 PCI-E 3.0 x4 (in x8)
  • 2x 10GBase-T with Intel® X557
  • 36 Hot-swap 3.5" SAS/SATA drive bays with SES3 (24 front + 12 rear);
  • 4 Internal fixed 2.5" drive bays
  • 2 Hot-swap 2.5" NVMe/SATA (rear, optional)
  • SAS3 via Broadcom 3008 controller
  • Server remote management: IPMI 2.0 / KVM over LAN / Media over LAN
  • 7 High-performance 8cm PWM fans
  • 1200W Redundant Power Supplies Titanium Level (96%)
CPU:
Intel Xeon Silver 4114 - 10 cores, 2.20 GHz (3.00 GHz max turbo), 13.75MB L3 cache, TDP 85W
or cheaper
Intel Xeon Silver 4110 - 8 cores, 2.10 GHz (3.00 GHz max turbo), 11MB L3 cache, TDP 85W

Memory: 64GB ECC with 12 disks, 128GB with 36 disks, using 32GB modules

OS disks: 2 x SATA-DOM 32GB
There are 2 yellow SATA DOM ports on this motherboard.

L2ARC: 2 x Intel D3-S4610 480GB, SATA 6Gb/s, 3D, TLC 2.5", 7.0mm, 3DWPD
Could it help speed up reads of the most commonly used access files (there are hot and cold collections of access files)?
It is probably needed only if access files are stored on the same server as master files. If only master files are stored on this server, it won't benefit much from L2ARC.

HDD options:
High capacity enterprise SATA 7200rpm HDDs - with longer warranty and 2.5M hours MTBF
WDC/HGST
WDC/HGST 3.5" 12TB SATA 6Gb/s 7.2K RPM 256M 0F30144 512e (He12)
WDC/HGST 3.5" 14TB SATA 6Gb/s 7.2K RPM 512M 0F31284 512e (He14)
Toshiba MG07ACAxxx
Toshiba 3.5" 12TB SATA 6Gb/s 7.2K RPM 256M 512e (Helium)
Toshiba 3.5" 14TB SATA 6Gb/s 7.2K RPM 256M 512e (Helium)

I was recently quoted prices for the HDDs above: the 14TB WDC was quite a bit more expensive in EUR/TB, but the Toshiba 14TB was comparable with the 12TB drives in EUR/TB.
Any feedback on using Toshiba drives?

Path for further expansion:
One of the SuperMicro JBODs with lower cost per drive bay.

There is a question about the most appropriate pool configuration. I'm not sure whether it would be better to put the colder master files and the warmer access files into separate pools, or whether separate volumes are enough. As I wrote above, putting the warmer access files on a separate server is an option too.

Options for vdev(s) for pool could be:
  • 12-disk RAID-Z2 (max capacity)
  • 12-disk RAID-Z3 (lower capacity, longer time to total data loss)
  • 11-disk RAID-Z3 (not sure whether 8 data disks give a noticeable performance benefit)
  • 10-disk RAID-Z2 (not sure whether 8 data disks give a noticeable performance benefit)
If there is a separate pool (instead of a separate server) for the warmer access files, it could be built from mirror vdevs. For example, 3 vdevs of 10 disks in RAID-Z2 for the colder pool and 3 vdevs of 2-way mirrors for the warmer pool. This works less well with the 11- and 12-disk RAID-Z configurations for the cold pool, because too few disk bays are left free.
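
To compare the vdev options side by side, here is a minimal Python sketch that counts how many vdevs of each width fit into the 36 bays, how many bays remain for mirrors or spares, and the raw capacity of the data disks. The class and function names are made up for illustration. The numbers are simply data disks times disk size, so they ignore RAID-Z allocation overhead, metadata, the TB/TiB difference, and any gain from compression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidzLayout:
    width: int   # disks per vdev
    parity: int  # 2 for RAID-Z2, 3 for RAID-Z3

    @property
    def data_disks(self) -> int:
        return self.width - self.parity


def summarize(layout: RaidzLayout, bays: int = 36, disk_tb: int = 14) -> dict:
    """Fill the chassis with as many whole vdevs as fit and report what is left."""
    vdevs = bays // layout.width
    return {
        "vdevs": vdevs,
        "spare_bays": bays - vdevs * layout.width,
        "data_tb": vdevs * layout.data_disks * disk_tb,  # raw, before ZFS overhead
    }


OPTIONS = {
    "12-disk RAID-Z2": RaidzLayout(12, 2),
    "12-disk RAID-Z3": RaidzLayout(12, 3),
    "11-disk RAID-Z3": RaidzLayout(11, 3),
    "10-disk RAID-Z2": RaidzLayout(10, 2),
}

if __name__ == "__main__":
    for disk_tb in (12, 14):
        for name, layout in OPTIONS.items():
            print(f"{disk_tb}TB disks, {name}: {summarize(layout, disk_tb=disk_tb)}")
```

Only the 10-disk layout leaves six bays free, which is exactly what three 2-way mirrors for a warmer pool would need.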

Initially there is one system with 36 disk bays (+2 slots for 2.5" drives). Adding a JBOD would give +24 or +44 bays, i.e. 60 or 80 disk bays in total.

Any suggestions and comments welcome.

Thanks,
Ilmārs
 

ilmarmors

I ended up with the following configuration (it came fully assembled from Supermicro) in summer 2020:

1x SSG-5049P-E1CTR36L that comes with X11SPH-nCTF motherboard
1x P4X-CLX4214R-SRG1W - Intel Xeon Silver 4214R 12C/24T
4x MEM-DR432L-SL01-ER29 - 32GB DDR4-2933 2Rx4 LP ECC RDIMM (128GB RAM)
2x HDS-S2T1-MZ7LH240HAHQ05 - Samsung PM883 240GB SATA 6Gb/s V4 (OS disks in mirror)
1x MCP-220-84701-0N - SC847 internal drive tray for one 3.5"
1x MCP-220-82619-0N - 2.5x2 NVMe Drive Kit
2x CBL-SAST-0819 - OCuLink v 1.0,INT,PCIe NVMe SSD
1x HDS-IUN0-SSDPE2KE016T8OS - Intel OPAL D7-P4610 1.6T NVMe PCIe3.1 x4 3D TLC 2.5" 15mm 3DWPD (L2ARC)
36x HDD-A14T-MG07SCA14TE - Toshiba 3.5" 14TB SAS 12Gb/s 7.2K RPM 256M 512E
1x CBL-0082L - Y SPLIT SATA POWER ADAPTER
1x CBL-0097L-03 - 30AWG 50CM IPASS to 4 SATA W/50CM SB

Configured two pools:
1. Big pool for data that almost never changes: 3 vdevs of 11 disks in Z2 + 1 hot spare (34 disks in total)
2. Scratch pool for data that changes all the time: 1 vdev of 2 disks in a mirror. It is temporary space where data is uploaded and processed before being put into the big pool.
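
For reference, the bay accounting and raw capacity of this layout can be sketched in a few lines of Python. The helper names are made up for illustration, and the capacity is raw data-disk capacity only (no ZFS overhead, no compression), converted from the decimal TB on the drive label to the binary TiB that ZFS tools report.

```python
DISK_TB = 14      # Toshiba MG07SCA14TE, decimal terabytes
TB = 10**12
TIB = 2**40


def bays_used(vdevs: int = 3, width: int = 11, spares: int = 1, scratch: int = 2) -> int:
    """Total 3.5" bays occupied by the big pool, its hot spare and the scratch mirror."""
    return vdevs * width + spares + scratch


def big_pool_data_tb(vdevs: int = 3, width: int = 11, parity: int = 2) -> int:
    """Raw capacity of the data disks in the big pool, in decimal TB."""
    return vdevs * (width - parity) * DISK_TB


def tb_to_tib(tb: float) -> float:
    return tb * TB / TIB


if __name__ == "__main__":
    data_tb = big_pool_data_tb()
    print("bays used:", bays_used())
    print("big pool raw data capacity:", data_tb, "TB =", round(tb_to_tib(data_tb), 2), "TiB")
    print("scratch mirror:", DISK_TB, "TB =", round(tb_to_tib(DISK_TB), 2), "TiB")
    print("covers 200TB start:", data_tb >= 200, "| covers 400TB target:", data_tb >= 400)
```

All 36 bays are in use, and the raw figure lands somewhat below the 400TB target before compression, so the JBOD expansion path stays relevant.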

Currently using FreeNAS 11.3-U5; I will probably upgrade to TrueNAS 12.0-U2 when it is released. There is one free slot for another hot-swappable NVMe drive at the back of the server, but I guess that to use it for the new feature of storing metadata on SSD I would need 3 drives to match Z2 redundancy (or am I wrong?), and the change applies only to new writes, so I will probably stick with the single L2ARC I have now.
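
On the redundancy question, the counting can be sketched as follows: a RAID-Z vdev survives as many disk failures as it has parity disks, and an n-way mirror survives n-1. The Python below is only that arithmetic, with made-up function names. It says nothing about rebuild times or about how much risk is acceptable, but it is worth keeping in mind that losing a metadata (special) vdev means losing the pool, whereas losing an L2ARC device does not.

```python
def failures_tolerated(kind: str, width: int) -> int:
    """Disk failures a single vdev survives; kind is 'mirror', 'raidz', 'raidz2' or 'raidz3'."""
    if kind == "mirror":
        if width < 1:
            raise ValueError("a mirror needs at least one disk")
        return width - 1
    if kind.startswith("raidz"):
        parity = int(kind[len("raidz"):] or 1)  # plain 'raidz' means single parity
        if width <= parity:
            raise ValueError("vdev must be wider than its parity level")
        return parity
    raise ValueError(f"unknown vdev kind: {kind}")


def mirror_width_to_match(parity: int) -> int:
    """Smallest mirror width that tolerates as many failures as RAID-Z with this parity."""
    return parity + 1


if __name__ == "__main__":
    print("11-disk raidz2 tolerates:", failures_tolerated("raidz2", 11))
    print("2-way mirror tolerates:", failures_tolerated("mirror", 2))
    print("mirror width matching raidz2:", mirror_width_to_match(2))
```

By this count a 3-way mirror is what matches the two-failure tolerance of the Z2 vdevs.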

The server is currently being filled with data. The configuration above is probably a bit of overkill for the current load, but it should be capable of handling the much heavier loads that warm storage use could generate in my particular case in the future.
 