AlexGG
Contributor
- Joined
- Dec 13, 2018
- Messages
- 171
I have a set of files I think I want to back up. The dataset is autogenerated, and the original backup strategy was to just regenerate it should the storage fail. However, with time, the generation complexity increased, and the estimated time to regenerate is now about a month, expected to increase further. So I am looking to get a backup server, and I'm considering TrueNAS.
The current dataset size is about 15 TB in 7 million files and 150K directories. Microsoft deduplication cuts that to about 3 TB (saving 12 TB). The dataset is mostly variations of the same files, so deduplication is rather effective.
Expected growth over the lifetime of a backup system is about 10x. The deduplication ratio is expected to remain the same, 5-to-1. So I am looking to back up about 150 TB worth of data deduplicated down to about 30 TB.
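For reference, here is the arithmetic behind those numbers as a small Python sketch. The average file size is only a rough figure derived from the totals above, and the function and variable names are made up for illustration.

```python
# Figures from the post, in decimal TB (1 TB = 1e6 MB).
current_logical_tb = 15.0   # dataset size before deduplication
current_stored_tb = 3.0     # size after Microsoft deduplication
growth_factor = 10          # expected growth over the backup system's lifetime
current_files = 7_000_000


def project(logical_tb, stored_tb, growth):
    """Return (dedup ratio, future logical TB, future stored TB).

    Assumes the deduplication ratio stays constant as the dataset grows.
    """
    ratio = logical_tb / stored_tb
    future_logical = logical_tb * growth
    future_stored = future_logical / ratio
    return ratio, future_logical, future_stored


ratio, future_logical_tb, future_stored_tb = project(
    current_logical_tb, current_stored_tb, growth_factor
)

# Rough average file size today, derived only from the totals above.
avg_file_mb = current_logical_tb * 1e6 / current_files

print(f"dedup ratio: {ratio:.0f}-to-1")
print(f"future logical size: {future_logical_tb:.0f} TB")
print(f"future stored size: {future_stored_tb:.0f} TB")
print(f"average file size today: {avg_file_mb:.2f} MB")
```

If the ratio drifts over time, the stored size changes proportionally, so the 30 TB figure is only as good as the constant-ratio assumption.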
The rate of change is relatively slow. Every once in a while, maybe once a week, a group of files is added to the dataset. I intend to manually sync the primary storage and the backup, and keep the backup unit mostly powered down. No snapshots or versioning are required.
Please tell me what kind of hardware I should be looking for, considering the requirement for deduplication. I would have figured it out myself, except for that part. I can't seem to find any consistent guidance on how to size a system for deduplication.