How full (beyond 80%) can a pool get before a failure?

Macaroni323

Explorer
Joined
Oct 8, 2015
Messages
60
I am expanding my home TrueNAS Core v13.6 server from 8TB (4 x 4TB drives in a 2 x 2 mirror layout) to 16TB (4 x 4TB drives plus 2 x 8TB drives in a 3 x 2 mirror layout). It holds a collection of 20+ years of family pictures, videos, music, etc., so I really don't want to mess up. I currently have 2 x 8TB drives in an external enclosure in RAID1 as my backup for the server, and the 8TB TrueNAS server is synced up with those drives. I've now pulled the 2 x 8TB drives from the external RAID enclosure and replaced them with 2 x 16TB drives, again in RAID1. I will get the 16TB RAID backup synced up with the server and then move the old 8TB drives into the pool to expand the server from 8TB to 16TB. At that point all will be back to stability. :wink:
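For the record, the eventual expansion step should look roughly like this from a shell, assuming my pool is named "tank" and the two 8TB drives come up as ada5/ada6 (hypothetical names; check yours first):

  zpool status tank                           # confirm the current 2 x 2 mirror layout
  zpool add tank mirror /dev/ada5 /dev/ada6   # attach the 8TB pair as a third mirror vdev

On TrueNAS Core the GUI route (Storage > Pools > Add Vdevs) is the safer option, since the middleware partitions and labels the disks the way it expects.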

Now the real issue. Since FreeNAS v11.3 (or so) I've been seeing a problem with the files on the server, which shows up on the forum here:


I can view files on the server with no issues. I haven't actually copied many files from the 8TB server to my computer, but every once in a while over the last few years I've run into the failure. No problem... I copy the file (or directory) in situ, getting <filename> - copy.<xxx>, then delete the original and rename the copy to the original name, and all is back to normal functionality. I originally thought maybe a few % of the 8TB was affected by this issue but...

Alas, while copying my server to the new 16TB drive (on Win10 via the network), I am maybe 2TB in and have already run into 30,000 files that cannot be copied to the new 16TB backup drive due to that issue. True to form, I am able to copy the files, or sometimes the entire directory, in situ on the server, and the copy is fine. So...

I have some directories that, when copied, will temporarily fill the server from 86% (6% over the limit, which is why I'm expanding it) to nearly 95-98% full. Well above the 80% recommended max. Is this going to cause data loss if I temporarily get near the maximum storage in the pool? I copy the directory, then bit-compare the copy against the original (probably overkill), delete the original, and rename the copy to the original name. At that point the pool is back down to about 86%. I do not want to use my 8TB drives to expand the server just yet, since I haven't yet confirmed that every damaged file on the server has been dealt with. I will dismantle my new external 16TB backup setup after successfully getting all files copied (bit perfect) to the new backup, and I will reconfirm that the server is an exact copy of the original 8TB backup after all these copy-delete-rename antics are done.
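The copy-compare-delete-rename loop I'm doing by hand, expressed as a server-side shell sketch (hypothetical paths; the compare step is the part I'd never skip):

  src="/mnt/tank/photos/2019-trip"        # original directory (hypothetical)
  tmp="/mnt/tank/photos/2019-trip.copy"   # in-situ copy
  cp -Rp "$src" "$tmp"      # FreeBSD cp does not carry xattrs over to the copy
  diff -r "$src" "$tmp"     # recursive byte-for-byte compare; must come back clean
  rm -rf "$src" && mv "$tmp" "$src"   # only after a clean compare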
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Is this going to cause data loss if I temporarily get near the maximum storage in the pool?
No, of course not, but performance is likely to suffer and deleting stuff once you run out of space is always painful. I'd double-check that you have a reservation for a few GB in an empty dataset just to keep you from accidentally running out of space.
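Something like this, assuming a pool named "tank" (substitute your own):

  zfs create tank/reserve                  # empty dataset, exists only to hold space
  zfs set reservation=5G tank/reserve      # the pool now always has 5G spoken for

If the pool ever fills up completely, zfs set reservation=none tank/reserve gives you breathing room instantly.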
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
Some users have seen the "filename too long" error because the files contain xattrs exceeding 2 MiB in size (the message is not a useful translation of errno, via NT status code, to human-readable form). This can happen due to server misconfiguration at the time the file was written (the most common case I've seen is copy-pasted Samba vfs objects thrown into auxiliary parameters, but particulars may vary), MacOS clients with extremely large resource forks, etc. A local copy strips the xattrs and makes it work (of course), but you lose whatever was in the xattr.
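If you want to see whether a given file is carrying oversized xattrs, something along these lines works from a FreeBSD shell (hypothetical path; attribute names depend on what wrote them, and names with spaces would need extra care):

  f="/mnt/tank/photos/suspect.jpg"   # hypothetical file
  for a in $(lsextattr -q user "$f"); do
    printf '%s: %s bytes\n' "$a" "$(getextattr -q user "$a" "$f" | wc -c)"
  done

Anything in the multi-MiB range is a likely culprit.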
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Going past 90% will be an unpleasant experience.
Going past 95% will be an awful experience.

Fragmentation is going to be critical with so little space left: check with zpool list -o name,fragmentation poolname.
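Capacity is worth watching in the same command, e.g. (assuming a pool named tank):

  zpool list -o name,size,allocated,free,capacity,fragmentation tank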
 

Macaroni323

Explorer
Joined
Oct 8, 2015
Messages
60
Going past 90% will be an unpleasant experience.
Going past 95% will be an awful experience.

Fragmentation is going to be critical with so little space left: check with zpool list -o name,fragmentation poolname.
<my pool> FRAG: 28%

I am only temporarily going to 90-95% during the recovery of these files that can't be copied. The number of files with the "The filename you've specified is either invalid or too long - even though the filename is not too long!" error is enormous. The affected files all seem to be from the mid-2019 time frame. FreeNAS (at that time) was giving me quite a few errors, with hard drives showing checksum errors and read errors. Originally I thought drives were failing, then I thought SATA cables were failing, and then I decided to build a new server.

My old motherboard, an Intel D975VBX2 with a XEON (in Pentium pinout), was stranded at a max of 8GB memory, and I was stuck on FreeNAS. With my new ASRock C3758D4I-4L mobo, 32GB memory, and 120GB SSD, I moved the pool of 4 x 4TB WD Reds to the new system and have seen no issues for nearly two years now. The old mobo was failing.

The copying problem was there back then too, but I originally thought it affected a small number of files. I was wrong (probably well over 100k files were this way). The only way I've found to get out of this is to copy the files locally and then delete the originals.

I have the 4 x 4TB WD Reds in TrueNAS now, and I have a backup of the entire TrueNAS on 2 x 8TB WD Reds in a My Book Duo enclosure (RAID1). I have removed the 2 x 8TB drives with files intact and am now trying to create a second backup in the My Book Duo with 2 x 16TB drives in RAID1. There is the problem... the TrueNAS files can't be copied to the My Book backup. So I'm painfully copying the largest folder I can without taking the pool too close to full, then deleting the original (with some binary compare checks going on during this duplication).

There are also a few files (about 10 so far) that were not copyable in any way; I had to read those with a binary editor and write them out to the copy to get around it (yes, some of the metadata is messed up, but they behave correctly when copied to another location on a Windows machine, and after the binary read and write the two compare perfectly).

I have gotten 2 read errors while accessing some of the old files on one of the WD Red drives (ADA1), so I will look into that. I also got 1 "Multi Zone Error" and a "Raw Error Rate" of 63 on another drive (ADA4). As I hit these directories with uncopyable files, it appears the drives are experiencing some issues. Not sure if the failures were bad writes from the previous motherboard or what. So far ADA1 and ADA4 are on separate vdevs, and they're simple mirrors (and the 2 x 8TB drives are a complete backup), so... I have two new WD Reds and I'll replace the suspect drives as soon as I get the 16TB backup filled (but that requires clearing up the uncopyable file issue first).
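The checks I'm planning to run on ADA1 and ADA4, for the record (pool name "tank" is hypothetical here):

  zpool status -v tank         # -v lists any files with permanent errors
  smartctl -a /dev/ada1        # full SMART detail for the suspect drives
  smartctl -a /dev/ada4
  smartctl -t long /dev/ada1   # kick off a long self-test on a suspect drive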
 