SOLVED Disk full - Can't delete any files. Please Help!

fracai · Oct 28, 2014

cyberjock said:
Ok, but you didn't really change a damn thing. Pretend you have a 1TB pool. You create an empty dataset of 200GB of reserved space and never use the dataset. It's basically there to stop you from filling the server.

So what happens? You now have 800GB of disk space "free" and you'll start getting emails when you've put 600GB of data on the pool. Why? Because your 600GB of data plus the 200GB reserved dataset puts you at 80%. So not only did you not prevent a full pool condition (which is what everyone thinks they can somehow accomplish) but you're going to get the emails even sooner than before. I'd call that a lose-lose situation.

No one is saying a reservation will keep the pool from reaching 100%. We're saying it will allow you to recover from the 100% scenario without using a hack like truncating a file.

So if the lose-lose in this scenario is that you get an 80% email earlier I wouldn't call it lose-lose. For one, it relies on an extreme example of setting a 20% reservation. I suggested 1GB, which in this example would be 0.01%. jgreco suggested 20GB, which would be 2%. In larger pools, that percentage goes down. And given the data required for even several transactions that are just deleting files and snapshots, I still feel like 1GB would be enough.

jgreco said:
Neither of these guarantees that you won't have the filesystem freak out and report out of space, but having that limit there means that rather than praying you have enough space to do a metadata update that lets you delete a big file and free up some space (the "usual" ZFS fix), you can just twiddle a limit.

cyberjock said:
So we had a lose-lose with a quota and a lose-lose with a reservation. So I just explained how neither one will stop you from being stupid and locking out your pool. ;)

I personally agree that quotas aren't the solution, but calling reservations lose-lose just doesn't make any sense to me.

It actually sounds like something that could be added to FreeNAS. There's already the chunk of data that is partitioned out of every disk for "swap" when the real reason seems to be to allow users to reduce the swap reservation if they need to replace a disk that doesn't have exactly the same number of sectors. Why not create another system dataset containing a single file explaining that it's there to alleviate pools that have reached 100% capacity?

fracai · Oct 28, 2014

SirMaster said:
I didn't see anyone mention it, but you can recover from a full pool issue by expanding your pool.
You will then be able to delete data and use the pool as normal.
Of course you cannot shrink the pool again afterwards.

I thight I asked about this, I meant to, but I did assume ZFS wouldn't be able to expand into the new space without first trying and failing to write transaction data to the existing pool.

At least not yet...
In 2 weeks at the OpenZFS Developer Summit Matt Ahrens (ZFS founder) and Alex Reece are going to be giving a talk about vdev removal as a feature of OpenZFS soon.
http://open-zfs.org/wiki/OpenZFS_Developer_Summit_2014

Well that's intriguing.

cyberjock · Oct 28, 2014

fracai said:
So if the lose-lose in this scenario is that you get an 80% email earlier I wouldn't call it lose-lose. For one, it relies on an extreme example of setting a 20% reservation. I suggested 1GB, which in this example would be 0.01%. jgreco suggested 20GB, which would be 2%. In larger pools, that percentage goes down. And given the data required for even several transactions that are just deleting files and snapshots, I still feel like 1GB would be enough.

You're missing out on the most important argument being made here. That the email notications are insufficient. If the email notifications were sufficient then we wouldn't have this thread at all. You'd get emails and you'd take action. Emailing is the *only* thing you can truly rely on to prevent problems. If the argument is that you want some way to prevent the pool/dataset from hitting a "full" condition (and ZFS going "solid" and being locked) and you want to use datasets with quotas and/or reservation to do that I just explained why those do NOT work.

If emailing is insufficient then you need to find another file system that will let you be less responsible for the consequences of filling a drive/RAID array.

fracai said:
I personally agree that quotas aren't the solution, but calling reservations lose-lose just doesn't make any sense to me.

It actually sounds like something that could be added to FreeNAS. There's already the chunk of data that is partitioned out of every disk for "swap" when the real reason seems to be to allow users to reduce the swap reservation if they need to replace a disk that doesn't have exactly the same number of sectors. Why not create another system dataset containing a single file explaining that it's there to alleviate pools that have reached 100% capacity?

The real reason is exactly what it sounds like. Swap is for swap. It is not a secret reason for reducing swap reservation for disk replacement. That's what we use it for, but that was not even a consideration for the devs. I know.. I talked to them about it and they had no idea about this. Remember, the devs get to use enterprise-class hardware. That includes hard drives. Enterprise class disks *always* come out to be the same size. Anything less would render their disks useless for not only ZFS but also for many backup software that do disk images as well as rebuilding hardware RAID arrays.

We're just smart enough to know of an alternative use for the swapspace. But the devs had no clue you could do this or that it was even a problem.

As for your comment for creating a file to explain its there to alleviate 100% capacity there is zero difference between your "use a single file" scenario and a reservation on a dataset. ZFS will treat them the same. Remember, a dataset with a reservation will automatically allocate that size to itself. It is literally immediately "used" disk space take from the pool and given to the dataset. tl;dr you create a 1TB file and a dataset with a 1TB reservation and the result is exactly the same. 1TB is "used" in the pool.

In short:

Emails are your protection from the pool going solid. There is no way to "hack" zfs by creating a dataset, file, zvol, etc with any kind of quota and/or reservation to get around this limit.

cyberjock · Oct 28, 2014

And SirMaster's stuff, if/when it gets released, is probably a year or so away from being implemented in FreeNAS. So this clearly is a feature that we're going to see in the distant future (if ever). They release the code and then later it's ported to FreeBSD. Then it's tested for a year (or more) before hitting -STABLE. Only then is it added to FreeNAS, and it would probably be in whatever release comes after that point, which could be 3-6 months after that. So we're talking 2016 most likely before you'd have a chance of seeing this feature.

SirMaster · Oct 28, 2014

Not sure if this might also be related to this issue which is already fixed in Illumos:
https://www.illumos.org/issues/4950
https://github.com/illumos/illumos-gate/commit/4bb73804952172060c9efb163b89c17f56804fe8

This fix has also been pulled into the Linux HEAD version too but I guess it's not in FreeBSD yet.
https://github.com/zfsonlinux/zfs/pull/2784

The workaround they said is to find a file that was written that is outside of a snapshot and truncate it. `zfs diff` can be used to do that. That allows you to delete an old snapshot.

If you don't have snapshots though seems like this is not your issue.

jgreco · Oct 28, 2014

cyberjock said:
Yes, but that screws up other things. So you set a quota. You can still fill the dataset to capacity and it will *still* freak out. But you just buttf*cked yourself because the 80% emails won't be sent unless the *pool* goes over 80%. So you are very likely to compound the problem with bigger problems and you didn't stop ZFS from freaking out (but may have put yourself in a position where you get no email but locked your dataset up). So gained nothing but definitely made your server more likely to have ZFS freak out. I'd call that a lose-lose situation.

That mostly seems like a deficiency in FreeNAS, not the strategy. The threshold should be tunable (if it isn't already).

You can call it a lose-lose situation if you want. As someone who works with storage professionally, I can tell you that I place a very high value on recoverability. I don't care too much if a dataset locks up. That represents a planning failure that can be addressed (allocate more space, buy more space, be smarter about using space, etc) that should never happen. But in the real world, these things do happen, and when they happen, they need to be recoverable. That means it isn't acceptable to trust that truncating a "big file" somewhere in the filesystem can solve the problem. Being able to reserve some emergency space, on the other hand, does allow for an appropriate fix.

Ok, but you didn't really change a damn thing. Pretend you have a 1TB pool. You create an empty dataset of 200GB of reserved space and never use the dataset. It's basically there to stop you from filling the server.

So what happens? You now have 800GB of disk space "free" and you'll start getting emails when you've put 600GB of data on the pool. Why? Because your 600GB of data plus the 200GB reserved dataset puts you at 80%. So not only did you not prevent a full pool condition (which is what everyone thinks they can somehow accomplish) but you're going to get the emails even sooner than before. I'd call that a lose-lose situation.

Seems desirable to me. But then again my gameplan isn't one of trying to fill every byte of storage available. For ESXi SAN storage (not FreeNAS), we have 8 x 2TB disks as 4 sets of 2TB RAID1. Each of those is provisioned to provide 1.5TB of space, because the SAN gripes above that threshold. Of the 6TB usable space that this results in, currently we're at 1.73TB used. Figure /that/ as a percentage of raw disk space! ;-)

Neither of these guarantees that you won't have the filesystem freak out and report out of space, but having that limit there means that rather than praying you have enough space to do a metadata update that lets you delete a big file and free up some space (the "usual" ZFS fix), you can just twiddle a limit.

Yup. The twiddling a limit is a massive win over the possibility of "your pool is fatally screwed because you lack even the space to do a truncation operation." Which is unlikely but possible.

So we had a lose-lose with a quota and a lose-lose with a reservation. So I just explained how neither one will stop you from being stupid and locking out your pool. ;)

False==true....? Because the whole point here is recoverability. Either way, the twiddling a limit is still a win.

Right. I'm not talking about mid-transaction where you will *always* need more disk space. I'm talking about after the transaction is completed. After all, people are trying to argue that they want ZFS to identify when it's full and to not let it go too full. The problem is you have no way of identifying at the moment a transaction is created if at the end the pool will be more full or less full. So doing something like hardcoding 16KB of disk space to handle "the next transaction" won't gain you anything. That's all I was trying to explain.

But it's the mid-transaction bit that's essential to the failure here. If you let the pool fill, bad things happen because of what happens in the middle.

The traditional ZFS solution is to manage space so that the pool doesn't fill all the way. You seem to disagree but you're not suggesting any workable solution other than "don't do that" (or did I miss your solution?) That's a turkey of a solution if you can instead do something that will at least allow an operator to intervene and manage the issue.

depasseg · Oct 28, 2014

SirMaster said:
Not sure if this might also be related to this issue which is already fixed in Illumos:
https://www.illumos.org/issues/4950
https://github.com/illumos/illumos-gate/commit/4bb73804952172060c9efb163b89c17f56804fe8

This fix has also been pulled into the Linux HEAD version too but I guess it's not in FreeBSD yet.
https://github.com/zfsonlinux/zfs/pull/2784

The workaround they said is to find a file that was written that is outside of a snapshot and truncate it. `zfs diff` can be used to do that. That allows you to delete an old snapshot.

If you don't have snapshots though seems like this is not your issue.

That's wierd. Looking at the issue, the bugfix wasn't "send alert email". It must be a different issue.

jgreco · Oct 28, 2014

cyberjock said:
1. emails are your protection from the pool going solid. PERIOD. If this isn't good enough, take your ball and go elsewhere. ZFS clearly isn't for you.

E-mails are not any sort of protection. Assuming that some human is available and reading their e-mail and can log in in .01 seconds to debug and take some magic action that will prevent the pool from filling is ridiculous.

2. There is no way to "hack" zfs by creating a dataset, file, zvol, etc with any kind of quota and/or reservation to get around this limit.
3. If #2 is unpalatable, see #1.

You are being ridiculous and pedantic. There are absolutely ways to mitigate the issue to allow more rapid recovery.

This is a non-discussion. I can't believe we're on page 3 of posts on this topic any people are still discussing this. We've had this discussion dozens of times on this forum and it never changes...

Possibly because you've taken an untenable position.

cyberjock · Oct 28, 2014

jgreco said:
E-mails are not any sort of protection. Assuming that some human is available and reading their e-mail and can log in in .01 seconds to debug and take some magic action that will prevent the pool from filling is ridiculous.

WHAT!? Why .01 seconds!? I've had people wait days (or weeks). You get an email at 80%, not at 99.99%. 80%! If your pool is filling so fast that you went from 79.9% to 100% full in .01 seconds you are a badass and I want to see your server! But the .01 second response time you assume is required is NOT required.. unless you are a badass.

The 80% is plenty of time for 99% of TrueNAS users to acquire funds to purchase more hard drives, get a quote, buy the product, ship the product, then install the product before they run out of space. What is that not good enough for the rest of the world?

jgreco said:
You are being ridiculous and pedantic. There are absolutely ways to mitigate the issue to allow more rapid recovery.

Remember the original argument bro. It wasn't how to allow for "more rapid recovery". It was to completely mitigate any and all requirement on the administrator from having to take action (aka no "recovery" rquired... a-la Windows NTFS where you just delete a file). Once the user(s) fill the pool (or dataset) only the admin can fix it. We've only been talking about the "unsufferable" task of requiring the admin to take action. People seem to want to be able to exclude the admin from being required from recovery, and that's just not possible, which I've tried to explain again and again. I've always tried to explain why you can't stop the pool from filling until it's solid with quotas or datasets. But you get warning emails that should be plenty of advance to take action. I'm not sure why you think it should be a tunable.

The bottom line is that you can manage your pool however you see fit. FreeNAS has the benefit of warning you if things get to 80%. You should get a daily email at 80% and an hourly email at 95% if memory serves me. If, through your own action or inaction, you've taken a dataset or pool solid, there is no recovery that the end-user can do. PERIOD. The admin will *always* be required to take action, which the OP made very clear is unacceptable when he said:

I'm getting the feeling that everyone is arguing over a bunch of stuff that isn't even related to their side of the story, so I'm bowing out of this thread. I've said all I need to say and either people are expecting the impossible, wanting something that's not technically feasible, or don't understand the discussion at hand. Maybe all 3. But I'm done here. I know how to manage my server, and I've helped quite a few people recover solid datasets and pools. I'm sure I have a firm grasp of the technical reasons for the problem as well as how to rectify them. I also have a firm grasp of how FreeNAS tries to proactively warn you if your pool starts to get too full. Other than that, I've got nothing more to add.

Remember.... someone brought up this earlier and it's *all* come from one post. I've been trying to answer him and everyone else is talking about other stuff.

macxs said:
But that's not a good solution. Why can't the FS reserve at least some KB for it's next transaction? Or why can't I reserve space that won't make the filesystem accidentially go dead? When the problem is known, why are there no mechanisms to prevent this?
In fact this means, that you cannot offer network storage from FreeNAS to users or services, especially where you cannot estimate the final size.

A server admin said "but that's not a good solution" referring to an admin having to zero out a file somewhere.

Good day gentlemen!

jgreco · Oct 28, 2014

Y'know, Cyberjock, sometimes threads evolve. I'm sorry, it must have been terribly confusing and frustrating to you to see people talking about what must have seemed like unrelated random issues.

..must...not... roflmao....

/me pokes pointy sticks at doggies for fun ;-)

fracai · Oct 28, 2014

cyberjock said:
You're missing out on the most important argument being made here. That the email notications are insufficient. If the email notifications were sufficient then we wouldn't have this thread at all. You'd get emails and you'd take action. The *whole* argument is that emails are NOT sufficient to prevent this condition (which I'm trying to politely say is idiocy). Emailing is the *only* thing you can truly rely on to prevent problems.

Is anyone actually making this argument? I don't think I've seen anyone actually argue that emails are at issue here. But, I'll bite. The emails are helpful, but no they aren't sufficient to prevent the problem. A process might error out and start filling the disk faster than you can reach a terminal to fix the problem. Or maybe the network went down and the email was lost. There are all sorts of reasons why the email isn't sufficient to prevent the disk from reaching 100%. The primary reason why it's not sufficient is that the email message itself doesn't stop the disk from filling. It requires administrator action to do that.

The actual discussion is that however it happened, the disk is full. How do you recover?

If the argument is that you want some way to prevent the pool/dataset from hitting a "full" condition (and ZFS going "solid" and being locked) and you want to use datasets with quotas and/or reservation to do that I just explained why those do NOT work.

No, that's not the issue. The pool is full, how do you recover is the discussion.

If emailing is insufficient then you need to find another file system that will let you be less responsible for the consequences of filling a drive/RAID array.

No, emailing is a feature of FreeNAS, and presumably other systems. It has nothing to do with ZFS. The solutions in this thread are applicable to any system that is using ZFS.

As for your comment for creating a file to explain its there to alleviate 100% capacity there is zero difference between your "use a single file" scenario and a reservation on a dataset. ZFS will treat them the same. Remember, a dataset with a reservation will automatically allocate that size to itself. It is literally immediately "used" disk space take from the pool and given to the dataset. tl;dr you create a 1TB file and a dataset with a 1TB reservation and the result is exactly the same. 1TB is "used" in the pool.

The file is there to explain what the dataset and reservation is for, not to take up space. The file would need to be a paragraph or so. The dataset would have a reservation of 1-20GB or 100GB. Whatever makes sense to preserve enough space for the transactions to free up space.

1. emails are your protection from the pool going solid. PERIOD. If this isn't good enough, take your ball and go elsewhere. ZFS clearly isn't for you.
2. There is no way to "hack" zfs by creating a dataset, file, zvol, etc with any kind of quota and/or reservation to get around this limit.
3. If #2 is unpalatable, see #1.

No one is trying to use hacks to keep the pool from going to 100%. We are looking for ways to recover access to the pool if it does get there.

This is a non-discussion. I can't believe we're on page 3 of posts on this topic any people are still discussing this. We've had this discussion dozens of times on this forum and it never changes...

I'm clinging to the hope that you just haven't read what people are actually discussing. Are you really trying to say people shouldn't be discussing ways to keep the pool useable in the event that it does get to 100%?

Is your position honestly that it's not worth discussing because the pool should never get there because FreeNAS sends email and that's clearly all that's required?

Remember the original argument bro. It wasn't how to allow for "more rapid recovery".

So now that it's about "more rapid recovery" are you ok with discussing that?
**edit** I take this back. The original post was about how to recover from this. Email alerts didn't come in until half way through page two. And that was today. The original post was in 2013 and this topic has indeed been brought back up several times since then. In other words, it's always been about recovery and there have been a few detours along the way.
**end-edit**

A server admin said "but that's not a good solution" referring to an admin having to zero out a file somewhere.

At least in my opinion it's not a good solution because it feels like a bug. ZFS is supposed to be copy on write, I wouldn't expect truncating a file to free up any space. What happens if I truncate that file and then try to restore it from a snapshot? Do I get the data? Or is it gone because I wrote to it from /dev/null? Does truncating the file only free up space when the pool is full? Is that because the metadata fails to write but the truncation succeeds? In every scenario this behavior sounds like a bug and I'd expect it to be fixed.

Reserving space that can be utilized if the rest of the pool hits 100% doesn't sound like a bug and it sounds like a reliable mitigation strategy.

jgreco · Oct 28, 2014

I think several of the posts by @fracai have summed this all up very nicely especially in #53 and #58. I am closing the thread and encourage people to avoid filling their pools, but if you do, being prepared with some mitigation strategies is an option.

Important Announcement for the TrueNAS Community.

SOLVED Disk full - Can't delete any files. Please Help!

fracai

Guru

fracai

Guru

cyberjock

Inactive Account

cyberjock

Inactive Account

SirMaster

Patron

jgreco

Resident Grinch

depasseg

FreeNAS Replicant

jgreco

Resident Grinch

cyberjock

Inactive Account

jgreco

Resident Grinch

fracai

Guru

jgreco

Resident Grinch

Similar threads

Important Announcement for the TrueNAS Community.

SOLVED Disk full - Can't delete any files. Please Help!

Guru

Guru

Inactive Account

Inactive Account

Patron

Resident Grinch

FreeNAS Replicant

Resident Grinch

Inactive Account

Resident Grinch

Guru

Resident Grinch

Important Announcement for the TrueNAS Community.

Related topics on forums.truenas.com for thread: "Disk full - Can't delete any files. Please Help!"

Similar threads