SOLVED Disk full - Can't delete any files. Please Help!


fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
The real problem is that you can fill things to the point where some metadata block that needs to be updated is larger than the free space available on the disk. ZFS is a copy-on-write system so if it cannot write out the updated metadata, you're hosed - even if that metadata would be part of an operation that ultimately removes a file and clears up space. I'm simplifying this just a bit.
I get that, it just seems like space reservation should be an actual solution, but right now it just moves up the point at which you'll fill the pool.

Would the solution be to make space reservation apply only to data and not to transaction metadata?
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
Isn't the point of a quota to stop introducing data? In our case, according to @cyberjock, setting a quota effectively set our usable disk space to a lower amount. Why in the world would I ever set my usable space to 4.3TB if I could have had 5.4TB all along? Why would I limit myself to less storage if I'm not actually stopping the influx of data? The quota system is therefore not a true quota system.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I get that, it just seems like space reservation should be an actual solution, but right now it just moves up the point at which you'll fill the pool.

Would the solution be to make space reservation apply only to data and not to transaction metadata?

Possibly. That runs into other accounting issues though. What happens when you have snapshots, for example?

The space reservation strategy has the primary benefit that you aren't required to truncate potentially valuable datafiles in order to unscrewup the system.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Isn't the point of a quota to stop introducing data? In our case, according to @cyberjock, setting a quota effectively set our usable disk space to a lower amount. Why in the world would I ever set my usable space to 4.3TB if I could have had 5.4TB all along?

Because you never should have had 5.4TB...? ZFS expects you to keep some free space. If you don't have the discipline to do that yourself, a quota can help, I suppose.

Why would I limit myself to less storage if I'm not actually stopping the influx of data? The quota system is therefore not a true quota system.

No idea what that means.
 

macxs

Dabbler
Joined
Nov 7, 2013
Messages
21
Hi fracai,

I'm not deep enough into ZFS to judge the performance impact these checks would have on every transaction.
But I would guess it's not that easy ;-)
Maybe it's easier and cleaner to tweak how the quota is calculated instead.

Can this issue be cleared by replacing the drives with larger capacity and resilvering?
If that was a question for me: you can overwrite a file with /dev/null (truncating it to zero length), then delete it. My real-world problem is not that huge; I just increased the dataset quota :smile:
But I did not want this to be brushed aside, because I don't think this is predictable behaviour for a robust enterprise filesystem.
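
For anyone who finds this later, the trick looks roughly like this (the path is just a placeholder, and it only frees space if no snapshot still references the file):

Code:
# truncate in place first - this typically needs far less new allocation than a delete
cat /dev/null > /mnt/tank/some/big/file
# with the blocks freed, the delete transaction has room to commit
rm /mnt/tank/some/big/file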

Bye Marco
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
Possibly. That runs into other accounting issues though. What happens when you have snapshots, for example?
The space reservation strategy has the primary benefit that you aren't required to truncate potentially valuable datafiles in order to unscrewup the system.
So space reservation does work? And if the pool fills up you can just reduce the reservation, delete files to free up space, and increase the reservation again? I would have thought that increasing the reservation would require writing transactions to disk?

If space reservation does actually work then it seems like this is indeed a solved problem.
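
Something like this is the workflow I'm imagining (pool and dataset names are made up, and I haven't tested it):

Code:
# pool is 100% full and deletes fail; drop the reservation first
zfs set reservation=none tank/.reserved
# now there should be headroom for the delete transactions to commit
rm /mnt/tank/huge/file
zfs destroy tank/data@old-snapshot
# put the safety net back afterwards
zfs set reservation=10G tank/.reserved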
 

macxs

Dabbler
Joined
Nov 7, 2013
Messages
21
ZFS expects you to keep some free space. If you don't have the discipline to do that yourself, a quota can help, I suppose.
But a quota-limited dataset has the same disadvantages as the pool itself and should not be filled beyond 80% either. So what purpose does a quota serve beyond partitioning data (and keeping the other datasets working when one fills up)?

Bye Marco
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
Because you never should have had 5.4TB...? ZFS expects you to keep some free space. If you don't have the discipline to do that yourself, a quota can help, I suppose.



No idea what that means.

If the original 5.4TB was 100% then setting a "quota" of 4.3TB (80%) is actually 100%. THAT is not a quota, that is artificial capacity.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So space reservation does work? And if the pool fills up you can just reduce the reservation, delete files to free up space, and increase the reservation again? I would have thought that increasing the reservation would require writing transactions to disk?

If space reservation does actually work then it seems like this is indeed a solved problem.

Space reservation as you describe definitely works. The pool metadata is part of the whole pool, not the dataset subject to the reservation.

jkh suggests quotas: https://bugs.freenas.org/issues/3982 (at the end). I can't decide whether there'd be a benefit to one over the other; no coffee yet this morning.
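
Off the top of my head, the two flavors would look something like this (pool name and sizes are examples only):

Code:
# reservation flavor: park an empty, unmounted dataset that holds space back
zfs create -o reservation=20G -o mountpoint=none tank/.reserved
# quota flavor (jkh's suggestion): cap the pool's root dataset below capacity
zfs set quota=4.3T tank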
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
Can you do this?: Set your volume quota to 80%. Fill it up. Get a message that you have reached your volume's quota. Go in, remove the quota, and start replacing with bigger disks or removing data, without taking the performance hit you'd have suffered had you not had the quota?
 

macxs

Dabbler
Joined
Nov 7, 2013
Messages
21
I disagree with both of those statements. Well, at least in parts.

First, ZFS is robust, but that doesn't mean it needs to work in **every** situation.
No? Shouldn't it at least work in every situation it actually permits?

Second, I don't think I'd describe FreeNAS as "low-cost". Certainly the software can't get much cheaper
One very good point. Some companies sell ZFS. At a much higher price than FreeNAS :)

The hardware requirements are anything but.
Of course. But you can choose. Compared with enterprise-level hardware/software bundles from NetApp, EMC, IBM or whoever, you can pick hardware that fits your needs at a much lower price.

We're getting a bit OT.

Bye Marco

edit: quotes removed
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
If the original 5.4TB was 100% then setting a "quota" of 4.3TB (80%) is actually 100%. THAT is not a quota, that is artificial capacity.

It can be whatever you want to call it. The quota system is intended to control disk usage in a dataset, etc. There's no reason you cannot use it to limit the pool capacity, but that's not the only intended use for the feature, and I suspect it wasn't even intended to be used for that at all.

In UFS/FFS, the filesystem takes its reservation off the top and then lies to you about the percentage full. ZFS doesn't do that. To-may-to, to-mah-to.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
Excellent, I'll be implementing this as a safeguard today.

If my understanding is correct, I can create a dataset with a reservation of 1GB and then never touch it. If I ever let the system get to 100%, I can reduce that reservation to 0 and start deleting files and snapshots.

It seems like the space reservation would be more desirable, as you can create one reservation dataset to safeguard all the others. If you use quotas, you'd need to set a quota on every dataset and end up constricting your datasets.

With quotas you'd have to set the quotas such that their total is less than the total pool. With reservations you can restrict one dataset and let the others grow independently.
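
In commands, I picture it roughly like this (pool and dataset names hypothetical):

Code:
# one small reservation dataset guards the whole pool
zfs create -o reservation=1G -o mountpoint=none tank/.rescue
# versus quotas, which must be set per dataset and sum to less than the pool
zfs set quota=2T tank/media
zfs set quota=500G tank/backups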

Corrections are welcome.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I suggest a slightly larger number. I was using 10G or 20G or something like that, intended to be big enough for a few transaction groups. I bumped that up to 1TB as kind of an inverse-quota...

Code:
# zfs get all storage0/.reserved-space
NAME                      PROPERTY       VALUE                  SOURCE
storage0/.reserved-space  type           filesystem             -
storage0/.reserved-space  creation       Mon Dec 17  8:00 2012  -
storage0/.reserved-space  used           209K                   -
storage0/.reserved-space  available      6.06T                  -
storage0/.reserved-space  referenced     209K                   -
storage0/.reserved-space  compressratio  1.00x                  -
storage0/.reserved-space  mounted        no                     -
storage0/.reserved-space  quota          none                   local
storage0/.reserved-space  reservation    1T                     local
storage0/.reserved-space  recordsize     128K                   default
storage0/.reserved-space  mountpoint     none                   local


Also left it unmounted without a mountpoint.
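
If anyone wants to replicate it, the setup was roughly (the dataset name matches the output above):

Code:
zfs create -o mountpoint=none storage0/.reserved-space
zfs set reservation=1T storage0/.reserved-space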
 

diedrichg

Wizard
Joined
Dec 4, 2012
Messages
1,319
It can be whatever you want to call it. The quota system is intended to control disk usage in a dataset, etc. There's no reason you cannot use it to limit the pool capacity, but that's not the only intended use for the feature, and I suspect it wasn't even intended to be used for that at all.

In UFS/FFS, the filesystem takes its reservation off the top and then lies to you about the percentage full. ZFS doesn't do that. To-may-to, to-mah-to.
Gah, I'm not explaining my thought properly... Doesn't a quota in Windows (server or client) stop a user from adding more data without the server suddenly grinding to a halt, because there is still 20% remaining as overhead?

@jgreco, what do you think of my post #50?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
A 1GB quota on a dataset you never touch doesn't do you a damn bit of good. It doesn't ensure you will have that much free space. It is possible to have a 1TB pool with a 500GB dataset that is set by a quota and *still* store 800GB of data on the pool.

Wow.. nobody seems to get how these work.. this is embarrassing. It's covered all over the forums and Google.

The bottom line: if you set a dataset with a quota of 20% of your pool, you accomplish nothing except having a dataset that does *nothing*.

If you set a dataset with a reservation of 20% of your pool, your pool will fill and you'll *still* be right where you are when you fill your pool. You'll likely have a few corrupt files that couldn't be written to the pool because you overfilled it.

Now let me be perfectly clear: A pool should NOT be filled over 80% full. If the notification emails and stuff aren't sufficient you're also failing to be a good ZFS admin for about a dozen other reasons. If emails aren't enough of a warning ZFS is not for you and you should look elsewhere right now. PERIOD.

Thinking you're going to use quotas and/or reservations to solve a full pool condition is nothing but a way for you to lie to yourself and then say you are doing good administration. You aren't. You look like a fool and you aren't saving yourself anything.

As for why ZFS has this 'limitation', pretend we have a pool that is 99% full. Your next write *might* free up disk space (such as a file delete or even shrinking a file) or it might add more data. Your server has no way to differentiate between these three possibilities until they happen. Unfortunately there is no clean way to resolve all three, and you can expect that if you do run out of blocks you're going to lose some data. Again, see my paragraph above about "proper ZFS administration" if this is unacceptable.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Wow, Cyberjock, you totally failed there. fracai has it correct in post #53.

You can place a quota on the dataset to prevent it from filling the pool.

You can create a dataset with reserved space to prevent the pool from filling as well.

Neither of these guarantees that you won't have the filesystem freak out and report out of space, but having that limit there means that rather than praying you have enough space to do a metadata update that lets you delete a big file and free up some space (the "usual" ZFS fix), you can just twiddle a limit.
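
The "twiddle" is a single property change either way (names hypothetical):

Code:
zfs set quota=none tank                   # if you capped the pool's root dataset
zfs set reservation=none tank/.reserved   # if you parked a reservation dataset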

As for why ZFS has this 'limitation', pretend we have a pool that is 99% full. Your next write *might* free up disk space(such as a file delete or even shrinking a file size) or it might add more data.

This is also just plain wrong. ZFS is a copy-on-write filesystem. Your next write will allocate space and will NEVER free up disk space. It always consumes disk space. You always need free blocks. A write can allocate space and then deallocate previously used space, yes, but this isn't the same thing as freeing space. A filesystem that is not CoW can free space simply by deallocating some blocks, updating the freelist, and updating some metadata in-place. This is an incredibly important distinction.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
A 1GB quota on a dataset you never touch doesn't do you a damn bit of good. It doesn't ensure you will have that much free space. It is possible to have a 1TB pool with a 500GB dataset that is set by a quota and *still* store 800GB of data on the pool.
I was talking about reservations, not quotas. And I even touched on the issue of having a quota larger than the available space when I talked about having to set the quotas of all the datasets to total less than the pool size.

If you set a dataset with a reservation of 20% of your pool, your pool will fill and you'll *still* be right where you are when you fill your pool. You'll likely have a few corrupt files that couldn't be written to the pool because you overfilled it.
This was the question I posed most recently in #46: will a reservation alleviate the disk-full issue? Not to keep files writing when the pool is full, but as a nicer solution than 'cat /dev/null > /some/big/file'.

When the pool fills and data doesn't get written, can you drop the reservation size and free up space by deleting files and snapshots? Fully understanding that this won't restore the data that couldn't be written.

Now let me be perfectly clear: A pool should NOT be filled over 80% full. If the notification emails and stuff aren't sufficient you're also failing to be a good ZFS admin for about a dozen other reasons. If emails aren't enough of a warning ZFS is not for you and you should look elsewhere right now. PERIOD.
As I and a few others have stated in this thread, these are safeguards to protect against the issue of not being able to modify the pool. If some process goes haywire and suddenly starts writing terabytes of data and fills the pool, it would be a nice way to remedy the situation without having to resort to hacks like truncating files. Setting a reservation still feels like a hack, but at least it uses ZFS mechanisms to resolve the space issue. I would not be surprised if the fact that truncating a file doesn't require a transaction to be written is considered a bug that gets fixed some day.

Thinking you're going to use quotas and/or reservations to solve a full pool condition is nothing but a way for you to lie to yourself and then say you are doing good administration. You aren't. You look like a fool and you aren't saving yourself anything.
Of course a pool should stay below 80% in order to retain the best performance, but things happen. I wouldn't be happy if an admin came to me and said a pool was dead because something that shouldn't be allowed to happen did happen. I'd be upset that they didn't have a plan to remedy the situation.

The people in this thread have been discussing a legitimate concern with the implementation of ZFS and looking for good ways to mitigate that issue. Again, it isn't that ZFS can fill up at all, it's that the pool becomes unusable when it does. ZFS should at least allow you to delete files and destroy snapshots and datasets. Keeping a metadata reserve does not sound out of hand at all. But, while this thread has discussed how ZFS could be changed, the primary concern is how to get out of that state. If it's true that quotas and space reservations won't alleviate the issue, we're back to truncating files.

And stating that ZFS can't figure out what is going to happen is ridiculous. ZFS knows what the file operation is and can calculate what metadata is going to change. It can then respond that the operation is allowed or not. I can only imagine extreme situations that would still end up with a locked-out pool if there were a transaction reserve.

Finally, I've seen enough conflicting reports from people that I trust on **both** sides of this discussion that I'm not willing to rule out options yet. If I have to create a new pool and test it myself I will. And I won't consider the result, whatever it is, to be foolish.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Wow, Cyberjock, you totally failed there. fracai has it correct in post #53.

You can place a quota on the dataset to prevent it from filling the pool.

Yes, but that screws up other things. So you set a quota. You can still fill the dataset to capacity and it will *still* freak out. But you just buttf*cked yourself, because the 80% emails won't be sent unless the *pool* goes over 80%. So you are very likely to compound the problem with bigger problems, and you didn't stop ZFS from freaking out (but you may have put yourself in a position where you get no email but locked your dataset up). So you gained nothing but definitely made your server more likely to have ZFS freak out. I'd call that a lose-lose situation.

You can create a dataset with reserved space to prevent the pool from filling as well.

Ok, but you didn't really change a damn thing. Pretend you have a 1TB pool. You create an empty dataset of 200GB of reserved space and never use the dataset. It's basically there to stop you from filling the server.

So what happens? You now have 800GB of disk space "free" and you'll start getting emails when you've put 600GB of data on the pool. Why? Because your 600GB of data plus the 200GB reserved dataset puts you at 80%. So not only did you not prevent a full pool condition (which is what everyone thinks they can somehow accomplish) but you're going to get the emails even sooner than before. I'd call that a lose-lose situation.

Neither of these guarantees that you won't have the filesystem freak out and report out of space, but having that limit there means that rather than praying you have enough space to do a metadata update that lets you delete a big file and free up some space (the "usual" ZFS fix), you can just twiddle a limit.

So we had a lose-lose with a quota and a lose-lose with a reservation. I just explained how neither one will stop you from being stupid and locking out your pool. ;)


This is also just plain wrong. ZFS is a copy-on-write filesystem. Your next write will allocate space and will NEVER free up disk space. It always consumes disk space. You always need free blocks. A write can allocate space and then deallocate previously used space, yes, but this isn't the same thing as freeing space. A filesystem that is not CoW can free space simply by deallocating some blocks, updating the freelist, and updating some metadata in-place. This is an incredibly important distinction.

Right. I'm not talking about mid-transaction, where you will *always* need more disk space. I'm talking about after the transaction is completed. After all, people are trying to argue that they want ZFS to identify when it's full and not let it get too full. The problem is you have no way of identifying, at the moment a transaction is created, whether at the end the pool will be more full or less full. So doing something like hardcoding 16KB of disk space to handle "the next transaction" won't gain you anything. That's all I was trying to explain.
 

SirMaster

Patron
Joined
Mar 19, 2014
Messages
241
I didn't see anyone mention it, but you can recover from a full-pool condition by expanding your pool. At least I have seen it work before, though I can't guarantee it will work for all cases of a full pool.

You will then be able to delete data and use the pool as normal.

Of course you cannot shrink the pool again afterwards.

At least not yet...

In two weeks at the OpenZFS Developer Summit, Matt Ahrens (ZFS co-founder) and Alex Reece will give a talk on vdev removal as an upcoming OpenZFS feature.
http://open-zfs.org/wiki/OpenZFS_Developer_Summit_2014
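
Roughly, the recovery I saw went along these lines (device names are placeholders, and remember an added vdev can't be removed again):

Code:
# option 1: swap in larger disks one at a time, resilvering after each
zpool replace tank da0 da4
zpool set autoexpand=on tank
# option 2: add another vdev to the pool
zpool add tank mirror da5 da6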
 