SOLVED: Apache WWW dir via NFS with sync=disabled


noescape

Dabbler
Joined
May 9, 2013
Messages
10
Hello,

I have been searching all day for an answer, so I am trying the forum.

I want to build a FreeNAS box that will share data via NFS with a Debian server running Apache. It will be around 500 web sites... less than 50 GB of data. The NFS share will be mounted on only one server. The reason I don't build the storage right on the Debian box is that ZFS on Linux is not stable enough.

I understand that everybody says sync=disabled is huge trouble in case of power loss, but most of them use NFS to store a whole virtual machine (the whole OS). I want to share only the web data. If 5 seconds or so of data goes missing, it won't be a big problem. I don't plan to use a SLOG/dedicated ZIL.

My questions are:
1. Is it still dangerous / can I lose all of the data if I use NFS with sync=disabled?
2. Since I will mount the NFS share from only one computer, I will mount it with the async option and some other options that make NFSv4 stateless. I will also set a very long expiration time for caches... Will NFS actually send some sync writes when it is mounted async?
3. I want to set vfs.zfs.txg.timeout="2" (default is 5). Does that mean I can lose only 2 seconds of data?

Thank you.

To not be a jerk who only asks, I am going to try to answer some questions here on the forum...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
First, the fact that you are questioning all the ZFS manuals, websites, and people that have discussed sync to death means you are NOT the one that should be disabling it.

Now to answer your questions to the best of my ability:

I understand that everybody says sync=disabled is huge trouble in case of power loss, but most of them use NFS to store a whole virtual machine (the whole OS). I want to share only the web data. If 5 seconds or so of data goes missing, it won't be a big problem. I don't plan to use a SLOG/dedicated ZIL.

My questions are:
1. Is it still dangerous / can I lose all of the data if I use NFS with sync=disabled?

You won't "potentially lose 5 seconds' worth". You'll potentially lose the entire pool. It is still dangerous because on a loss of power the pool might not be mountable. If you can't mount it, your only option is to create a new pool and restore your data to it.

2. Since I will mount the NFS share from only one computer, I will mount it with the async option and some other options that make NFSv4 stateless. I will also set a very long expiration time for caches... Will NFS actually send some sync writes when it is mounted async?

I believe that will make all writes from NFS asynchronous. This means that if you update something like an Excel spreadsheet and hit save, you might end up with a corrupted spreadsheet that can't be opened if the server happens to lose power. The corruption would be limited to files that were being updated in RAM (unless you use sync=disabled).

3. I want to set vfs.zfs.txg.timeout="2" (default is 5). Does that mean I can lose only 2 seconds of data?
No, that value controls how often a transaction group is committed. It affects how much data can accumulate in RAM before it is actually written out to the zpool. You can potentially lose more than 2 seconds' worth depending on a bunch of circumstances.
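
For reference, the tunable can at least be inspected from a FreeNAS shell without changing anything; a minimal read-only sketch (not a recommendation to touch it):

# show how often a transaction group is committed, in seconds (stock default: 5)
sysctl vfs.zfs.txg.timeout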

What you really should be doing is not playing these games of trying to tweak ZFS to work how you want. Either pay someone to do it right or be ready to spend months (literally) learning ZFS inside and out before you are competent enough to do it on your own. There's a reason why there are guides with titles like "ZFS Evil Tuning Guide" and introductory paragraphs like:

Tuning is often evil and should rarely be done.

First, consider that the default values are set by the people who know the most about the effects of the tuning on the software that they supply. If a better value exists, it should be the default. While alternative values might help a given workload, it could quite possibly degrade some other aspects of performance. Occasionally, catastrophically so.

Over time, tuning recommendations might become stale at best or might lead to performance degradations. Customers are leery of changing a tuning that is in place and the net effect is a worse product than what it could be. Moreover, tuning enabled on a given system might spread to other systems, where it might not be warranted at all.

Nevertheless, it is understood that customers who carefully observe their own system may understand aspects of their workloads that cannot be anticipated by the defaults. In such cases, the tuning information below may be applied, provided that one works to carefully understand its effects.

Tuning is no joke and can have serious (and fatal) consequences for your data. It's not like Windows, where you change a registry setting and keep it if you like the change. It can have very dire consequences that you won't realize until it's too late. Personally, I consider anyone who asks in a forum how to tune their ZFS system to be faster NOT qualified to do said tuning. I don't tune, and I'd never do tuning for someone else, because I don't want to find out later that they lost data because of something I overlooked. The tuning options and their impact are very complex and can't be read over a weekend, or even a couple of weeks, and be understood clearly.

I'm betting some other people will post telling you that you are crazy for considering it... LOL. If you aren't able to tune on your own, from memory, then you probably aren't qualified to do it. For more than 99% of the people who show up in the forum saying they have their own tuning settings, I will automatically assume they were irresponsible for applying them. I make this assumption before I even see what settings they've used. Why am I so harsh about tuning? Because if you are asking how to tune in a forum setting, you are by definition not even close to having the required experience to intelligently and scientifically tune your system.
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
You won't "potentially lose 5 seconds' worth". You'll potentially lose the entire pool. It is still dangerous because on a loss of power the pool might not be mountable. If you can't mount it, your only option is to create a new pool and restore your data to it.

Just to clear up why, at least as far as I understand.

sync=disabled disables ALL sync writes, including ZFS updating metadata about itself. A power failure or server crash during a metadata update == metadata corruption, and very likely an unmountable pool.

iSCSI with sync=standard is better. As far as I'm aware, iSCSI by default doesn't call for sync writes, so you get async performance there. But sync=standard still lets ZFS do sync writes on metadata.

If you're stuck with NFS, I think I saw somewhere that you can tweak your NFS client to request async writes. I'm not sure if that was specific to ESXi or not. Then NFS will function similarly to iSCSI, i.e. the metadata is safe.

Not sure if that helps.
 

pbucher

Contributor
Joined
Oct 15, 2012
Messages
180
There is no reason to use iSCSI here. The reason you see iSCSI mentioned as a replacement for NFS is ESXi: ESXi won't do async NFS, but it can't force sync writes over iSCSI, so iSCSI lets you work around the sync issue at the network level.

With a Linux NFS client you can just tweak the mount parameters to specify async; you will also want to investigate the NFS settings on the FreeNAS side. But please, for the sake of your data, put sync back to standard and tweak your NFS settings instead.
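
A minimal sketch of what that might look like on the Debian client (the hostname, export path, and mount point are made-up examples; check man nfs for the exact options your kernel supports):

# /etc/fstab on the Debian web server -- example values only
freenas:/mnt/tank/www  /var/www  nfs  async,noatime,vers=4  0  0

# or, mounted by hand for testing
mount -t nfs -o async,noatime,vers=4 freenas:/mnt/tank/www /var/www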
 

noescape

Dabbler
Joined
May 9, 2013
Messages
10
Thank you very much guys.

Please understand again: I am not using ESXi, and I am sharing ZFS via NFS, not iSCSI. It will be just web files for 500 websites, mostly reads. NFS is my choice because of snapshots. Snapshotting a virtual block device is complicated, and data corruption is more likely with iSCSI; NFS will only corrupt files that are actually being written to.

What I have learned from you:
sync=disabled disables ALL sync writes, including ZFS updating metadata about itself.

For that reason I won't disable it. I also won't play with any other ZFS setting, so as not to corrupt the metadata.

Since the NFS share will be mounted as async, the only sync writes will be ZFS metadata. So does that mean I don't need a SLOG, just fast drives?
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Yes, if most of the writes are async, then there's little need for an SLOG.

Async writes will be cached by ZFS in RAM (the ARC) and flushed to disk in the next txg (transaction group).
 

pbucher

Contributor
Joined
Oct 15, 2012
Messages
180
Thank you very much guys.

Since the NFS share will be mounted as async, the only sync writes will be ZFS metadata. So does that mean I don't need a SLOG, just fast drives?

Correct. Even fast drives are optional; you just need a bunch of RAM so ZFS can do its caching. Not knowing a thing about your setup, I'm going to make a blind guess that 8 GB of RAM or so is what you should be looking for. You could maybe get by with 6 GB, but that is pretty much the bare minimum if you want the box to be useful.

Back to the async setup for NFS: make sure you set the async mode checkbox on the NFS settings page and set "Number of servers" to the number of CPU cores you have.

You are on the right path with snapshots and ZFS. With that in place, also consider throwing together a cheap second box and setting up replication to send those snapshots off-site somewhere.
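
For what it's worth, the manual form of that replication is just zfs send piped into zfs receive; a rough sketch with made-up dataset and host names (the FreeNAS GUI replication task wraps the same mechanism):

# one-time full copy of a snapshot to the backup box
zfs snapshot tank/www@monday
zfs send tank/www@monday | ssh backupbox zfs receive -F backup/www

# afterwards, send only the changes between snapshots
zfs snapshot tank/www@tuesday
zfs send -i tank/www@monday tank/www@tuesday | ssh backupbox zfs receive backup/www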
 

noescape

Dabbler
Joined
May 9, 2013
Messages
10
You guys are fantastic,

thank you again.

Yes, I will have 64 GB of DDR3 ECC RAM to begin with. Later I can expand to 128 GB, since I ordered 16 GB modules.

My config:
supermicro SC826TQ-R800LP 2U
MB: X9SRL-F 1S-R,5PCI-E8(g3),-E8(g2),2GbE,10sATA,8DDR3-1600,IPMI,bulk
CPU: Intel Xeon E5-2603 - 1,8GHz@6,4GT 10MB cache, 4core, 80W,LGA2011
HBA: Supermicro AOC-S2308L-L8e(2308) SAS2HBA (ITmode) 2×8087,exp:122HD,PCI-E8 g3, LP
RAM: 4x 16GB 1600MHz DDR3 ECC Registered 2R×4,Samsung (M393B2G70BH0-CK0)
DRIVES:
3x Seagate Constellation ES.3 1000GB
3x WD RED 1000GB

Three drives will be on the motherboard SATA ports and three on the HBA SATA ports, mirrored across the two, so if the whole HBA fails the pool should keep working. No expanders will be used, since they are problematic with SATA drives. The chassis can hold 12 drives, so I can double the number of drives later. Different drive brands will be used to minimize the chance of several drives failing at the same time.
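
For illustration, that layout expressed at the CLI with made-up device names (ada* standing in for the motherboard ports, da* for the HBA ports; on FreeNAS the pool would normally be built through the GUI):

# three 2-way mirrors, each pairing a motherboard disk with an HBA disk,
# so losing the entire HBA still leaves one side of every mirror intact
zpool create tank \
  mirror ada0 da0 \
  mirror ada1 da1 \
  mirror ada2 da2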


I discussed everything with the Supermicro reseller and checked the MB and HBA. The HBA ships in IT mode by default, so no flashing is needed.

Thank you for pointing out the backup. I have to look at it soon, since snapshots are not a real backup.

The last thing I can't decide is whether to use a hot spare or not. I see plenty of experts saying NO to spare drives and I don't know why. When a drive fails, it will take me a couple of days to get a new drive to the DC. I like the concept that a hot spare steps in immediately, so I would be in no hurry to replace the failed drive. I just don't understand why they say NO to hot spares. Do you have an idea?

Thank you
 

pbucher

Contributor
Joined
Oct 15, 2012
Messages
180
Very sweet setup. Make sure you have version 13 of the firmware on the HBA. The driver code is tightly tied to the firmware, and the current 8.3 version of FreeNAS & FreeBSD is built with the driver for version 13 of the firmware. Version 13 is very battle-tested at this point; I don't know how much testing has been done on the newer FreeBSD drivers and firmware versions.
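
One way to confirm the firmware revision from a FreeNAS shell; a sketch assuming the mps driver (which reports the firmware it finds at boot) and LSI's sas2flash utility, if installed:

# the mps driver logs the adapter firmware version during boot
dmesg | grep -i firmware

# LSI's flash utility lists attached adapters and their firmware revisions
sas2flash -listall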

On the topic of spares: I'd probably go ahead and load a spare into the box since you don't have ready access to the server.

I think the advice stems from the fact that FreeNAS lets you specify a spare drive but doesn't automatically do anything with it. I usually just mark the drive as a spare so I remember it's in the array (I've got several FreeNAS boxes and drive arrays, and it's getting hard to recall which ones currently have spares in them).

My procedure for a drive failure is to first remove the spare drive from the array in the FreeNAS GUI, then click Replace on the failed drive and select the just-removed spare. ZFS will then rebuild the pool using the new drive. You may have to delete the failed drive from the zpool; it seems to vary whether it gets removed automatically upon replacement. Then again, I've been lucky enough not to have to do this more than twice, and there have been bug fixes and updates to FreeNAS since, so things may have changed somewhat.
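
The rough CLI equivalent, with made-up disk names (on FreeNAS the GUI is the supported path, so treat this as a sketch only):

zpool status tank              # identify the failed disk
zpool replace tank da5 da6     # resilver onto the replacement (the former spare)
zpool detach tank da5          # only if the failed disk is still listed afterwards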

The heart of the issue is that a drive marked as a spare isn't available to replace a failed drive in the GUI (and probably the CLI also), and currently there is no automatic failover to a spare in FreeBSD. Hopefully in 9.x someone will tackle the automatic replacement issue, if they haven't already.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
Look, some factually incorrect information:
Now to answer your questions to the best of my ability:

You won't "potentially lose 5 seconds' worth". You'll potentially lose the entire pool. It is still dangerous because on a loss of power the pool might not be mountable. If you can't mount it, your only option is to create a new pool and restore your data to it.

Just to clear up why, at least as far as I understand.

sync=disabled disables ALL sync writes, including ZFS updating metadata about itself. A power failure or server crash during a metadata update == metadata corruption, and very likely an unmountable pool.
Or does anyone have any basis for the above besides their "feelings"? I'd settle for a semi-accurate description of these synchronous ZFS metadata writes. If my posts in the sticky weren't clear, I can add another.

With a Linux NFS client you can just tweak the mount parameters to specify async; you will also want to investigate the NFS settings on the FreeNAS side. But please, for the sake of your data, put sync back to standard and tweak your NFS settings instead.
Big plus ONE. Leave ZFS sync settings alone.
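
Checking, and if necessary restoring, the property is a one-liner each (dataset name made up):

zfs get sync tank/www             # confirm the current setting
zfs set sync=standard tank/www    # put it back to the default if it was changed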

The last thing I can't decide is whether to use a hot spare or not. I see plenty of experts saying NO to spare drives and I don't know why.
What experts? I would imagine most of them would be in favor of spares, warm or cold. You will have to wait for zfsd for hot spares; IMHO, even then warm spares are your friend. Depending on how many spares you begin with, you may want to start with three-way mirrors and have the drives 'spared in' already.
 