Replace ZIL

Status
Not open for further replies.

xcom

Contributor
Joined
Mar 14, 2014
Messages
125
I have a ZIL enabled on my FreeNAS system. The physical SSD is showing signs of failure and I would like to remove it and install my new SSD.

Is this possible without losing any data or the volume?

Are there any special procedures to follow to remove the existing ZIL and recreate it on the new SSD?

Thanks in advance for the help.
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
I had no problems removing and replacing mine on 9.2.1.6. I'd just be sure the pool is at the same version. A recent backup is always a good idea.
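A quick way to check (the pool name "tank" below is just a placeholder):

zpool get version tank

On older pools this reports a legacy version number such as 15 or 28; pools that have been upgraded to feature flags show "-" instead.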
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
There is no risk in offlining and replacing a ZIL. Just don't do what people sometimes do: shut down the server and just physically yank the ZIL. Properly offlining the ZIL means there is no risk of anything going wrong, as ZFS will simply stop storing data on it.
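For reference, the command-line version is just a remove followed by an add. This is a minimal sketch; the pool name "tank" and the gptid labels are placeholders, and on FreeNAS you would normally do the same thing from the GUI's Volume Status screen:

zpool status tank                    # note the gptid of the current log device
zpool remove tank gptid/old-slog     # detach the failing SLOG; ZFS falls back to the in-pool ZIL
zpool add tank log gptid/new-slog    # attach the new SSD as the log device
zpool status tank                    # confirm the new log vdev shows ONLINE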
 

xcom

Contributor
Joined
Mar 14, 2014
Messages
125
Thank you all for the help!
Will try today.

Regards.
 

David E

Contributor
Joined
Nov 1, 2013
Messages
119
There is no risk in offlining and replacing a ZIL. Just don't do what people sometimes do: shut down the server and just physically yank the ZIL. Properly offlining the ZIL means there is no risk of anything going wrong, as ZFS will simply stop storing data on it.

The documentation has a pretty ominous warning (http://doc.freenas.org/9.3/freenas_storage.html#removing-a-log-or-cache-device):

"If the pool is running ZFSv15, and a non-mirrored log device fails, is replaced, or removed, the pool is unrecoverable and the pool must be recreated and the data restored from a backup. For other ZFS versions, removing or replacing the log device will lose any data in the device which had not yet been written. This is typically the last few seconds of writes."

Is this accurate? I can't comprehend a design where a clean unmount would cause data loss.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
The documentation has a pretty ominous warning (http://doc.freenas.org/9.3/freenas_storage.html#removing-a-log-or-cache-device):

"If the pool is running ZFSv15, and a non-mirrored log device fails, is replaced, or removed, the pool is unrecoverable and the pool must be recreated and the data restored from a backup. For other ZFS versions, removing or replacing the log device will lose any data in the device which had not yet been written. This is typically the last few seconds of writes."

Yep, totally correct. I just wrote a very long-winded example (two, actually) of why redundancy of the SLOG is important, even after ZFSv15.
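(For what it's worth, adding the log as a mirror is a one-liner; the pool name and device labels below are made up for illustration:

zpool add tank log mirror gptid/slog-a gptid/slog-b

That way a single dying SSD never takes the last few seconds of sync writes with it.)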


Is this accurate? I can't comprehend a design where a clean unmount would cause data loss.

All modern file systems have write caches of some kind. Windows (NTFS) caches writes in RAM; ext2/3/4 uses RAM too. ZFS lets you use an SSD so that you have non-volatile storage for writes before they are committed to the zpool. Since ZFS is hellbent on protecting your data at any and all costs (and I literally mean any and all costs), the SLOG device is its alternative to simply accepting that lost data is lost, which is what every other file system and volume manager design in existence has been doing for decades.

The difference is that ZFS:

- Allows you to ensure that no data is ever lost due to a loss of power. It even allows you to use multiple copies on different disks to ensure the data isn't lost. (yep.. there's that "any and all costs" and "hellbent on protecting your data")
- Won't let you import a zpool with a missing SLOG device unless the -m flag is used, so you have a chance to reattach the SLOG if it was accidentally disconnected from the system (see the sketch below).
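A rough sketch of that second point, with a hypothetical pool name:

zpool import tank        # refuses, reporting the missing log device
zpool import -m tank     # imports anyway, discarding whatever lived only on the lost SLOG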

Everything I just said is totally true. The reality is that you have been blind to all the writes your desktops and servers have potentially lost over your IT career. So maybe instead of asking why you can't comprehend a design where a clean unmount would cause data loss, you should instead ask the other file systems and volume managers to do what they are supposed to do and protect your darn data. :p

If you were around back in the Windows 95 and 98 days, you'll remember how a chkdsk was run automatically on bootup after an unclean shutdown. Remember how it *always* found a problem and had to fix it (lost chains)? Every one of those lost chains was a run of sectors that had been marked as allocated, to which data may or may not have been written, but whose allocation was never completed, so the file system had no way of knowing what the data was for, whether it was even written, or what the chain was supposed to contain. Every single time you clicked "fix" you were simply choosing to discard the lost data and continue the bootup.

Ouch!

Yep.... Mankind has been losing data for decades.

Welcome to the real world of not losing data anymore. ;)
 

David E

Contributor
Joined
Nov 1, 2013
Messages
119
Totally agree with all your points, although I think you missed a key word in my prior post that triggered your longer response:

Is this accurate? I can't comprehend a design where a clean unmount would cause data loss.

Also, I realize I didn't make it clear that I meant a clean unmount of the SLOG, not of the entire pool... I was primarily put off by the out-of-date documentation that seemed to indicate there was no way to safely remove a SLOG device without the risk of data loss. Sorry for the cross post! And thank heavens for modern filesystems!
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Yeah, I don't think I saw the "clean" in 'clean unmount'.

That being said, with ZFSv15, losing the SLOG device absolutely meant the end of the zpool, even after a clean unmount (boo). Glad that is fixed, because I agree a clean unmount should not be that destructive.

Likewise, after ZFSv15, you can still manually import the zpool without the SLOG device. Naturally you'll have no way of proving from the software side that nothing had been left uncommitted, but us humans will know better if we shut down the system cleanly ourselves.
 