From FreeNAS 11.3 to TrueNAS Core 12 - Upgrade Stories

Tony Self

Contributor
Joined
Jan 30, 2017
Messages
130
I upgraded to the Beta train today without any major issues with my jails (Plex, Sonarr, Radarr, SABnzbd, Transmission).

I am now working on getting my Google Drive mounted within a jail (see my other thread).
 

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
vfs_recycle was not designed with ZFS in mind. This isn't a TrueNAS 12 limitation. I have some plans to re-implement it with awareness of ZFS dataset boundaries and remove the cross-rename dependency for these nested dataset setups, but this will happen no earlier than TrueNAS 12.1 because my hands are currently full with porting Samba 4.13 to FreeBSD / our code base, and porting our code base to work on Linux for SCALE.
There are 2 fundamental issues, and in a way, the ZFS one is more relevant.

In ZFS, the live current dataset transparently mounts subdatasets on the fly. So nested datasets look like nested folders.

But a ZFS *snapshot* behaves differently.

If it has subdatasets that were snapshotted at the same time (zfs snapshot -r), those don't transparently mount, so they appear as completely disconnected objects. A file system that is coherent and searchable as one transparently unified file system when "live" has recursive snapshots that are coherent in data terms, but that neither present as a unified file system nor can be searched as one.

They don't automount even though they could, and would be consistent, and that's how they appeared when live. They can't be searched as a unified whole but only as separate file systems.
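
To make the difference concrete, here is a minimal Python sketch. It relies only on the `.zfs/snapshot/<name>` directory that ZFS exposes at each dataset's mountpoint; the pool layout, mountpoints and snapshot name are hypothetical, and `snapshot_path` is a helper made up for illustration, not part of any ZFS tool.

```python
from pathlib import PurePosixPath

def snapshot_path(live_path, mountpoints, snap_name):
    """Map a live file path to where it appears inside a named snapshot.

    Each dataset exposes its own snapshots under <mountpoint>/.zfs/snapshot,
    so the deepest dataset containing the path decides where to look.
    """
    path = PurePosixPath(live_path)
    # Keep only the mountpoints that contain the path (or equal it).
    owners = [PurePosixPath(m) for m in mountpoints
              if PurePosixPath(m) == path or PurePosixPath(m) in path.parents]
    if not owners:
        raise ValueError(f"{live_path} is not inside any known dataset")
    owner = max(owners, key=lambda m: len(m.parts))  # deepest dataset wins
    return str(owner / ".zfs" / "snapshot" / snap_name / path.relative_to(owner))

# Hypothetical layout: "child" is a nested dataset of "share".
MOUNTPOINTS = ["/mnt/tank/share", "/mnt/tank/share/child"]

print(snapshot_path("/mnt/tank/share/a.txt", MOUNTPOINTS, "daily"))
print(snapshot_path("/mnt/tank/share/child/b.txt", MOUNTPOINTS, "daily"))
```

The second file has to be looked up under the child dataset's own `.zfs` directory, not by descending from the parent's snapshot, which is exactly the "separate file systems" behaviour described above.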

That's why the recycler is a problem. It's not really because of Samba. It's because the .recycle directories in each dataset, which are transparently searchable and automounted as needed in the live pool, don't behave the same way in snapshots. That means searching and exploring recyclers is fundamentally different in live vs. recursively snapshotted copies of the same dataset. If ZFS handled snapshot automount the same as live automount, then hard links or symlinks between recyclers across subdatasets would still work the same within snapshots of a dataset and its children, as they do in live datasets.

So I think the biggest fix by far, here, is extending the automount on demand that ZFS already does for subdatasets in the "live" pool so that it also works for snapshots. Meaning, one can descend transparently from a snapshot of a dataset to the matching snapshots of its subdatasets (where they exist), for snapshots that were taken recursively.

(The reason for requiring a recursive snapshot is that it's the only way to guarantee the dataset snapshots are/were consistent.)

That doesn't just benefit Samba, but the entirety of ZFS.

Once that's done, a huge number of ZFS and recycler issues become much simpler. It's just a pain that what presents as a single file system when live fails to present as a single file system as a recursive snapshot.

So Samba's recycler, shadow copies/file history, and browsing through old versions of a file system are just 3 very evident places it shows up.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,544
That's why the recycler is a problem. It's not really because of Samba. It's because the .recycle directories in each dataset, which are transparently searchable and automounted as needed in the live pool, don't behave the same way in snapshots.

The Samba recycle bin is created at <connectpath>/.recycle. It doesn't have any awareness of ZFS at all. This means that if you for some reason have:

Code:
[share1]
path = /mnt/dozer/share

[share2]
path = /mnt/dozer/share/subdir


You will have two recycle bins, and which one a file gets dumped in depends on the connectpath at the time of the unlink operation. For the recycle bin, though, it's not sufficient to simply be ZFS-aware. If a dataset is mounted at /mnt/dozer/ds1 and the share is at /mnt/dozer/ds1/share1, then a strategy of having recycle bins at dataset/.recycle won't work. Ultimately, the Samba recycle bin is a VFS hack. It happened to be a useful one, but it's vastly inferior to using snapshots / shadow copies. What I think would be more helpful on the ZFS / libzfs side would be to change the function to diff snapshots so that it can take a callback function as an argument. This would allow a ZFS-aware search of snapshots.
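
As a rough illustration of the two-bins point, the Python sketch below just joins each share's connectpath with `.recycle`. The share names and paths mirror the example above; `recycle_bin` is a hypothetical helper, not anything Samba provides.

```python
from pathlib import PurePosixPath

# Share definitions mirroring the example above (share name -> connectpath).
SHARES = {
    "share1": "/mnt/dozer/share",
    "share2": "/mnt/dozer/share/subdir",
}

def recycle_bin(share):
    """Return the recycle bin directory for a share: <connectpath>/.recycle."""
    return str(PurePosixPath(SHARES[share]) / ".recycle")

# The same file, /mnt/dozer/share/subdir/notes.txt, is reachable through both
# shares, so where it ends up depends on which share the client used.
for name in SHARES:
    print(name, "->", recycle_bin(name))
```

Nothing in that calculation looks at dataset boundaries, which is why the bin location follows the share definition and not the ZFS layout.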
 

hescominsoon

Patron
Joined
Jul 27, 2016
Messages
449
I updated rather early because I wanted this flag for my work on a nice Grafana dashboard:
View attachment 39924

Network, all VMs and jails running - no problem whatsoever. Subsequently I recreated all my jails and switched them from cloned to base jails for easier updates in the future. Jails are not yet entirely but should eventually be managed by Ansible alone.

SMB needed some fixes that @anodos was so kind to prioritise and roll out.

Since the BETA1: everything in perfect condition.

P.S. The only thing I do not like is the new allocation mechanism for swap. Essentially cuts my available swap space in half. If you agree, please vote for: https://jira.ixsystems.com/browse/NAS-106375
Who came up with the 2G default size, anyway? This is ridiculously small.
IMO you shouldn't swap at all if you have enough RAM and things are configured properly.
 

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
What I think would be more helpful on the ZFS / libzfs side would be to change the function to diff snapshots so that it can take a callback function as an argument. This would allow a ZFS-aware search of snapshots.
This sounds like something I'd like to understand - could you elaborate and explain? Thanks!
 

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
P.S. The only thing I do not like is the new allocation mechanism for swap. Essentially cuts my available swap space in half. If you agree, please vote for: https://jira.ixsystems.com/browse/NAS-106375
Who came up with the 2G default size, anyway? This is ridiculously small.
The 2G swap originally wasn't intended as functional swap at all. That's why it's that size.

With FreeNAS being a ZFS-powered file server, there wouldn't usually be expected to be a need for swap. The OS wouldn't be swapping. It was designed primarily as a FreeBSD-powered open source file server, not the all-singing VM-and-media-streaming platform it is now, and swapping would vastly slow down its operation. In fact the emphasis on RAM meant that if it was swapping, you probably didn't pick your hardware appropriately anyway.

The 2G swap has just one function. ZFS historically couldn't shrink a vdev, only grow it. So if a mirror had two 4TB disks and the replacement 4TB disk had 3 bad sectors, meaning fractionally smaller usable space, it couldn't be hot swapped in. So a small amount of each disk was kept from being grabbed by ZFS when vdevs were created, so that nominally equal-size disks could always be slotted in, even if they were actually slightly smaller due to manufacturer differences or bad sectors. And that's the 2G swap space and its purpose.
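
The arithmetic behind that can be sketched in a few lines of Python. The disk sizes are made-up numbers and the helper names are hypothetical; this only models the reasoning above, not how the middleware actually partitions disks.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def data_partition_bytes(disk_bytes, reserve_bytes=2 * GIB):
    """Space handed to ZFS once the reserve (the swap partition) is set aside."""
    return disk_bytes - reserve_bytes

def replacement_fits(replacement_bytes, required_data_bytes):
    """A replacement works if it can hold a data partition at least as large
    as the one it replaces, even if that leaves it little or no reserve."""
    return replacement_bytes >= required_data_bytes

original = 4_000_787_030_016          # hypothetical "4 TB" disk
replacement = original - 100 * MIB    # hypothetical disk that is 100 MiB smaller

# With the 2 GiB reserve, the slightly smaller disk can still be used.
print(replacement_fits(replacement, data_partition_bytes(original)))
# Had ZFS been given the whole disk, it could not.
print(replacement_fits(replacement, data_partition_bytes(original, reserve_bytes=0)))
```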
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,544
This sounds like something I'd like to understand - could you elaborate and explain? Thanks!
libzfs has some iterator functions:
Code:
/*
* Iterator functions.
*/
typedef int (*zfs_iter_f)(zfs_handle_t *, void *);
extern int zfs_iter_root(libzfs_handle_t *, zfs_iter_f, void *);
extern int zfs_iter_children(zfs_handle_t *, zfs_iter_f, void *);
extern int zfs_iter_dependents(zfs_handle_t *, boolean_t, zfs_iter_f, void *);
extern int zfs_iter_filesystems(zfs_handle_t *, zfs_iter_f, void *);
extern int zfs_iter_snapshots(zfs_handle_t *, boolean_t, zfs_iter_f, void *);
extern int zfs_iter_snapshots_sorted(zfs_handle_t *, zfs_iter_f, void *);
extern int zfs_iter_snapspec(zfs_handle_t *, const char *, zfs_iter_f, void *);
extern int zfs_iter_bookmarks(zfs_handle_t *, zfs_iter_f, void *);

For most of these, the second argument is an iterator callback function and the last argument is a void pointer to private data that gets passed into the iterator. This allows you to do things as libzfs iterates over whatever it is.
There isn't an iterator for ZFS diffs. This is understandable because for the most part you get a zfs handle in the callback, but a similar concept could apply for doing a zfs diff: perhaps the callback gets a struct containing the information you get from the zfs diff command.

This is just shooting from the hip. I haven't done any research into it. In principle, though, libzfs gives enough functionality to be able to be quite creative with custom tools (with the understanding that for the most part this was originally intended to be a private API).
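
To show the shape of the idea without touching libzfs, here is a small Python model of a callback-driven diff iterator. Everything in it is hypothetical: `DiffRecord`, `iter_diff` and the sample data are invented for the sketch, and no such iterator exists in libzfs as far as this post goes. Only the callback-plus-private-data pattern is borrowed from the header above.

```python
from dataclasses import dataclass

@dataclass
class DiffRecord:
    """Hypothetical record, loosely modelled on one line of `zfs diff` output."""
    change: str   # "+", "-", "M" or "R"
    path: str

def iter_diff(records, callback, private):
    """Call callback(record, private) for each record, in the style of zfs_iter_f.

    In this sketch a non-zero return value from the callback stops the
    iteration and is passed back to the caller.
    """
    for record in records:
        ret = callback(record, private)
        if ret != 0:
            return ret
    return 0

def collect_deleted(record, private):
    """Example callback: gather removed paths into the caller's private list."""
    if record.change == "-":
        private.append(record.path)
    return 0

# Hypothetical diff between two snapshots.
SAMPLE = [
    DiffRecord("M", "/mnt/dozer/share/report.odt"),
    DiffRecord("-", "/mnt/dozer/share/old.txt"),
    DiffRecord("+", "/mnt/dozer/share/new.txt"),
]

deleted = []
iter_diff(SAMPLE, collect_deleted, deleted)
print(deleted)
```

A callback like `collect_deleted` is the kind of hook that would let a tool search snapshots for removed files without parsing command output.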
 

ThreeDee

Guru
Joined
Jun 13, 2013
Messages
698
Initially the upgrade didn't go well due to my Plex and Unifi Controller jails not starting .. but thanks to Yorick's thread and some hand holding from Jasonsansone to get me through the cause of my jails not starting .. everything is fixed and I'm in with both feet with 12 Beta now. :cool:
 

dcol

Dabbler
Joined
May 1, 2020
Messages
28
I would like to try updating from FreeNAS-11.3-U4.1 to TrueNAS 12 Beta2.
I have a simple system with one pool with 3 datasets and one jail using the Emby plugin.
I use SMB 2/3, RSYNC, and SSH services only.

Is there anything I should be aware of before making the switch? I would, of course, make snapshots and backups first.
My main concern is SMB since that is the main connectivity to the datasets.
 

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
There are stricter requirements for custom config. For example, I used mine heavily to override the usual GUI options. That won't work so well in 12.
But really, I'm in the same boat, heavily SMB-dependent, and had no longer-term issues.

I did have to redo some share and service options, but only because, as I said, I used that section to hack around with. There were bugs in beta1 but they seem gone in beta2.

Short version: you should be okay but may have minor config teething issues, and at a pinch, absolute worst case, be prepared to create new shares to convince yourself SMB really is working :) You shouldn't hit SMB blockers for that kind of stuff. Check the jails. SSH and SMB and the pool itself are fine, and rsync doesn't seem like it would easily go wrong.
 

hescominsoon

Patron
Joined
Jul 27, 2016
Messages
449
If you are not hacking the web interface it should work fine. I am running two 12 beta2 boxes that I upgraded from 11.3-U4.1 without issue.
 

dcol

Dabbler
Joined
May 1, 2020
Messages
28
Thanks. Don't even know how or why I would hack the web interface. That is all default stuff. My configuration is very generic. Just wanted to know if I would encounter any SMB or permission issues. This is a production environment and I can't mess around with it while I try to figure out what went wrong. Forewarned is forearmed in this case.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,544
Thanks. Don't even know how or why I would hack the web interface. That is all default stuff. My configuration is very generic. Just wanted to know if I would encounter any SMB or permission issues. This is a production environment and I can't mess around with it while I try to figure out what went wrong. Forewarned is forearmed in this case.
BETAs should not be used in a production environment.
 

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
If you are not hacking the web interface it should work fine. I am running two 12 beta2 boxes that I upgraded from 11.3-U4.1 without issue.
Sorry, that's incorrect, although it won't be noticed by most users.

There are a few changes in Samba and its objects that make it so. But only if you'd made some pretty heavy-duty use of the custom config panels. Bear in mind these won't affect 99% of users, but for completeness:
  1. It used to be possible to add a Samba section to specify config for IPC$ (special) shares. Since they don't have a path on FreeBSD, they had to be put into the global config. That won't work now. Any other share sections in global config won't work either.
  2. A VFS module in 11.3 doesn't exist in 12.0. If it was for any reason specified explicitly in custom config, Samba won't work in 12 until the affected VFS module is removed from the config. I forget which it was.
Those were the main 2. As I said, almost nobody is affected, but it's incorrect to say it's fine, because in some cases users will have had to do either of those on 11.3, and they'll need fixing before Samba will work correctly on 12-BETA2. Apart from those? Seems rock solid.
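
For the first item, one way to check before upgrading is to scan the global auxiliary parameters for section headers. The Python sketch below is only an illustration: `share_sections` is a hypothetical helper, the sample text is invented, and you would paste in your own auxiliary parameters from the SMB service settings.

```python
import re

# A line consisting only of "[name]" starts a new smb.conf section.
SECTION_RE = re.compile(r"^\s*\[([^\]]+)\]\s*$")

def share_sections(aux_text):
    """Return the names of any [section] headers found in a block of
    auxiliary parameters meant for the global section."""
    names = []
    for line in aux_text.splitlines():
        match = SECTION_RE.match(line)
        if match:
            names.append(match.group(1).strip())
    return names

# Hypothetical global auxiliary parameters carried over from 11.3.
AUX = """\
log level = 1
[IPC$]
hosts allow = 192.168.1.0/24
"""

print(share_sections(AUX))
```

Any name it reports is a section embedded in the global config, which is the pattern described as no longer working.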
 

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
BETAs should not be used in a production environment.
Without in any way contradicting that, I needed the special vdev capability really badly (major project, dedup grinding because of DDT/metadata, needed special vdevs like yesterday to fix it).

So I checked explicitly with the devs how stable they reckoned 12-beta1 was against basic data loss, and cautiously migrated to 12-beta1 (but left the backup pool not upgraded so I could revert).
That was 7 weeks ago. It's been really good. I hammer ZFS locally and via SMB, and also use SSH and iSCSI. The general rule is true: don't upgrade production/live/important systems to betas. But this time I did, cautiously, and can say the issues have been annoyances, not severe, much as the devs had told me. It's still best to wait, but if for any reason you can't wait the few weeks till RC1, it looks really good, and I've now used it heavily and seen no sign of data at risk.

The issues I've had with beta2 have been things like: won't replicate (send/recv) the whole pool in one bite, heavy CPU use in some operations, middleware losing contact with the backend now and then, ... stuff I can live with. When 12-beta2 came along, I upgraded the pools fully and ditched backward compatibility with 11.3.

Again, that is not a recommendation in any way to rush it. But it may give an informal idea about the state of play of 12-beta2, if someone really has an urgent need. As always, backups and data safety. But tl;dr - even as a beta, before RC1, I already trust it, but be prepared for occasional small issues and workarounds on stuff still being fixed.
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
This is the best place to ask this very silly question.

I've been paranoid about upgrading disk feature flags for /several/ versions of FreeNAS.
I don't know just how far behind I am, except to say I'm behind, out of sheer paranoia.

I feel like it might be wise to at least have my feature flags match my final version before I move to TrueNAS Core 12.1.1 (I'll skip the first one).
So, is there any risk in me updating the feature flags on my system? I have multiple pools now, I think 4.
 

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
I'll defer to anyone who knows more, but this is my experience and understanding.

Many feature flags don't actually change things much. They enable features. If the feature flags aren't upgraded on a pool, then that pool remains guaranteed not to be using those features, and also guaranteed to remain usable (R/W) by older and forked versions of ZFS as used in previous releases or other *nix platforms. Once you enable a feature, the pool can cease to be importable by other versions of ZFS that don't support that feature: some features become active as soon as they are enabled, whether you make use of them or not.

You don't ever have to upgrade the feature flags for ZFS to work. All that'll happen if you don't, is that the pool will have those features disabled on it, when ZFS is doing things.

So enabling spacemap_v2 will tell ZFS it can use a more modern efficient type of spacemap to manage free space, which allows long objects to be included, but leaving it disabled just means ZFS won't use those, and will stick with short free space records, which other older versions of ZFS or other less featured platforms can also read.

That means you don't ever *have* to upgrade features. Indeed you can choose which to upgrade and which not, if you don't like some, or need to be able to transfer your pool to a system running FreeNAS 8.x or whatever.

But on the whole, the main reason for leaving features not upgraded is compatibility, not trust/reliability/"will it be okay". If you aren't ever going to go back to some older version of FreeNAS or some other platform running older ZFS anyway, then there isn't a need to keep the features disabled which that version of ZFS wouldn't recognise. It's that simple.

You don't really need to be "paranoid" of new feature flags as such. Really the issue is, which versions of ZFS/FreeNAS/*nix do you want to ensure you remain compatible with, because it's still conceivable you might move/revert your pool to them in future. Paranoia is not really merited beyond that. The feature flags are (or seem to be) very robust, and several of them may make *huge* differences to your pool's speed and efficiency.

Even if you won't use them, there's almost never harm in enabling anyway. Feature flags divide into 2 groups. Some, ZFS will automatically use because they are more efficient or faster (eg log_spacemap). Others, will only be actually used, if you yourself choose to use that feature (checkpoints, special vdevs etc). So you don't really have to cherrypick and wonder which to go for. Enable all, use the features you choose as you need them.

Last, be aware some features will only shine after existing data is rewritten, because existing already-written data won't be moved round and recoded on disk to reflect your flags if you do upgrade the feature flags. For example you could enable the "special vdevs" feature, but it'll only speed up new data created after that point, because your existing metadata and spacemap data won't be automatically moved to the new fast vdevs. For that, you need to rewrite the old data (usually with send/recv/replication).

I've added a list of current 12-beta flags below, with some comments on what they do, for reference. Meantime, have some "personal story" about upgrading flags to 12-beta. It may help.


My own experience:

A personal case study might be a one-off but might give an idea. I badly wanted to use a particular feature because dedup was grinding, and needed it badly enough to consider moving to 12 early, even at beta.

This is usually a Very Bad Idea Indeed, but I asked the devs what they felt about safety and was told they thought it was pretty solid against data loss with no reports of pool loss, and so I've found it: 12-BETA1/2 have annoyances rather than blockers for me, but have basically been rock solid for data safety, even as betas.

I imported my "live" pool onto 12.0-BETA1 and left its feature flags unchanged so I could revert to 11.3 if there were issues. I then added enough disks to make a duplicate of it, with all the 12.0 flags enabled, to ensure that when replicated it would have all data stored optimally using the new features - a lot of flags only affect newly written data, they don't change how existing data is stored. (For that you need to remake the pool). By BETA2 I was confident enough to upgrade my old pool to the new flags as well, and rebuild that back from the copied pool, because I wasn't realistically ever going back to 11.3 based on my experience of 12 even at beta stage.

To give some ideas what motivated that decision, in moving from 11.3 to 12.0-BETA2 with a deduped pool, I've used special vdevs, and the new sequential resilver/scrub handling. And this is what I got:

  • Pool replication (40 TB data deduped down to 14 TB): 8 days down to < 24 hours
  • Scrub/resilver: 1.5-2 days down to < 12 hours
  • Large scale "snapshot destroy -r" (~ 15k snaps as a script of thousands of destroy -r NAME), several a second rather than ~ 5-30 sec each
  • Deleting large directories from Windows via SMB, 600~1000 files/second deleted (!!!) vs stalling ("calculating...") behaviour on 11.3.
  • Local and SMB file moves/copies 300-400 MB/sec rather than slow with stalls and breaking pipes - that's because dedup makes *massive* demands on 4k IO and under 12-beta I can put that data on SSD and leave everything else on HDD. My SSDs on 12-beta are pulling 1/4 million IO's per second when busy, which is why 11.3 was stalling, with dedup data on HDD it just couldn't keep up. 12-beta can.
I think you can see why I felt I *had* to move early. I should emphasise that if I wasn't using dedup, those wouldn't be as compelling, because dedup adds huge demands that 11.3 struggled with. So you may not see my specific gains.

But you wanted reasons to calm "paranoia" and that's the best I can give. Under huge load it's just... gorgeous... for me, with the new features that 12-beta has enabled. It hasn't been an issue whatsoever: there's no need for me to keep a backup pool compatible with 11.3, and not doing so frees it from real issues 11.3 couldn't handle.

Full list of feature flags for 12-BETA2:

You can check what features are available on your system and which of them are not yet enabled ("disabled"), enabled or in use ("active") on your pools with this command: "zpool get all POOL_NAME | grep feature | sort". This is a full list, so many of these also existed on some older versions.
  • allocation_classes (ability to move metadata, and optionally dedup data and small files, off the pool's HDDs onto dedicated high-speed SSD vdevs reserved just for that purpose)
  • async_destroy (allows background processing of destroy)
  • bookmarks, bookmark_v2, redaction_bookmarks, redacted_datasets, bookmark_written (bookmarks and also redaction: ability to remove sensitive data when replicating with send/recv, eg for others to use a copy of it)
  • device_rebuild
  • device_removal (ability to remove top level vdev subject to certain conditions)
  • embedded_data
  • empty_bpobj
  • encryption (native encryption)
  • extensible_dataset
  • filesystem_limits
  • hole_birth
  • large_blocks
  • large_dnode
  • livelist
  • lz4_compress (and ZSTD on 12-RC1 when released) (compression algorithms - I'm adding zstd because that'll be in 12.0 rc1 but it's not yet in 12-beta2. ZSTD is said to be much more efficient and fast compared to the current "best choice" compression algorithm, as it's designed specifically to do a good job on the kinds of data seen and compressed at block level in a filing system not just general purpose compression uses.)
  • multi_vdev_crash_dump
  • obsolete_counts
  • project_quota
  • resilver_defer (ability to allow/prevent multiple resilvers running in parallel)
  • sha512, skein (hashing algorithms)
  • spacemap_histogram, spacemap_v2, log_spacemap (speed/efficiency improvements to free space processing)
  • userobj_accounting
  • zpool_checkpoint (ability to take an entire pool "snapshot" to revert changes to pool structure that wouldn't be preserved or roll-back-able, using ordinary zfs snapshot)
  • _txg
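
If you'd rather see the flags grouped by state than read the raw listing, a short Python sketch like the one below can do it. It assumes the usual NAME / PROPERTY / VALUE / SOURCE columns of `zpool get all`; the pool name `tank` and the sample output are made up for illustration, and `features_by_state` is a hypothetical helper.

```python
from collections import defaultdict

def features_by_state(zpool_get_output):
    """Group feature@ properties by state from `zpool get all POOL` output.

    Expects the usual NAME PROPERTY VALUE SOURCE columns.
    """
    grouped = defaultdict(list)
    for line in zpool_get_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1].startswith("feature@"):
            grouped[fields[2]].append(fields[1][len("feature@"):])
    return {state: sorted(names) for state, names in grouped.items()}

# Hypothetical output for a pool called "tank".
SAMPLE = """\
NAME  PROPERTY                    VALUE     SOURCE
tank  size                        10.9T     -
tank  feature@async_destroy       enabled   local
tank  feature@lz4_compress        active    local
tank  feature@allocation_classes  disabled  local
"""

for state, names in features_by_state(SAMPLE).items():
    print(state, names)
```

On a real system the text could come from `subprocess.run(["zpool", "get", "all", "tank"], capture_output=True, text=True).stdout` in place of `SAMPLE`.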
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
Read the excellent resource here: https://www.ixsystems.com/community/resources/zfs-feature-flags-in-freenas.95/

Both 11.3 and 12.0 add flags that become active once upgraded and can never revert to enabled. This means a pool upgraded on 11.3 or 12.0 becomes read-only in all prior versions of FreeNAS.

There are, in addition, features in 12.0 which, when used, make the pool unusable in all prior versions, until those features are all disabled again (may involve destroying the datasets that use the new features, such as encryption), at which point the pool becomes read-only again in prior versions.
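
The rule of thumb in the last two paragraphs can be written down as a small Python sketch. It is a simplified model based on my understanding of the three feature states (disabled, enabled, active) and of "read-only compatible" features; `import_mode` is a hypothetical helper and the example pools are invented, so treat it as an aid to reasoning and check the feature documentation for your version.

```python
def import_mode(pool_features, supported):
    """Decide how a pool could be imported by a given ZFS implementation.

    pool_features maps feature name -> (state, read_only_compatible), where
    state is "disabled", "enabled" or "active".
    supported is the set of feature names the importing system understands.
    """
    # Only features that are active AND unknown to the importer matter.
    blocking = [(name, ro_ok) for name, (state, ro_ok) in pool_features.items()
                if state == "active" and name not in supported]
    if not blocking:
        return "read-write"
    if all(ro_ok for _, ro_ok in blocking):
        return "read-only"
    return "not importable"

# Hypothetical older system that knows only a couple of features.
OLD_SYSTEM = {"async_destroy", "lz4_compress"}

upgraded_pool = {
    "lz4_compress": ("active", False),
    "spacemap_v2": ("active", True),     # active after upgrade, read-only compatible
    "encryption": ("enabled", False),    # enabled but not used yet
}
print(import_mode(upgraded_pool, OLD_SYSTEM))

encrypted_pool = dict(upgraded_pool, encryption=("active", False))
print(import_mode(encrypted_pool, OLD_SYSTEM))
```

In this model the upgraded pool comes out as read-only on the older system, and it stops being importable there once an encrypted dataset makes the encryption feature active.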

TrueNAS 12 uses OpenZFS 2.0 and a pool created there or upgraded to those flags will import fine on any other OpenZFS 2.0 system, such as Linux.

I typically upgrade flags once I know I won’t go back to a prior version. For example, I am on BETA2 and have already moved my pool to that version, no way back to 11.3.
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
A heck of a reply - thanks for this.
I'm ..... curious if I should upgrade to the beta after all.
Is it possible to change my train (yes) and switch to beta, then on the next reboot switch back to the old version if problems occur?

I recall changing trains being a real problem - does it lock out previous boot instances?
 