Finished replication but the files are in pool root

CheeryFlame

Contributor
Joined
Nov 21, 2022
Messages
184
I've finally finished transferring 35 TiB of data to my backup server, but I messed up and chose the pool itself as the destination.

I don't have enough space to do a replication from `tankname` to `tankname/datasetname`, for example.

The following won't work either, since it's not possible to rename a pool to a dataset.
Code:
zfs rename tankname tankname/datasetname

Is there a power-user trick I'm not thinking of, or shall I start to `mv` everything manually?
 
Joined
Oct 22, 2019
Messages
3,641
What does your layout look like now on the backup pool?

Code:
zfs list -t filesystem -r -o space tank


I'm thinking you could create a new sub-root dataset under "tank", such as "tank/backups", and then rename each child "into" it. (Tedious, but doable.)
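In rough commands that might look like this (just a sketch; "tank" and the child names are placeholders until we know the real layout):
Code:
zfs create tank/backups
zfs rename tank/child1 tank/backups/child1
zfs rename tank/child2 tank/backups/child2
# ...one rename per child dataset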

But before anything, just need to confirm the current layout.

I highly, highly advise you to create a pool checkpoint first (which you will later destroy if everything goes smoothly).
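For reference, a checkpoint is a one-liner (pool name "tank" assumed, as above):
Code:
zpool checkpoint tank
# and once everything looks good:
zpool checkpoint --discard tank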
 

CheeryFlame

Contributor
Joined
Nov 21, 2022
Messages
184
I ran your command and the result is the following:
Code:
NAME    AVAIL  USED   USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
sloche  19.9T  34.5T  116G      34.4T   0B             525M

I have 4 folders in sloche

sloche
|—archive
|—library
|—share
|—temp

I want these 4 folders to be moved into sloche/media

sloche
|—media
|——archive
|——library
|——share
|——temp
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Code:
zfs rename tank/sloche tank/media
zfs create tank/sloche
zfs rename tank/media tank/sloche/media
 
Joined
Oct 22, 2019
Messages
3,641
I have 4 folders in sloche
Folders or datasets?

Because according to your output, "sloche" itself contains 35 TiB of data. (Not its children.)

Is sloche the root dataset? (AKA the same as the name of the pool?)

If you curate the output of a command, it can confuse the other person. I'm not sure whether sloche is a dataset under "tank" or the root dataset itself.
 

CheeryFlame

Contributor
Joined
Nov 21, 2022
Messages
184
Is sloche the root dataset? (AKA the same as the name of the pool?)
No. sloche is the name of the pool. There's no dataset.

The point of this post is that the data is in the root of the pool.

1699475753043.png


As you can see there's no dataset at all.

If you look at this screenshot, it confirms that there's no dataset and the data is just sitting in the root of the pool.

1699475829235.png
 
Joined
Oct 22, 2019
Messages
3,641
Is that how it exists on your source pool as well? A single root dataset with a bunch of folders inside? No children datasets?
 

CheeryFlame

Contributor
Joined
Nov 21, 2022
Messages
184
Is that how it exists on your source pool as well? A single root dataset with a bunch of folders inside? No children datasets?
No, not at all. On the source server the 4 folders are in the media dataset.

This is how I created my replication. I thought it would have created the media dataset on the destination, but no: it only creates it if you select more than one dataset on the source to replicate. That was always my case before, since I had both backups and media selected. But I moved backups to another pool on the destination side because there isn't enough space there, and that's how I ended up in this situation.

1699476086011.png
 
Joined
Oct 22, 2019
Messages
3,641
I thought it would have created the media dataset on the destination, but no: it only creates it if you select more than one dataset on the source to replicate.

I've been saying it for a while now. The GUI for Replication Tasks is counter-intuitive and ambiguous. You're not the first to be struck by this.

What you're supposed to do (which I don't even think is documented by TrueNAS) is to manually write "/media" after "sloche" in the destination text field.

replication-gui-ambiguous.png




Sorry that you got bit by this "not a bug" bug.

Since it's on your root dataset, and your destination pool's capacity is greater than 50%, I'm not really sure what you can do. (Other than starting all over again.)
 
Last edited:

CheeryFlame

Contributor
Joined
Nov 21, 2022
Messages
184
Since it's on your root dataset, and your destination pool's capacity is greater than 50%, I'm not really sure what you can do. (Other than starting all over again.)
Wow! Can't I move the files instead? The replication over LAN is taking a lot of time. I already had to transfer everything over again because I had this box checked.

1699477556123.png


I must say that I'm having a sub-par experience with TrueNAS at the moment and it's leaving a bad taste in my mouth.

I hope they see this post and fix the GUI.

No one has this much time to waste. It's been impacting my business and my family.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Why didn't you do test cases with just a few Gigabytes, first? Similarly to your "keep only one snapshot on the destination" question. I explicitly recommended starting with, e.g. 3 days retention and then counting the snapshots on the destination to see what that actually does. Nobody here probably knows off the top of their heads. Go step by step. Do test cases first, do live data when the test cases work.

For your current problem you should be able to zfs create sloche/media and then move one folder at a time. I would use rsync to copy a single folder to the new media dataset, because rsync can be restarted after an interrupt for whatever reason. Then, after double-checking, delete the source folder.
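A minimal sketch of that for one folder (assuming the pool is mounted under /mnt and using "archive" as the example folder):
Code:
zfs create sloche/media
rsync -aHAX --info=progress2 /mnt/sloche/archive/ /mnt/sloche/media/archive/
# verify the copy, then remove the source folder
rm -rf /mnt/sloche/archive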

But then I must ask the question: wasn't this supposed to be a replication target? If you move the files, whether with mv or rsync, there will be no snapshot on the new media dataset that can serve as a common point for the next iteration.
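A quick way to check whether both sides still share a base snapshot (the source-side dataset name here is just a placeholder):
Code:
# on the source
zfs list -t snapshot -o name sourcepool/media
# on the destination
zfs list -t snapshot -o name sloche/media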

So you will probably have to start over from scratch. Sorry.
 

CheeryFlame

Contributor
Joined
Nov 21, 2022
Messages
184
Why didn't you do test cases with just a few Gigabytes, first?
If you read carefully everything I've written in this topic you'll see that the destination was working as it should. There's no reason one would have thought of doing a test transfer when the only thing that changed was unchecking the backups dataset. The replication GUI is poor and I agree with winnielinnie.

Similarly to your "keep only one snapshot on the destination" question. I explicitly recommended starting with, e.g. 3 days retention and then counting the snapshots on the destination to see what that actually does.
I did follow your instructions and set it to 3 days.

As for why I initially sent 48 TB of data: I've been doing many replications during the past year and I was confident that I understood the principle. Clearly I didn't.
 
Joined
Oct 22, 2019
Messages
3,641
I wonder if it's possible to do a "dance" with clones and promotions to bypass this hurdle?

I never tried this, but in my mind it might look something like the following (rough commands sketched after the list):
  1. Create a snapshot sloche@move
  2. Clone the snapshot sloche@move to a new dataset called sloche/media
  3. Promote sloche/media
  4. Destroy all snapshots of the root dataset sloche
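In raw ZFS commands, roughly (untested; names taken from the steps above):
Code:
zfs snapshot sloche@move
zfs clone sloche@move sloche/media
zfs promote sloche/media
# then destroy whatever snapshots remain on the root dataset "sloche"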

I can't even say for sure if that will work.

If you try it, make a checkpoint of your pool first, as a safety net just in case.


EDIT: I already see a problem with this. Does cloning and promoting "carry over" all the snapshots of the former dataset? If not, then it goes back to square one like @Patrick M. Hausen said: you won't have a common base snapshot between your source and destination.

EDIT 2: I tried a test and it worked as expected. The latest snapshot I made named "@clonebirth-{timestamp}" carried all previous snapshots with it after I promoted it to a new dataset. This left the original ("pretend" root dataset) as an empty dataset with zero snapshots and zero data. :smile:
 
Last edited:

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Iirc you can use the in-place rebalance script in my signature, provided you have enough space to hold the biggest file.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
@gravelfreeman I really do not intend to add insult to injury. @winnielinnie's idea might work; give it a try. Things cannot get worse at the destination.

Just to make sure we are all on the same page - you are using PUSH tasks, aren't you? There's another thread with destination retention behaving in unexpected ways and that user was using PULL. I suspect with all the snapshot naming magic going on, PUSH has definitely less potential for surprises.
 
Joined
Oct 22, 2019
Messages
3,641
I just did a test and it works...

@gravelfreeman, you might have a chance then. :smile:

The latest snapshot I made named "@clonebirth-{timestamp}" carried all previous snapshots with it after I promoted it to a new dataset. This left the original ("pretend" root dataset) as an empty dataset with zero snapshots and zero data.

I made a checkpoint before I did anything, and now I'm going to destroy the checkpoint since I don't need it.
 
Joined
Oct 22, 2019
Messages
3,641
Ah, crap. Nevermind. :frown:

What ends up happening is the former dataset becomes the new "clone". I'm not even sure that's possible with a true root dataset. (My test was on nested child datasets, trying to mimic your situation. I didn't mess with my actual root dataset.) And even if it was possible, I don't think you'd want your root dataset to forever be a clone of your promoted child dataset snapshot.
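If you want to see that relationship for yourself, the origin property shows it (dataset names below are just placeholders for this scenario):
Code:
zfs get origin sloche
# NAME    PROPERTY  VALUE                        SOURCE
# sloche  origin    sloche/media@clonebirth-...  -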
 
Last edited:

CheeryFlame

Contributor
Joined
Nov 21, 2022
Messages
184
Iirc you can use the in-place rebalance script in my signature, provided you have enough space to hold the biggest file.
I do have space for the biggest file: the transfer is 35 TiB and there's 20 TiB left. Although I doubt the snapshots will be transferred, no? I need to be able to do a replication from source to destination afterwards, and the rebalance has to be done on the destination...

Ah, crap. Nevermind. :frown:

What ends up happening is the former dataset becomes the new "clone". I'm not even sure that's possible with a true root dataset. (My test was on nested child datasets, trying to mimic your situation. I didn't mess with my actual root dataset.) And even if it was possible, I don't think you'd want your root dataset to forever be a clone of your promoted child dataset snapshot.
Damn I thought we were onto something then read the replies after your initial comments and saw it wouldn't work hehe! Thanks for trying.

Just to make sure we are all on the same page - you are using PUSH tasks, aren't you? There's another thread with destination retention behaving in unexpected ways and that user was using PULL. I suspect with all the snapshot naming magic going on, PUSH has definitely less potential for surprises.
Yes I'm using push and not pull. ZFS replication task is setup on source.
 
Joined
Oct 22, 2019
Messages
3,641
I honestly can't even tell you why there's such a thing as a mandatory root dataset for ZFS pools. I bet if you traced it back to its origins, you'd find out that the developers just shrugged their shoulders and figured "Eh, we'll just keep it that way. Who cares."

Unless someone can demonstrate the creation of a pool that contains no datasets within? Not even a root dataset. A pool in which you can have multiple root datasets?

Think of how much better ZFS would be if that were the case.

It seems arbitrary that a root dataset with the same name as the pool always exists upon creation of a brand-new pool.

A pool is a pool. A dataset is a dataset. Pool commands affect the pool (features, scrubs, vdevs), and dataset commands affect datasets (snapshots, datasets, compression, encryption, recordsize, etc.) Yet for some reason they went with this weird relationship between a pool and its mandatory root dataset...
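As a rough illustration of that split (pool and dataset names are placeholders):
Code:
# pool-level commands
zpool scrub BigPool
zpool status BigPool

# dataset-level commands
zfs set compression=lz4 BigPool/somedataset
zfs snapshot BigPool/somedataset@before-cleanup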

It removes flexibility and enforces strange rules and hierarchies (as seen in a situation like this.)

Ideally, we should have a brand new pool with zero datasets, such as:
  • BigPool
And then you create root datasets (yes, plural), such as:
  • BigPool
    • robert
      • archives
      • documents
      • media
      • torrents
    • sally
      • documents
      • media
    • billy
      • archives
      • documents
      • temporary
See? Then you can replicate an entire root dataset from another pool and stick it in as another root dataset.

But nope. For some reason you have no choice but to have the one-and-only root dataset "BigPool" under the pool "BigPool". (It's tacky, but you can still create your own pseudo-root datasets.)

I just don't understand why they went this route with ZFS?

Can you imagine if partitioning was like this?

"You want to create partitions under Drive1? Sure! But you must always have a master partition named Drive1, and only within it can you create multiple partitions..." :tongue:
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I do have space for the biggest file: the transfer is 35 TiB and there's 20 TiB left. Although I doubt the snapshots will be transferred, no? I need to be able to do a replication from source to destination afterwards, and the rebalance has to be done on the destination...
Yes, the snapshots won't be usable anymore.
 