Convert/move a directory created via CLI into a dataset?

JV9

Dabbler
Joined
Aug 19, 2021
Messages
25
Below is a snippet of zfs list on my TrueNAS Core:

Code:
tank/plex                                              15.2T  1.66T     15.2T  /mnt/tank/plex
tank/plex/media                                         200K  1.66T      200K  /mnt/tank/plex/media
tank/plex/media_gdrive                                  200K  1.66T      200K  /mnt/tank/plex/media_gdrive
tank/plex/media_local                                   256K  1.66T      256K  /mnt/tank/plex/media_local


tank/plex/media is mounted via mergerfs and combines media_gdrive (an encrypted Google Drive with media) and media_local (a download location synced periodically to Google Drive).

I created a /mnt/tank/plex/newmedia directory via the CLI (not via the GUI, so it is not a ZFS dataset) and spent the last several days moving 14TB of data off of Google Drive and into newmedia.

Ideally, I could now (after unmounting and ensuring no processes are using it) remove tank/plex/media, mv newmedia to media, and mount it as the tank/plex/media ZFS dataset.

Is this doable? Or am I going about this the wrong way?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
You can destroy the '/mnt/tank/plex/media' dataset and then recreate it (it will be empty, of course), or you could just delete all of the data in it, whichever is faster I guess, and then 'mv' the data from '/mnt/tank/plex/newmedia' to '/mnt/tank/plex/media'. There are a few ways you could do it; stick with the one you feel comfortable with.
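
Roughly, the destroy-and-recreate route would look something like this (a sketch only; the dataset names come from the zfs list output above, and nothing should still be using the old mountpoint, including the mergerfs mount):

Code:
# Sketch only -- double-check dataset names against 'zfs list' first.
zfs destroy tank/plex/media                          # removes the old dataset and anything in it
zfs create tank/plex/media                           # recreate it; it mounts at /mnt/tank/plex/media
mv /mnt/tank/plex/newmedia/* /mnt/tank/plex/media/   # crosses datasets, so this is really a copy + delete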

14TB, that is a lot of data.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
From what I am reading, you put the 14TB of data into the "tank/plex" dataset, via the "newmedia" directory. So you can't just rename it and expect it to be the "tank/plex/media" dataset. It already is in a dataset, but perhaps not the one you desire.

If you truly desire the 14TB of files to be in a child dataset of "tank/plex", like "tank/plex/media", then you have to move the files from where they are now, over to the different ZFS dataset.

One reason ZFS separates datasets is that each dataset can have different attributes, so a simplified "re-linking" move would not work. For example, if the record size of the "media" dataset is increased to, say, 1MByte while the original "newmedia" data was written with the default of 128KByte, you can get increased efficiency for large files after the move.
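
As a concrete illustration of the attribute point (a sketch; the 1MByte value is just the example above, not a recommendation):

Code:
zfs get recordsize tank/plex tank/plex/media   # compare the current record sizes of the two datasets
zfs set recordsize=1M tank/plex/media          # larger records can suit big sequential media files
# recordsize only applies to data written after the change, so files copied in later pick it up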
 

JV9

Dabbler
Joined
Aug 19, 2021
Messages
25
Ultimately, my goal is to move all of the contents of newmedia to media as quickly as possible.

Currently newmedia does not show up as a dataset via the GUI or zfs list, since I created the newmedia directory via the CLI.

In my experience with Unix-likes, moves within a filesystem are basically instantaneous, vs. copy or mv ACROSS filesystems.

In the FreeBSD/TrueNAS world, are ZFS datasets considered separate filesystems? It seems like a dataset is just an abstraction on top of the underlying filesystem, but I may be very wrong.
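
For reference, one way to check is to compare what df reports for the two locations. A sketch using the mountpoints from the zfs list above (media_local instead of media, since media sits behind the mergerfs mount):

Code:
df -h /mnt/tank/plex/newmedia /mnt/tank/plex/media_local
# newmedia resolves to the tank/plex filesystem, media_local to tank/plex/media_local --
# two different filesystems, so rename(2) cannot move files between them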

I'm going to do a test of a small directory via the method joeschmuck suggested above.
 

JV9

Dabbler
Joined
Aug 19, 2021
Messages
25
It appears I am indeed wrong.

I just started a mv of a 39GB directory from newmedia to media and when I run top, I see cp is actually running instead...

So I guess behind the scenes, the OS is copying each file then deleting the original?

Once the move (aka copy) completes, some other processes take over, my load average goes over 30, and the CLI becomes laggy and unresponsive.

Ugh...
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Don't you love FreeBSD?

Once the move (aka copy) completes, some other processes take over, my load average goes over 30, and the CLI becomes laggy and unresponsive.
For how long? Any idea what those processes are? Is Plex by chance running? How about disabling any VMs/jails you have running during this process?
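
A couple of quick checks for that (a sketch; standard FreeBSD commands on Core):

Code:
jls                     # lists any running jails
ps ax | grep -i bhyve   # bhyve processes would indicate running VMs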
 

JV9

Dabbler
Joined
Aug 19, 2021
Messages
25
For too long... I haven't timed it, but it's significantly longer than the actual "mv" operation took.

No jails or VMs running, including Plex.

Here's a snapshot of top output while the CLI is laggy:

Code:
last pid:  4259;  load averages: 28.82, 24.30, 16.70                                            up 0+01:08:15  09:32:51
61 processes:  5 running, 56 sleeping
CPU:  0.0% user,  0.0% nice, 39.3% system,  0.1% interrupt, 60.5% idle
Mem: 29M Active, 385M Inact, 963M Laundry, 11G Wired, 18G Free
ARC: 3122M Total, 29M MFU, 3073M MRU, 32K Anon, 8104K Header, 11M Other
     2883M Compressed, 2982M Uncompressed, 1.03:1 Ratio
Swap: 6144M Total, 6144M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
 1738 root         31  20    0   491M   295M kqread   6   0:39   0.08% python3.9
 3140 avahi         1  20    0    14M  4640K CPU1     1   0:03   0.03% avahi-daemon
 4259 root          1  20    0    14M  4176K CPU2     2   0:00   0.02% top
 2777 root          1 -52   r0    13M    13M nanslp   3   0:00   0.02% watchdogd
 2804 root          1  20    0    28M  2856K select   4   0:00   0.00% mountd
 2839 ntpd          1  20    0    21M  7136K select   7   0:00   0.00% ntpd
 2868 root          8  20    0    56M    15M select   6   0:09   0.00% rrdcached
 3422 root          1  20    0    20M  9092K select   7   0:00   0.00% sshd
 2962 www           1  20    0    38M  9764K kqread   1   0:00   0.00% nginx
 2920 root          1  20    0    52M    26M kqread   7   0:00   0.00% winbindd
 1801 root         18  20    0   117M    23M CPU5     5   0:00   0.00% libvirtd
 2927 root          1  20    0    91M    66M kqread   0   0:00   0.00% winbindd
 2782 root          1  20    0    13M  2700K select   3   0:00   0.00% rpcbind
 3264 root          1  20    0    13M  2560K select   5   0:00   0.00% nfsuserd
 3261 root          1  20    0    13M  2560K select   3   0:00   0.00% nfsuserd
 3262 root          1  20    0    13M  2560K select   5   0:00   0.00% nfsuserd
 3263 root          1  20    0    13M  2560K select   4   0:00   0.00% nfsuserd
 2813 root         16  52    0    12M  2672K rpcsvc   4   0:00   0.00% nfsd
 1854 root          3  20    0   284M   184M piperd   3   0:13   0.00% python3.9
 2971 root         11  20    0   114M    52M nanslp   4   0:11   0.00% collectd
 2030 root          3  20    0   239M   181M usem     5   0:11   0.00% python3.9
 2029 root          3  20    0   234M   181M usem     0   0:10   0.00% python3.9
 2027 root          3  20    0   233M   180M usem     4   0:09   0.00% python3.9
 2028 root          3  20    0   232M   179M usem     5   0:09   0.00% python3.9
 1900 root          1  20    0    65M    51M zevent   7   0:01   0.00% python3.9
 3071 root          1  52    0    13M  2568K ttyin    5   0:00   0.00% getty
 2955 root          1  20    0    15M  5504K nanslp   4   0:00   0.00% smartd
 2912 root          1  20    0   140M   112M kqread   1   0:00   0.00% smbd
 3170 root          1  52    0    40M    26M kqread   6   0:00   0.00% python3.9
 3377 root          1  20    0    20M  9068K select   7   0:00   0.00% sshd
 1721 root          1  21    0    19M  6580K select   3   0:00   0.00% zfsd
 3379 root          1  20    0    15M  5000K ttyin    6   0:00   0.00% zsh
 2755 root          2  20    0    38M    11M kqread   2   0:00   0.00% syslog-ng
 1740 root          1  20    0    23M    12M piperd   0   0:00   0.00% python3.9
 1705 root          1  20    0    11M  2020K select   1   0:00   0.00% devd
 3424 root          1  20    0    15M  4748K pause    6   0:00   0.00% zsh
 2812 root          1  52    0    27M  2564K select   4   0:00   0.00% nfsd
 3226 root          1  20    0    13M  2820K nanslp   0   0:00   0.00% cron
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Hopefully it does not take a lot of time and your system returns to normal soon. I guess the question is whether you will be able to move all that data using this method without being too annoyed by it.
 

JV9

Dabbler
Joined
Aug 19, 2021
Messages
25
Just realized a zpool scrub had been running since 3am and had scanned 7TB.

I cancelled it; hopefully performance for the copy and the CLI improves.
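
For anyone following along, checking on and stopping a scrub looks something like this (a sketch; 'tank' is the pool from the zfs list above):

Code:
zpool status tank     # shows whether a scrub is running and how far along it is
zpool scrub -s tank   # -s stops (cancels) an in-progress scrub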
 

JV9

Dabbler
Joined
Aug 19, 2021
Messages
25
For anyone curious, here's output from ps fax:

Code:
4495  0  I+     0:00.00 mv Music ../media
4496  0  D+     1:20.71 mv -PRp -- Music ../media/Music (cp)


PID 4495 is the command I ran on the CLI.
PID 4496 is what's actually happening behind the scenes.

man mv has this to say on the subject:

Code:
     As the rename(2) call does not work across file systems, mv uses cp(1)
     and rm(1) to accomplish the move.  The effect is equivalent to:

           rm -f destination_path && \
           cp -pRP source_file destination && \
           rm -rf source_file
 

JV9

Dabbler
Joined
Aug 19, 2021
Messages
25
Yup, it was the scrub that was killing performance.

Now the CLI is pretty unresponsive while the mv/cp is taking place, but goes back to normal as soon as the cp is done.

Live and learn....
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Glad it is working better. Hopefully the move will be fairly quick.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Glad you figured it out.

Yes, ZFS datasets in the same ZFS pool act as different file systems, so a "mv" is actually a "cp". As I explained, there can be differences in the dataset attributes that a simple re-linking would not honor, potentially breaking the "rules" set up via the destination dataset's attributes.

ZFS does asynchronous writes by default, so it uses memory as a buffer to build a "transaction group" out of a bunch of individual writes. In general, there can be 2 outstanding transaction groups pending to be written before writes pause and wait. Each transaction group is complete with data, metadata (aka directory info), and critical metadata (aka free lists and the uberblock at the top of the Merkle tree). So when a transaction group completes, a crash will not require any file system recovery. And if a transaction group write fails before the uberblock update because of a computer crash, it will be like it never happened. Thus, ZFS is always consistent; no fsck needed.

This means that smaller writes that fit into 2 transaction groups seem to happen very fast. But your move of 14TB will be I/O bound, since the reads and writes hit the same pool member disks.
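
To watch that in practice while the big move runs, something like zpool iostat gives a live view of the pool's read and write throughput (a sketch; 'tank' and the 5 second interval are just examples):

Code:
zpool iostat -v tank 5   # per-vdev read/write operations and bandwidth, refreshed every 5 seconds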
 

Haibane

Dabbler
Joined
Oct 22, 2023
Messages
18
Same issue here. If it weren't for the fact that I bought SAS drives, I would long since have been using them as good old USB hard drives.
 