Does anyone use temp storage / scratch disk for downloads?

Status
Not open for further replies.

kschaffner

Cadet
Joined
Jan 28, 2016
Messages
9
I've been downloading and extracting files directly in my pool, and I've read that this can lead to fragmentation. I have a couple of questions about this, since information on it is seemingly hard to find online.
First, does downloading and extracting directly in the pool cause fragmentation, and is it a bad thing?
Second, does anyone here use an extra HDD/SSD as a download and extract location before moving the data to the pool?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
This is actually something that gets discussed periodically; I remember that a year or two ago someone asked almost exactly the same question. If you are downloading torrents directly to the pool, the file comes down in non-consecutive chunks to begin with, which makes it fragmented from the start. If you download it to one location on the pool and then copy it to another location, the copy eliminates the fragmentation of the file, but when you delete the original file it leaves fragmented free space. That is the problem: there is no defragmentation utility for ZFS.
First, does downloading and extracting directly in the pool cause fragmentation, and is it a bad thing?
Yes, this causes fragmentation.
Second, does anyone here use an extra HDD/SSD as a download and extract location before moving the data to the pool?
Yes. I don't know exactly who does what, but I download to separate storage. Transcode if needed. Then copy the finished file to main storage. So, because I handle my data that way, my fragmentation is very low:
Code:
# zpool list
NAME           SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
Backup        21.8T  16.4T  5.31T         -     0%    75%  1.00x  ONLINE  /mnt
Emily         43.5T  16.6T  26.9T         -     0%    38%  1.00x  ONLINE  /mnt
Irene         43.5T  17.2T  26.3T         -     0%    39%  1.00x  ONLINE  /mnt
freenas-boot  37.2G  10.4G  26.9G         -      -    27%  1.00x  ONLINE  -
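The download-then-copy workflow described above can be sketched as a small POSIX shell helper. This is a minimal illustration only; the function name and the example paths are my own assumptions, not part of any FreeNAS tooling.

```shell
#!/bin/sh
# Move finished downloads from a scratch disk to the main pool.
# Each item is copied in one sequential pass (so it lands contiguously
# on the pool) and the scratch copy is removed only if the copy succeeded.
move_completed() {
    scratch=$1   # e.g. /mnt/scratch/complete  (assumed path)
    pool=$2      # e.g. /mnt/tank/media        (assumed path)
    for f in "$scratch"/*; do
        [ -e "$f" ] || continue
        # cp then rm only on success, so a failed copy never loses data
        cp -R "$f" "$pool"/ && rm -rf "$f"
    done
}

# Example usage (paths are assumptions):
# move_completed /mnt/scratch/complete /mnt/tank/media
```

You could run something like this from cron, or point your download client's "completed" hook at it.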
 

kschaffner

Cadet
Joined
Jan 28, 2016
Messages
9

Thank you for responding and filling me in. I ran the same zpool list command and I see 0% fragmentation, since I just migrated my data to a new pool of disks. I went ahead and put in a spare 128GB SSD to use as a temporary download and extraction location before moving completed files to the pool!
 

garm

Wizard
Joined
Aug 19, 2017
Messages
1,556
For this and other reasons I don't "work" directly on my storage pool. I have a mirrored set of SSDs for jails, database writes, and temp folders. I then write "persistent" data to my main storage pool. The only exception is my Lightroom workflow; there I work directly on the storage pool.
 

kschaffner

Cadet
Joined
Jan 28, 2016
Messages
9
I was thinking of doing the same thing, but I only have one jail running right now, since I have FreeNAS running on my ESXi host and everything else is VMs. I have been having trouble getting the metadata out of the Plex jail and onto an Ubuntu VM where it will play nicely and not refuse to launch the service.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Fragmentation on ZFS is fragmentation of free space and it matters because every time you write anything it must find free space to execute that write. The more it needs to hunt for free space, the slower your writes will be.
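A quick way to keep an eye on this is to watch the FRAG column that `zpool list` already reports. The helper below is only an illustrative sketch; the `check_frag` name and the example 50% threshold are arbitrary assumptions.

```shell
# Flag pools whose free-space fragmentation exceeds a threshold.
# Feed it the output of: zpool list -H -o name,frag
check_frag() {
    threshold=$1
    awk -v t="$threshold" '
        { gsub(/%/, "", $2) }                 # strip the % sign
        $2 != "-" && $2+0 > t {               # skip pools that report "-"
            print $1 " fragmentation " $2 "% exceeds " t "%"
        }'
}

# Example usage on a live system:
# zpool list -H -o name,frag | check_frag 50
```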
 

freedombacon

Dabbler
Joined
Jun 22, 2015
Messages
23
I download torrents with the Transmission plugin. The torrents are in their own dataset, not a pool. Is there a way to see how fragmented that dataset is? Would moving the data once it's done downloading help reduce the fragmentation? I'm trying to support some distros by sharing their ISOs, so I would prefer to keep seeding as long as possible.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
The torrents are in their own dataset, not a pool.
I think you have a misunderstanding of the terminology with regard to ZFS. I will give you some links to review that I hope will give you a better understanding:

Slideshow explaining VDev, zpool, ZIL and L2ARC
https://forums.freenas.org/index.ph...ning-vdev-zpool-zil-and-l2arc-for-noobs.7775/

Terminology and Abbreviations Primer
https://forums.freenas.org/index.php?threads/terminology-and-abbreviations-primer.28174/

To give you a super short answer: a pool is a collection of disks that are controlled by ZFS and mapped into a single logical storage space. Inside that storage space you can create datasets, and any fragmentation caused by a dataset affects the entire pool, not just that dataset.

Copying the data to a different directory or different dataset would have the effect of defragmenting the file, but the free space on the pool would become more fragmented over time. This effect is magnified if you download multiple torrents simultaneously, because each file being downloaded in a torrent comes down in randomly arranged chunks, so the data is randomly arranged on disk as it is downloaded. When it is moved to a single contiguous file, it leaves randomly arranged void spaces on disk to be filled later by subsequent writes.

ZFS will first try to use all the contiguous free space, but when it runs out of contiguous space, it will need to hunt through the random void spaces to find a place to put new writes. This slows the write process over time. The article linked above (https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSZpoolFragmentationMeaning) explains it correctly but does not go into great detail.
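Since fragmentation is a pool-wide figure, the closest thing to a per-dataset lookup is mapping a dataset back to its pool, which is simply the first component of the dataset name. A trivial sketch (the `pool_of` helper and the `tank/torrents` name are made-up examples, not real tooling):

```shell
# Given a ZFS dataset name like "tank/torrents", return the pool it
# lives on -- everything before the first "/". The pool's FRAG value
# (from `zpool list`) is the only fragmentation figure that exists.
pool_of() {
    printf '%s\n' "${1%%/*}"
}

# Example: a Transmission dataset tank/torrents lives on pool "tank",
# so its fragmentation is whatever `zpool list -o name,frag tank` reports.
```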
 

freedombacon

Dabbler
Joined
Jun 22, 2015
Messages
23
Yes, I know that part. My FreeNAS is mainly used for storing my backups, movies, and music; large files or sets of files all being written at once. I'm the only user. Bittorrent is the only thing that should be doing random writes. I would like to see the fragmentation by dataset if that's possible to confirm that.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I would like to see the fragmentation by dataset if that's possible to confirm that.
I already told you that is not how it works. Think of a swimming pool. Just because there is a shallow end and a deep end does not mean that the water is different water. It is all the same fragmentation.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Understood. Thanks.
I have a FreeNAS that has multiple pools and each pool can have different fragmentation, but all datasets within a pool are at the same fragmentation as the pool.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It is a valid approach to download to one dataset and then copy it over to defragment the file as much as possible. It's better overall in the case of torrents.*

* Merely an informed opinion, no real data behind it
 