Any way to copy sparse files faster?

Status
Not open for further replies.

JerryS

Cadet
Joined
Apr 8, 2014
Messages
8
Hi,

I was wondering if there is a better way to move big sparse files around. I have some disk image files that compress at better than 50:1 (524G -> 8.3G). When I copy these around with dd, cat, or cp, it ends up shuffling all the bytes, and a core is pegged at 100% CPU while the system churns through lots and lots of empty file space. Even at 5GB/s out of cache, it still takes a while to do anything with these files.
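For example, something like this (image file name hypothetical) has to read every byte of the logical file, holes included, so it churns through the full 524G even though only ~8.3G is allocated:

Code:
# dd walks the entire logical size, zeros and all
dd if=vm-image.img of=/dev/null bs=1M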

Is there a way to copy these files on the NAS that just moves the compressed blocks? I figure with ZFS being copy-on-write the answer is no, but I wanted to check.

thanks,
jerry
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Ok, so let me make sure I understand you....

Pool A has some really big file that compresses really well and you want to move it to _____?

Move it to your desktop, move it to another pool on the same server, move it to another pool on a different server? I'm confused.
 

eraser

Contributor
Joined
Jan 4, 2013
Messages
147
When you mention sparse files, I think of files with long runs of zeros (or nulls) in them. If you compress a sparse file, you will end up with a "normal" file (all those zeros are "compressed" into nothing). Are you asking how to speed up the copy time of a normal compressed file?
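As a quick illustration (file name hypothetical), you can make a sparse file yourself and compare its two sizes:

Code:
# Create a 1G file without writing any data blocks
truncate -s 1G sparse.img
ls -l sparse.img    # logical size: 1G
du -h sparse.img    # actual allocation: near zero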
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If the file is compressed, it contains close to the bare minimum of information required to reassemble the large file (not quite the minimum, but close if you're doing things properly). You can't magically skip transmitting data that is required for the file to be reassembled and still expect it to be reassembled.
 

JerryS

Cadet
Joined
Apr 8, 2014
Messages
8
Sorry, I was away for a while.

These files are disk images of VMs from an old host system. I first put them on one pool and then needed to move them to another pool on the same machine. Since the files were big, I thought I would use them for some local performance measurements. ls -s showed that the files were only ~8GB, but the copy time was in minutes. Also, there were times when either cp or dd was at 100% on a core and there was no disk activity at all. I was mystified until I looked more closely at the dd output: it said it had moved over 500GB. An ls -l on the file confirmed that the logical size was 500+GB.

Based on this, I now do all my testing on compressed files so I don't get anything unexpected.

This is not a critical thing, but I was just curious whether it was possible to move things like this in the squeezed form. None of the obvious things worked. If I ever need to push them over the net, then it will be much more important.

thanks,
jerry
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

So you're using ZFS compression?
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371


If they're true sparse files, they report their full logical size, but the holes are never actually allocated in the filesystem. It's conceptually closer to deduplication than compression.
In any event, you need a backup method that understands the filesystem and the sparseness of any files.

In the old days of UFS, dump(8) could do this; with ZFS, it'd probably be zfs send and zfs receive. You might be able to use zfs send (essentially ZFS replication) to get the sparse files moved, but anything that just reads files is going to see the whole file and miss the sparseness when it writes the destination file.
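A minimal sketch of that approach (pool and dataset names hypothetical): snapshot the source dataset, then replicate it block-for-block, holes and all.

Code:
# Replicate at the block level instead of reading through the files
zfs snapshot poolA/images@move
zfs send poolA/images@move | zfs receive poolB/images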

It looks like you might be able to do it with the right options to tar or rsync (--sparse), but there seems to be a lot of discussion on Google about whether that actually works for moving VM images or not.
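Hedged sketches of both, assuming GNU tar and rsync are available, with hypothetical paths:

Code:
# rsync -S (--sparse) tries to detect runs of zeros and recreate holes
rsync -aS /mnt/poolA/images/ /mnt/poolB/images/

# GNU tar's -S (--sparse) stores sparse members efficiently
tar -cSf - -C /mnt/poolA images | tar -xf - -C /mnt/poolB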
 