SOLVED: 100GB ZVOL using 212GB with only 18GB of actual data


AndrewH

Dabbler
I've been trying to figure this out for quite some time. Most people say it's snapshots, or maybe metadata, but that can't really be the case here.

For reference, I have a Windows 7 VM that uses a ZVOL for storage. The ZVOL is sized at 100GB, and all ZVOLs are presented to the VM over VirtIO. The actual data on the VM is about 8GB. I noticed that this ZVOL was slowly but surely getting bigger each day, even though nothing is being added to the disk, so the obvious question is: what is going wrong here? Here's how it looks when I run zfs list -ro space -t all rpool/data/ZVOLs/vm-102-disk-0
Code:
NAME                                                                 AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
rpool/data/ZVOLs/vm-102-disk-0                                        228G   303G     90.8G    212G             0B         0B
rpool/data/ZVOLs/vm-102-disk-0@zfs-auto-snap_weekly-2018-08-26-0447      -   695M         -       -              -          -
rpool/data/ZVOLs/vm-102-disk-0@zfs-auto-snap_weekly-2018-08-26-2159      -   611M         -       -              -          -
...followed by many, many snapshots

I understand that the snapshots account for the 90.8GB; that much is clear to me, even if I find it a lot for a Windows machine. But the USEDDS value is still 212GB, which is extreme compared to what it should be, and also compared to the other disks attached to this VM. For example, disk-1 on the same VM, a 1GB ZVOL, looks far more reasonable:

Code:
NAME                                                  AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
rpool/data/ZVOLs/vm-102-disk-1                         246G   812M      407M    405M             0B         0B
rpool/data/ZVOLs/vm-102-disk-1@auto-20180615.0900-1m      -   232K         -       -              -          -
rpool/data/ZVOLs/vm-102-disk-1@auto-20180616.0900-1m      -   232K         -       -              -          -
...snapshots
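
For completeness, the same numbers can be broken down per property with zfs get (property names as in the zfs(8) man page):
Code:
zfs get volsize,volblocksize,used,usedbydataset,usedbysnapshots,refreservation rpool/data/ZVOLs/vm-102-disk-0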


I think something went terribly wrong with this ZVOL, and I don't want to lose any data if possible. It's worth mentioning that I turned off the pagefile inside the VM, thinking that it might be swapping to disk. What can I do here, and does anyone have any idea what might be causing this issue?

Later edit: I had another look just now; a full backup of the VM is 17.9GB, so that's the entire data set spread across 3 ZVOLs. Second, I rechecked the volblocksize and got the following:

Code:
rpool/data/ZVOLs/vm-102-disk-0  volblocksize  512  -


and on the other disk
Code:
rpool/data/ZVOLs/vm-102-disk-2  volblocksize  16K  -



I still find it quite extreme that 18GB of data can grow to in excess of 200GB; that's over 1000%. I also find it hard to believe that the blocksize alone does this. Maybe there should be a big warning in the tutorials and recommendations for Windows VMs. It's also surprising that the 16K ones work.

So my question now obviously is: how can I best move/convert that ZVOL to a 16K ZVOL without losing any data and, if possible, with minimal downtime for the VM?
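
One possible approach, as a minimal sketch (the -new name is hypothetical, and the VM would need to be shut down for the copy): a plain zfs send/receive keeps the source's volblocksize, so the data has to be copied at the block-device level into a ZVOL created with the new block size.
Code:
# create the replacement ZVOL with the desired block size
zfs create -V 100G -o volblocksize=16K rpool/data/ZVOLs/vm-102-disk-0-new
# raw copy of the disk contents (with the VM shut down)
dd if=/dev/zvol/rpool/data/ZVOLs/vm-102-disk-0 of=/dev/zvol/rpool/data/ZVOLs/vm-102-disk-0-new bs=1M
# then repoint the VM at the new ZVOL and destroy the old one once verified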
 

Ericloewe

Server Wrangler
Moderator
The actual data on the VM is about 8GB.
You sure about that? IIRC, a clean Windows 7 x64 SP1 install was something like 15 GB.

Are you using bhyve? How is the virtual disk set up?
 

AndrewH

Dabbler
You sure about that? IIRC, a clean Windows 7 x64 SP1 install was something like 15 GB.

Are you using bhyve? How is the virtual disk setup?

This VM was originally created on bhyve from the GUI, following one of the tutorials available on the forums/wikis; I can't remember which. The VM had this initial vdisk, boots via EFI because that's the only way to get VNC working, and is nothing more than an "app server" for some accounting and licensing software. It does nothing else. It is Windows 7 x64 Pro with SP2, if I remember correctly.

The partitions are EFI, Recovery, a system drive of about 30GB, and the rest as a data drive.

As we later learned that rolling back a snapshot of the whole disk would leave the various services in undesirable states, we added another three 1GB ZVOLs (created with the default settings; it was a quick fix while the fire was burning) that we can snapshot and restore separately, depending on which service is causing problems.

More recently, as some of the VMs running on FreeNAS were becoming more and more important, we added a Proxmox server on ZFS, and most of the VMs, including this one, were moved to that machine with zfs send/zfs receive.

Importantly, the ZVOL was already large at this stage; it actually got smaller in the migration because it lost some of the snapshots. Everything fired up, and after a reboot everything was working as before, but the ZVOL slowly continued to grow.

And that's where we are today. I'm posting on this forum because the FreeNAS community will definitely have more ZFS experience than the people at Proxmox.
 

Ericloewe

Server Wrangler
Moderator
It is a Windows 7 x64 Pro with SP2 if I remember correctly.
Oh, I wish there were an SP2; it would've saved me probably a day or three of my life over the past five or so years.

Anyway, we need more information to properly narrow this down:
  • zfs list rpool, at a minimum with the basic space output
  • zpool list rpool
  • Under Windows, run the following commands:
Code:
diskpart
list disk
list volume


Without exiting diskpart, run
Code:
select disk n
list part

For all n contained in the output of list disk.
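
It may also be worth checking what sector and cluster sizes Windows sees on the system volume (an extra diagnostic, assuming C: is the system drive; fsutil ships with Windows):
Code:
fsutil fsinfo ntfsinfo C: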
 

AndrewH

Dabbler
As requested,
Code:
root@as01:~# zpool list rpool
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
rpool   928G   495G   433G         -    38%    53%  1.00x  ONLINE  -
root@as01:~# zfs list rpool
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool   676G   223G   104K  /rpool


From the Windows VM:

Code:
DISKPART> list disk

  Disk ###  Status         Size     Free     Dyn  GPT
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online          100 GB    63 GB        *
  Disk 1    Online         1024 MB  1920 KB        *
  Disk 2    Online         1024 MB  1920 KB        *

DISKPART> list volume

  Volume ###  Ltr  Label        Fs     Type        Size     Status    Info
  ----------  ---  -----------  -----  ----------  -------  --------  --------
  Volume 0     D                       CD              0 B  No Media
  Volume 1     C                NTFS   Partition     31 GB  Healthy   Boot
  Volume 2     Z   Other Apps   NTFS   Partition   4998 MB  Healthy
  Volume 3                      FAT32  Partition    100 MB  Healthy   System
  Volume 4     E   ELBA         NTFS   Partition    990 MB  Healthy
  Volume 5     F   FiBu         NTFS   Partition    990 MB  Healthy

DISKPART> select disk 0

Disk 0 is now the selected disk.

DISKPART> list part

  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    System             100 MB  1024 KB
  Partition 2    Reserved           128 MB   101 MB
  Partition 3    Primary             31 GB   229 MB
  Partition 4    Primary           4998 MB    31 GB

DISKPART> select disk 1

Disk 1 is now the selected disk.

DISKPART> list part

  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    Reserved            32 MB    17 KB
  Partition 2    Primary            990 MB    32 MB

DISKPART> select disk 2

Disk 2 is now the selected disk.

DISKPART> list part

  Partition ###  Type              Size     Offset
  -------------  ----------------  -------  -------
  Partition 1    Reserved            32 MB    17 KB
  Partition 2    Primary            990 MB    32 MB


The main problem is with the ZVOL for disk 0. I think I can try restoring the VM from backup under a different name, thus making it create new ZVOLs for the VM.
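
If that works, the block size of the newly created ZVOLs could then be verified with something like:
Code:
zfs get -r volblocksize rpool/data/ZVOLs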
 

Ericloewe

Server Wrangler
Moderator
I forgot to ask what the pool layout is and what block size the virtual disk uses.
 

AndrewH

Dabbler
I forgot to ask what the pool layout is and what block size the virtual disk uses.
The pool looks like this:
Code:
root@as01:~# zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 4h47m with 0 errors on Sun Aug 12 05:11:35 2018
config:

	NAME											  STATE	 READ WRITE CKSUM
	rpool											 ONLINE	   0	 0	 0
	  mirror-0										ONLINE	   0	 0	 0
		sda2										  ONLINE	   0	 0	 0
		sdb2										  ONLINE	   0	 0	 0
	logs
	  mirror-1										ONLINE	   0	 0	 0
		scsi-36782bcb06c14e70022d70d9f615d6d10-part1  ONLINE	   0	 0	 0
		scsi-36782bcb06c14e70022d70da2618d44cf-part1  ONLINE	   0	 0	 0
	cache
	  scsi-36782bcb06c14e70022d70d9f615d6d10-part2	ONLINE	   0	 0	 0
	  scsi-36782bcb06c14e70022d70da2618d44cf-part2	ONLINE	   0	 0	 0



The blocksize I already mentioned in my first post: 512 for the problematic ZVOL and 16K for the other one. I tried the backup-and-restore approach I thought might work... it seems to have done what I wanted. Same VM, different ZVOLs:

Code:
NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool/data/ZVOLs/vm-102-disk-0   312G   200G   212G  -
rpool/data/ZVOLs/vm-102-disk-1   817M   200G   405M  -
rpool/data/ZVOLs/vm-102-disk-2   285M   200G   133M  -
rpool/data/ZVOLs/vm-110-disk-1  23.1G   200G  23.1G  -
rpool/data/ZVOLs/vm-110-disk-2   462M   200G   462M  -
rpool/data/ZVOLs/vm-110-disk-3   156M   200G   156M  -


It is quite surprising that the blocksize makes such a big difference. The volblocksize on the newly created ZVOLs is 8K; it was 512 before:
Code:
rpool/data/ZVOLs/vm-110-disk-1  volblocksize  8K  default
 

Ericloewe

Server Wrangler
Moderator
I would have expected massive inefficiency at 512 bytes with RAIDZ, but not quite as much when using mirrors. Sure, there's plenty of metadata, but not 10x amplification.
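
A hedged back-of-the-envelope calculation, assuming this pool was created with ashift=12 (a guess; it isn't shown in the thread), would account for most of it, though:
Code:
# volblocksize=512 on an ashift=12 pool: each 512 B volume block
# occupies a full 4 KiB minimum allocation
#   4096 / 512 = 8x amplification (per mirror side, before metadata)
# ~26.5 GB of unique blocks ever written inside the VM (NTFS never
# TRIMs them back) x 8 ~= 212 GB, matching the observed USEDDS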
 