[vbox-dev] about compacting a virtual disk

Klaus Espenlaub klaus.espenlaub at oracle.com
Fri Jul 30 18:50:25 GMT 2010


On 30.07.2010 19:54, Huihong Luo wrote:
> Can someone explain a bit on how to compact a virtual disk?
> We may have some resource to finish those un-implemented compact
> functions for VHD/VMDK/HDD format.
> What I can think of is something as follows:
> (1) open the disk file, and create a new disk file in the same format,
> or different format

Technically speaking, that's not called compaction in VirtualBox; it
corresponds more to cloning (ignoring what happens with the image UUIDs).
The core of this is implemented in VDCopy() in VBoxHDD.cpp, and is
applicable to pretty much all image formats.

Compaction in VirtualBox works "in place": it goes over the file,
identifies unused blocks, moves still-used blocks into the gaps, and
truncates the file to the correct size. So this is 100% image format specific,
and the generic code doesn't really participate. The only existing 
compaction implementation is in vdiCompact() in VDIHDDCore.cpp.
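
To make the "in place" idea concrete, here is a minimal sketch of the
general technique. It is not the actual vdiCompact() code: the block map
layout, the list standing in for the file, and the name compact_in_place
are all assumptions made for illustration.

```python
def compact_in_place(block_map, storage):
    """Pack used blocks to the front of `storage` and truncate it.

    block_map[i] is the physical slot holding logical block i, or None
    if that block is unallocated (hypothetical layout, not the VDI one).
    storage is a list of physical slots standing in for the image file.
    Assumes no two logical blocks share a slot.
    Returns the number of slots freed.
    """
    # Visit used slots in ascending physical order, so the destination
    # slot is always free or already vacated by an earlier move.
    used = sorted(
        (slot, logical)
        for logical, slot in enumerate(block_map)
        if slot is not None
    )
    next_free = 0
    for slot, logical in used:
        if slot != next_free:
            storage[next_free] = storage[slot]  # move block into the gap
            block_map[logical] = next_free      # update the block pointer
        next_free += 1
    freed = len(storage) - next_free
    del storage[next_free:]                     # "truncate" the file
    return freed
```

A real backend would also have to worry about crash safety while blocks
and pointers are being rewritten, which this sketch ignores.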

A bit nitpicking, but let's be clear about what's what.

> (2) for each partition, mark free and used blocks. yes, we have code
> that is file system aware, it can recognize all popular ones, such as
> Ext/Reiserfs/NTFS/FAT, etc.

That information could be used in either place. It's a bit of work to 
distill the "chunk used" information out of this, as the format backends 
have different granularity. Typical values are: VDI 1M, VMDK 64k, VHD 2M.
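
As a rough illustration of that distilling step, the sketch below turns
a list of used byte ranges into a per-chunk "used" map for a given
granularity. The chunk sizes are the typical values quoted above; the
function name and the input format (absolute disk offsets, already
shifted by the partition start) are assumptions.

```python
# Typical chunk sizes from the text above, in bytes.
GRANULARITY = {"VDI": 1 << 20, "VMDK": 64 << 10, "VHD": 2 << 20}


def chunks_in_use(used_ranges, disk_size, chunk_size):
    """Return a list of booleans, one per chunk of the virtual disk.

    used_ranges is an iterable of (start, length) byte ranges that the
    file system code reports as in use (hypothetical input format).
    A chunk counts as used if any used byte falls inside it.
    """
    n_chunks = -(-disk_size // chunk_size)  # ceiling division
    used = [False] * n_chunks
    for start, length in used_ranges:
        if length <= 0:
            continue
        first = start // chunk_size
        last = (start + length - 1) // chunk_size
        for i in range(first, min(last, n_chunks - 1) + 1):
            used[i] = True
    return used
```

Note how the same file system information frees less space with coarser
chunks: one used sector is enough to keep a whole 2M VHD chunk alive.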

> (3) copy all used blocks from old disk file to new disk file.
> Would this be a good way of compacting a disk? If there are other simple
> ways, please let me know. I noted that VDI already implemented compact,
> how does that work internally?

See above for a quick summary. VDI compaction uses "entire block is 0"
as the criterion. It takes advantage of the fact that VDI has a special
"block is 0" marker, which VMDK, for example, doesn't have.

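The "entire block is 0" test can be sketched as below. This only
illustrates the criterion, it is not the VDI backend code; the marker
values and the block map layout are made up for the example.

```python
FREE_MARKER = -1  # hypothetical "block not allocated" marker
ZERO_MARKER = -2  # hypothetical "block is 0" marker


def mark_zero_blocks(block_map, storage):
    """Replace pointers to all-zero blocks with ZERO_MARKER.

    block_map[i] is a physical slot index (>= 0) or one of the markers.
    storage[slot] is the bytes content of that physical slot.
    The slots themselves are left in place; a later packing pass would
    reclaim them.
    """
    for logical, slot in enumerate(block_map):
        # any() over bytes is true as soon as one byte is non-zero.
        if slot >= 0 and not any(storage[slot]):
            block_map[logical] = ZERO_MARKER
    return block_map
```

A format without such a marker cannot express "reads as zero" without
either storing the zeroes or leaving the block unallocated.
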
So the compaction strategy would actually need to be somewhat format
specific if one wants to squeeze the maximum out of diff images too.
For VMDK files, making "old data" appear in the unused areas is the only
way to save space: storing zeroes where the data in the parent image
isn't zeroes takes space.
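
A toy model of that diff image case, assuming an unallocated block in
the child simply reads through to the parent. The names and the
list-of-blocks representation are invented for the example and are not
the VMDK backend's structures.

```python
def read_block(child, parent, i):
    """Read block i; unallocated child blocks fall through to the parent."""
    return child[i] if child[i] is not None else parent[i]


def drop_unused_from_diff(child, used):
    """Deallocate child blocks that the guest file system no longer uses.

    child[i] is the block's bytes, or None if unallocated.
    used[i] says whether chunk i still holds live file system data.
    Returns the number of blocks dropped. Afterwards the dropped areas
    show the parent's "old data" rather than zeroes.
    """
    dropped = 0
    for i, in_use in enumerate(used):
        if not in_use and child[i] is not None:
            child[i] = None
            dropped += 1
    return dropped
```

This is only safe because the file system considers those areas unused,
so it does not matter what a read from them returns.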

> Note it can be also used for converting between different formats.

Format conversion is the hobby of VBoxManage clonehd already.

All in all, this sounds very interesting. Good ingredients and the right
stirring could result in a couple of really great features.

Klaus

> Thanks,



