[vbox-dev] about compacting a virtual disk
klaus.espenlaub at oracle.com
Fri Jul 30 11:50:25 PDT 2010
On 30.07.2010 19:54, Huihong Luo wrote:
> Can someone explain a bit on how to compact a virtual disk?
> We may have some resource to finish those un-implemented compact
> functions for VHD/VMDK/HDD format.
> What I can think of is something as follows:
> (1) open the disk file, and create a new disk file in the same format,
> or different format
Technically speaking, that's not called compaction in VirtualBox; it
corresponds more to cloning (ignoring what happens with the image UUIDs).
The core of this is implemented in VDCopy() in VBoxHDD.cpp, and is
applicable to pretty much all image formats.
Compaction in VirtualBox works "in place": it goes over the file,
identifies unused blocks, moves still-used blocks into the gaps, and
truncates the file to the correct size. So this is 100% image format specific,
and the generic code doesn't really participate. The only existing
compaction implementation is in vdiCompact() in VDIHDDCore.cpp.
A bit nitpicking, but let's be clear about what's what.
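To make the in-place idea concrete, here is a minimal Python sketch on a
toy model. It is not the vdiCompact() code: the list-based block map and
storage, and the name compact_in_place, are assumptions for illustration
only. A real backend works on file offsets and its own on-disk block table.

```python
def compact_in_place(block_map, storage):
    """Toy in-place compaction.

    block_map: list mapping logical block -> physical slot index, or None
               if the logical block is not allocated (hypothetical model).
    storage:   list of physical slots (bytes objects), modified in place.
    Returns the number of physical slots left after truncation.
    """
    # Which physical slots are still referenced, and by which logical block.
    used = {phys: logical
            for logical, phys in enumerate(block_map)
            if phys is not None}
    n_used = len(used)

    # Gaps: unreferenced slots inside the region that will survive truncation.
    gaps = [s for s in range(min(n_used, len(storage))) if s not in used]
    # Tail: referenced slots beyond that region; these have to move.
    tail = sorted((s for s in used if s >= n_used), reverse=True)

    # There are exactly as many gaps as tail blocks, so zip pairs them all.
    for gap, src in zip(gaps, tail):
        storage[gap] = storage[src]
        block_map[used[src]] = gap

    # Truncate to the correct size.
    del storage[n_used:]
    return len(storage)
```

With block_map = [0, None, 4, 2] and five physical slots, the block in
slot 4 moves into the gap at slot 1 and the storage shrinks to three slots.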
> (2) for each partition, mark free and used blocks. yes, we have code
> that is file system aware, it can recoginize all popular ones, such as
> Ext/Reiserfs/NTFS/FAT, etc.
That information could be used in either place. It's a bit of work to
distill the "chunk used" information out of this, as the format backends
have different granularity. Typical values are: VDI 1M, VMDK 64k, VHD 2M.
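A rough Python sketch of that distillation step, with hypothetical names:
it assumes the file system aware code yields used byte ranges as
(offset, length) pairs relative to the start of the virtual disk. The
granularities are only the typical values quoted above, so an actual image
should be asked for its own block size.

```python
# Typical backend granularities in bytes (values as quoted in this thread).
GRANULARITY = {"VDI": 1 << 20, "VMDK": 64 << 10, "VHD": 2 << 20}


def used_chunks(used_ranges, chunk_size):
    """Return the set of chunk indices touched by any used byte range.

    used_ranges: iterable of (offset, length) pairs in bytes (assumed input
                 format of the file system scanner).
    chunk_size:  backend granularity in bytes.
    """
    chunks = set()
    for offset, length in used_ranges:
        if length <= 0:
            continue  # ignore empty ranges
        first = offset // chunk_size
        last = (offset + length - 1) // chunk_size
        chunks.update(range(first, last + 1))
    return chunks


def free_chunks(used_ranges, chunk_size, disk_size):
    """Chunks that contain no used byte at all and are candidates to drop."""
    total = (disk_size + chunk_size - 1) // chunk_size
    return set(range(total)) - used_chunks(used_ranges, chunk_size)
```

A chunk counts as used as soon as a single byte of it is used, so the same
file system layout frees fewer chunks with VHD's 2M blocks than with
VMDK's 64k grains.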
> (3) copy all used blocks from old disk file to new disk file.
> Would this be a good way of compacting a disk? If there are other simple
> ways, please let me know. I noted that VDI already implemented compact,
> how does that work internally?
See above for a quick summary. VDI compaction uses "entire block is 0"
as the criterion. It takes advantage of the fact that VDI has a special
"block is 0" marker, which VMDK, for example, doesn't have.
So the compaction strategy actually would need to be somewhat format
specific if one looks at squeezing out the maximum from diff images too.
For VMDK files, making "old data" appear in the unused areas is the only
way to save space. Storing zeroes where the parent image's data isn't
zeroes takes space.
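Both points can be sketched in Python on a toy model. Everything here is
an assumption for illustration: BLOCK_ZERO stands in for a format's
"block is 0" marker, images are plain lists of equally sized blocks, and
neither function reflects actual VirtualBox code.

```python
BLOCK_ZERO = "zero"  # hypothetical stand-in for a "block is 0" marker


def mark_zero_blocks(block_map, storage):
    """Apply the "entire block is 0" criterion to a toy block map.

    block_map maps logical block -> physical slot index, None (not
    allocated) or BLOCK_ZERO. Returns how many block references were
    replaced by the marker; their data can then be compacted away.
    """
    freed = 0
    for logical, phys in enumerate(block_map):
        if isinstance(phys, int) and not any(storage[phys]):
            block_map[logical] = BLOCK_ZERO
            freed += 1
    return freed


def diff_blocks_to_store(parent, child, unused, has_zero_marker):
    """Decide what a diff image has to store for each block.

    parent, child:   lists of blocks; child is the content the guest sees.
    unused:          set of block indices the file system no longer uses.
    has_zero_marker: whether the format can flag a block as zero for free.
    Returns {index: "data" | "zero-marker"}. Blocks missing from the result
    fall through to the parent, so its "old data" shows up there.
    """
    plan = {}
    for i, (p, c) in enumerate(zip(parent, child)):
        if i in unused or c == p:
            continue  # letting the parent's data show through costs nothing
        if has_zero_marker and not any(c):
            plan[i] = "zero-marker"
        else:
            plan[i] = "data"  # zeroes over non-zero parent data need space
    return plan
```

In this model a zeroed block over non-zero parent data costs a full block
without a zero marker, while an unused block costs nothing in either case
as long as the parent's old content is allowed to reappear.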
> Note it can be also used for converting between different formats.
Format conversion is already the hobby of VBoxManage clonehd.
All in all, this sounds very interesting. Good ingredients and the right
stirring could result in a couple of really great features.