[vbox-dev] stream optimized vmdk file layout??

Klaus Espenlaub klaus.espenlaub at oracle.com
Tue Oct 28 19:09:18 UTC 2014


On 28.10.2014 17:37, Stuart Maclean wrote:
> I know this is more one for the VMWare folks, but since (a) Virtual Box
> supports importing .ovf, which include a vmdk file in 'stream optimized
> format' and (b) I am already subscribed to this list, I thought it worth
> a try...
> VMWare's specs on VMDK disk layout (e.g.
> https://www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf)
> explain that for the stream-optimized  variant of vmdk disks, the first
> 'compressed grain' follows the descriptor.

In our experience the VMDK spec does not describe reality accurately in 
all cases. It is still very useful: it helps in understanding the basics 
of the file format design and lets you skip most of the reverse 
engineering effort. It does not eliminate the need to sanity check its 
claims against files produced by VMware's products.

> However, in .vmdk files produced by the 'packer' tool (http://packer.io)
> it appears that the first grain's data does not follow the descriptor,
> but rather sits in the sector of the file after the number of advertised
> 'overhead' sectors (a field in the sparse extent header).
> Can anyone shed any light on this?

I see no conflicting information, i.e. I think the spec implies what 
you observed. The image on page 11 just outlines the overall 
structure. I wouldn't interpret it as overriding the general rules for 
sparse images (including padding and added alignment sectors), because 
the streamOptimized images are actually a specialized variant of the 
widespread sparse images, with a couple more rules to ensure that the 
image can be processed straight from a (non-seekable) stream if desired 
(it can also be accessed in a non-sequential way, using the grain 
tables etc.). In particular, I found no statement that forbids 
creating a descriptor which has lots of sectors containing zeroes at the 
end. Sure, it defeats the purpose of a compact image representation, but 
it's not a violation of the spec.
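
To check this on an actual file, something along the following lines 
could compare where the descriptor ends with where the advertised 
overhead ends. This is only a sketch: the field order is my reading of 
the SparseExtentHeader structure in the technote (packed, little-endian, 
512-byte sectors) and should be verified against the spec and real 
images. The Python names (parse_sparse_header, layout_offsets) are made 
up for illustration.

```python
import struct

SECTOR_SIZE = 512

# Assumed on-disk layout of SparseExtentHeader up to compressAlgorithm
# (79 bytes); the remaining 433 bytes of the 512-byte sector are padding.
HEADER_FORMAT = "<IIIQQQQIQQQB4sH"
MAGIC = 0x564D444B  # stored little-endian, so the file starts with b"KDMV"

FIELD_NAMES = (
    "magicNumber", "version", "flags", "capacity", "grainSize",
    "descriptorOffset", "descriptorSize", "numGTEsPerGT", "rgdOffset",
    "gdOffset", "overHead", "uncleanShutdown", "eolChars",
    "compressAlgorithm",
)


def parse_sparse_header(sector0: bytes) -> dict:
    """Decode the first sector of a sparse extent into a dict of fields."""
    if len(sector0) < SECTOR_SIZE:
        raise ValueError("need at least one full sector")
    values = struct.unpack_from(HEADER_FORMAT, sector0, 0)
    header = dict(zip(FIELD_NAMES, values))
    if header["magicNumber"] != MAGIC:
        raise ValueError("not a sparse VMDK extent (bad magic)")
    return header


def layout_offsets(header: dict) -> tuple:
    """Return (descriptor_end, overhead_end, gap), all in bytes.

    descriptor_end: first byte after the embedded descriptor
    overhead_end:   first byte after the advertised overhead sectors
    gap:            bytes between the two (padding and/or metadata)
    """
    descriptor_end = (
        header["descriptorOffset"] + header["descriptorSize"]
    ) * SECTOR_SIZE
    overhead_end = header["overHead"] * SECTOR_SIZE
    return descriptor_end, overhead_end, overhead_end - descriptor_end
```

Feeding it the first 512 bytes of an image and printing the result of 
layout_offsets() shows how many bytes lie between the end of the 
descriptor and the end of the overhead. A non-zero gap would match what 
you describe for the packer images.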

You could have a look at what VirtualBox implements; it's open source. 
Unfortunately the code (just like the spec) doesn't make it easy to 
figure out exactly what gets written where, because it shares a lot of 
code between the various supported VMDK variants, has some special 
code paths for writing out streamOptimized VMDKs without ever seeking 
backwards, and the like...


> Thanks
> Stuart
