[vbox-dev] Too Many Snapshots: XML_PARSE_HUGE and NS_ERROR_CALL_FAILED
klaus.espenlaub at oracle.com
Thu Jul 3 15:59:02 UTC 2014
On 01.07.2014 15:10, a. wrote:
> Thanks for your reply!
> We have created a testing environment which relies rather heavily on the
> Snapshot feature including, afaik, the nesting feature. As we've read
> elsewhere, the actual snapshot count is supposed to be unlimited which
> is why we did not expect this problem.
> For the time being we have decided to stop this feature when reaching a
> particular Snapshot count to clean up and merge existing snapshots and
> then carry on. Although that would mean losing particular states. In any
> case we already do have one VM that has exceeded that limit and are now
> facing the issue that we cannot clone or modify it, because we cannot
> register the vm. Do you see any way how we can register that VM so we
> can bring it into a working format?
That's a really tough one... you could try an old (pre-4.3.0)
generic Linux package, because those use a libxml2 version which
predates the change in the default nesting limit (the one we use allows
a depth of 1024, which really is plenty).
> Also personally I am confused how the snapshot count is supposed to be
> unlimited although too much nesting results in stack overflows or forced
Some of the snapshot processing (especially operations on the entire
tree) is done with recursion, and that means each nesting needs a
certain amount of stack space. Given that the stack is limited to 128K
there isn't much each nesting can use... this is the real reason why we
limited the nesting depth. As mentioned, I wasn't aware of libxml2
making silly default changes. I never ran into them.
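
To put the stack argument in numbers, here is a rough back-of-the-envelope sketch. The 128K figure is from the text above, the depths of 250, 500 and 1024 are the ones mentioned in this thread, and the helper name `bytesPerLevel` is made up for illustration. It is only an upper bound: frames already on the stack before the recursion starts, and multiple function calls per nesting level, reduce the real budget further.

```cpp
#include <cstddef>

// Upper bound on the stack bytes one nesting level may consume if a recursive
// walk over `depth` levels has to fit into a thread stack of `stackBytes`.
// (Hypothetical helper, not part of VirtualBox.)
constexpr std::size_t bytesPerLevel(std::size_t stackBytes, std::size_t depth)
{
    return stackBytes / depth;  // integer division, rounds down
}

constexpr std::size_t kThreadStack = 128 * 1024;  // 128K thread stack

// Snapshot nesting limit mentioned in this thread.
static_assert(bytesPerLevel(kThreadStack, 250) == 524, "about half a KB per level");
// Depth suggested as a possible future limit.
static_assert(bytesPerLevel(kThreadStack, 500) == 262, "roughly a quarter KB per level");
// Depth the XML parser itself would accept.
static_assert(bytesPerLevel(kThreadStack, 1024) == 128, "only 128 bytes per level");
```

At a few hundred bytes per level, a couple of local objects in each recursive call are enough to use up the budget.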
If someone wants to contribute a complete (and tested!) change which
increases the snapshot depth to, say, 500, we'd be listening. It's a
rather low priority item, because the limit is already very high, and
having hundreds of nested snapshots can decrease VM performance
(especially disk access). Hard to tell by how much.
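
For anyone looking into such a change: one generic way to make a tree walk independent of the thread stack size is to keep the pending nodes in a heap-allocated work list instead of recursing. The sketch below only illustrates that general technique. `Node`, `maxDepth` and `makeChain` are hypothetical names, not VirtualBox classes or functions, and whether this approach fits the actual snapshot code is not established here.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a snapshot node; not VirtualBox's real class.
struct Node {
    std::vector<std::unique_ptr<Node>> children;
};

// Returns the deepest nesting level in the tree (the root counts as 1).
// Pending nodes live in a std::vector on the heap, so the stack usage of
// this function does not grow with the depth of the tree.
std::size_t maxDepth(const Node& root)
{
    std::size_t deepest = 0;
    std::vector<std::pair<const Node*, std::size_t>> pending;
    pending.emplace_back(&root, 1);
    while (!pending.empty()) {
        const Node* node = pending.back().first;
        const std::size_t depth = pending.back().second;
        pending.pop_back();
        if (depth > deepest)
            deepest = depth;
        for (const auto& child : node->children)
            pending.emplace_back(child.get(), depth + 1);
    }
    return deepest;
}

// Builds a linear chain of `levels` nodes, the shape produced by taking one
// snapshot after another. Returns null when `levels` is 0.
std::unique_ptr<Node> makeChain(std::size_t levels)
{
    std::unique_ptr<Node> top;
    for (std::size_t i = 0; i < levels; ++i) {
        std::unique_ptr<Node> parent(new Node);
        if (top)
            parent->children.push_back(std::move(top));
        top = std::move(parent);
    }
    return top;
}
```

Note that the implicit destructor of `Node` in this sketch is still recursive, so a complete change would have to look at every operation on the tree, not just one traversal.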
> Maybe I've misunderstood something about the usages of Snapshots
> in VBox: Is there a possibility to reduce the Snapshot-Disk-Nesting
> without losing the ability to revert back to a previous Snapshot-State?
> Even if we had, say, 300 Snapshots, would it still be possible to
> revert back to Snapshot #58? That is basically the feature we tried to
> use here.
I can't truly tell from this description if you have any means of
controlling the depth. My guess is you can't easily, because you're
recording the past of a particular VM run, or probably a sequence of VM
runs (with some limited restoring). This makes the tree deep. Making it
wider only works by restoring a rather old snapshot and using this for
an "alternative" run.
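
The difference between a deep and a wide tree can be shown with a toy model. This is only a sketch under the assumption that each snapshot simply records its parent. `SnapshotTree` and its members are made-up names and do not reflect how VirtualBox stores snapshots.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: each snapshot only records the index of its parent
// (-1 means "no parent", i.e. the first snapshot).
class SnapshotTree {
public:
    // Takes a snapshot of the current state; it becomes a child of the
    // snapshot the current state derives from.
    int take()
    {
        parent_.push_back(current_);
        current_ = static_cast<int>(parent_.size()) - 1;
        return current_;
    }

    // Restores an earlier snapshot; snapshots taken later stay in the tree.
    void restore(int id) { current_ = id; }

    // Nesting level of one snapshot (the first snapshot has depth 1).
    int depthOf(int id) const
    {
        int depth = 0;
        for (; id != -1; id = parent_[static_cast<std::size_t>(id)])
            ++depth;
        return depth;
    }

    // Deepest nesting level anywhere in the tree.
    int maxDepth() const
    {
        int deepest = 0;
        for (std::size_t i = 0; i < parent_.size(); ++i) {
            const int d = depthOf(static_cast<int>(i));
            if (d > deepest)
                deepest = d;
        }
        return deepest;
    }

    std::size_t count() const { return parent_.size(); }

private:
    std::vector<int> parent_;
    int current_ = -1;  // no snapshot taken yet
};
```

In this model, ten calls to `take()` in a row give a depth of 10. Calling `restore(2)` and then taking five more snapshots raises the count to 15, but the new branch only reaches depth 8 and the overall maximum stays at 10: the tree got wider, not deeper.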
> On 07/01/2014 02:23 PM, Klaus Espenlaub wrote:
>> not sure if this is something worth fixing, as a huge number of
>> snapshots causes trouble elsewhere (stack overflows). That's the reason
>> why the snapshot nesting depth has been limited to 250 with VirtualBox
>> On 30.06.2014 18:28, a. wrote:
>>> So we are trying to work around a problem with VirtualBox. We are
>>> heavily using the snapshot feature and hit a wall, as we have probably
>>> exceeded the maximum possible number of active snapshots. As each
>>> Snapshot is written into the VBOX-file as a descendant of the previous
>>> Snapshot, at one point the nesting exceeds 256 entries, which is an
>>> arbitrary limit set by libxml2. When trying to register a VM which
>>> has too many Snapshots, vboxmanage registervm fails with an error.
>> I've checked the code of libxml2 2.6.31, and the only occurrence of
>> XML_PARSE_HUGE is in some documentation. This means your patch will
>> break our builds by referring to an undefined symbol.
>> Actually I never saw trouble parsing big XML files with the libxml2
>> version we're using in all our builds, 2.6.31. It sets the variable
>> xmlParserMaxDepth to 1024, which is plenty.
>>> We tried to work around this issue simply by adding the corresponding
>>> XML_PARSE_HUGE flag in the function that calls to the xml-library. We
>>> also created a patch:
>>> --- virtualbox-4.1.12-dfsg.orig/src/VBox/Runtime/r3/xml.cpp
>>> +++ virtualbox-4.1.12-dfsg/src/VBox/Runtime/r3/xml.cpp
>>> @@ -1500,7 +1500,7 @@ void XmlMemParser::read(const void* pvBu
>>> NULL, // encoding = auto
>>> - XML_PARSE_NOBLANKS | XML_PARSE_NONET)))
>>> + XML_PARSE_NOBLANKS | XML_PARSE_NONET | XML_PARSE_HUGE)))
>> XML_PARSE_HUGE was apparently introduced with version 2.7.3, and thus
>> this needs proper version checking to make it an acceptable patch.
>>> throw XmlError(xmlCtxtGetLastError(m_ctxt));
>>> @@ -1630,7 +1630,7 @@ void XmlFileParser::read(const RTCString
>>> NULL, // encoding = auto
>>> - XML_PARSE_NOBLANKS | XML_PARSE_NONET)))
>>> + XML_PARSE_NOBLANKS | XML_PARSE_NONET | XML_PARSE_HUGE)))
>>> throw XmlError(xmlCtxtGetLastError(m_ctxt));
>>> We successfully rebuilt the source with that flag using
>> Would only compile with a new enough libxml2...
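
The version check asked for above could look roughly like the following sketch. In the real `xml.cpp` the `LIBXML_VERSION` macro and the `XML_PARSE_*` flags come from the libxml2 headers. Here they are replaced by stand-ins (`kNoBlanks`, `kNoNet`, `kHuge`, and a fallback `LIBXML_VERSION`) so that the snippet compiles on its own. The stand-in bit values and the helper `parserOptions` are assumptions for illustration only. The threshold 20703 encodes version 2.7.3, the release named above.

```cpp
// Stand-in for the macro normally provided by <libxml/xmlversion.h>.
// 20631 encodes 2.6.31 (major * 10000 + minor * 100 + micro).
#ifndef LIBXML_VERSION
# define LIBXML_VERSION 20631
#endif

// Stand-ins for the libxml2 parser option flags; the real code would use
// XML_PARSE_NOBLANKS, XML_PARSE_NONET and XML_PARSE_HUGE directly.
enum : int {
    kNoBlanks = 1 << 8,
    kNoNet    = 1 << 11,
    kHuge     = 1 << 19   // only exists in libxml2 2.7.3 and later
};

// Selects the parser options at compile time, so that builds against an old
// libxml2 never reference the flag that is undefined there.
constexpr int parserOptions()
{
#if LIBXML_VERSION >= 20703
    return kNoBlanks | kNoNet | kHuge;
#else
    return kNoBlanks | kNoNet;
#endif
}
```

The check has to happen in the preprocessor, because the flag is an enum value rather than a macro and so cannot be tested with `#ifdef`.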
>>> That successfully resolved the previous error;
>>> however, we now get the following error from VBoxManage when trying to
>>> register the VM in question:
>>> vboxmanage registervm /mnt/storage/vm-vbox
>>> VBoxManage: error: Code NS_ERROR_CALL_FAILED (0x800706BE) - Call to
>>> remote object failed (extended info not available)
>>> Context: "OpenMachine(Bstr(a->argv).raw(), machine.asOutParam())" at
>>> line 90 of file VBoxManageMisc.cpp
>>> Unfortunately, we have no idea how to work with that error.
>> That's the mentioned stack overflow elsewhere in the API. When XPCOM is
>> used (i.e. on all platforms besides Windows), the stack of each thread
>> is limited to 128K, to maximize the scalability. That's not a lot...
>>> This is the VBoxSVC.log:
>>> VirtualBox (XP)COM Server 4.1.12_Ubuntu r77245 linux.amd64 (Jun 30 2014
>>> 16:40:09) release log
>>> 00:00:00.200 nspr-2 Loading settings file "/mnt/storage/vm.vbox" with
>>> version "1.12-linux"
>> Rather clear hint that the processing of this settings file fails after
>> the XML parsing step.
>>> Any help would be immensely appreciated.
>> Do you truly have a use case which can't exist without such a deep
>> snapshot nesting (the total snapshot count is still unlimited)? It costs
>> some efficiency to have data spread across so many disk images.