VirtualBox

Opened 13 years ago

Closed 11 years ago

#8582 closed defect (fixed)

multiple headless VMs: VBox daemons stop without waiting for the last VM to shut down

Reported by: Régis Desgroppes
Owned by:
Component: other
Version: VirtualBox 4.0.4
Keywords:
Cc:
Guest type: Windows
Host type: Linux

Description (last modified by Frank Mehnert)

Hi,
For continuous integration purposes, I'm running multiple concurrent headless VMs using the vboxapi Python bindings. VMs are started on demand and shut down as soon as possible to release resources.
What I observe:

  • Everything goes fine when the VM that was started first is also the last one to shut down.
  • But if the VM that was started first shuts down while other VMs are still running, all remaining sessions abort (vboxapi raises an xpcom.Exception: 0x80004004, Operation aborted, NS_ERROR_ABORT) without any notification to the running VMs. Nothing shows up in any of the VBox.log files either (see the attached one).


My guess is that the VM started first "owns" the daemons. As a workaround, I currently launch one of the VBox daemons manually (/usr/lib/virtualbox/VBoxSVC --daemonize) in the hope that the daemons are then not owned by any particular VM. This has been working fine for a few days.
Is there a better/recommended way to handle this?
Did I guess correctly and, if so, would it be possible to replace the ownership with reference counting?
Many thanks,
Régis
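
For reference, the launch pattern described above can be sketched roughly as follows. This is a minimal sketch and not the reporter's actual script: the function name `launch_headless` and the VM names in the usage comment are hypothetical, and the calls (`findMachine`, `mgr.mgr.getSessionObject`, `launchVMProcess` with a string environment argument) follow the VirtualBox 4.x SDK example; later API versions change some of these signatures.

```python
def launch_headless(mgr, vbox, vm_name, timeout_ms=-1):
    """Start one VM without a GUI and return its session object.

    Assumptions: `mgr` is a vboxapi.VirtualBoxManager and `vbox` is its
    `.vbox` attribute (IVirtualBox). Both are passed in so that this
    function never has to import vboxapi itself.
    """
    machine = vbox.findMachine(vm_name)
    session = mgr.mgr.getSessionObject(vbox)
    # "headless" selects the VBoxHeadless frontend; in the 4.x API the
    # third argument is an environment string (empty here).
    progress = machine.launchVMProcess(session, "headless", "")
    progress.waitForCompletion(timeout_ms)  # -1 waits indefinitely
    if progress.resultCode != 0:
        raise RuntimeError(
            "launching %r failed with result code 0x%x"
            % (vm_name, progress.resultCode & 0xFFFFFFFF)
        )
    return session

# Usage on a host with VirtualBox and vboxapi installed (VM names are
# placeholders):
#   from vboxapi import VirtualBoxManager
#   mgr = VirtualBoxManager(None, None)
#   sessions = [launch_headless(mgr, mgr.vbox, n) for n in ("vm-a", "vm-b")]
```

Each VM gets its own session object, so the sessions are independent on the client side; whether that matters for the behavior reported here is not established by this ticket.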

Attachments (3)

VBox.log.3 (46.5 KB ) - added by Régis Desgroppes 13 years ago.
prematurely ended VM log
VBox-A.log (90.9 KB ) - added by Régis Desgroppes 13 years ago.
logs of machine A (normal termination)
VBox-B.log (47.0 KB ) - added by Régis Desgroppes 13 years ago.
logs of machine B (abnormal termination)


Change History (11)

by Régis Desgroppes, 13 years ago

Attachment: VBox.log.3 added

prematurely ended VM log

comment:1 by Frank Mehnert, 13 years ago

Your guess is not correct. If the VBoxSVC daemon terminates when the first VM shuts down, then it most probably crashes, and this is a bug. Could you check whether you can still reproduce this problem with VBox 4.0.6?

comment:2 by Klaus Espenlaub, 13 years ago

Terminating VBoxSVC has always been controlled by reference counting, ever since VirtualBox 1.0. What you get must actually be a really evil kind of crash: starting with VirtualBox 4.0 there is code in VBoxHeadless which detects VBoxSVC crashes and terminates the VM cleanly, and if this happens there will be a line in the log indicating that this kind of emergency termination was triggered.

No idea what's going wrong in your case, however it's certainly not expected behavior. When I tried what you described with the current state of the 4.0 branch it simply worked. I also double checked the correct behavior of the emergency termination by sending VBoxSVC SIGSEGV, and it worked flawlessly.

VBoxSVC should always be daemonized, i.e. be a subprocess of pid 1. I assume you start the VMs through IMachine::launchVMProcess, and in this case the VMs will be subprocesses of VBoxSVC.

From the symptoms it could also be a VBoxXPCOMIPCD crash, in the sense that a console file descriptor is "leaked" from the client starting it (same could happen for VBoxSVC actually), i.e. the process inherits a file descriptor which causes process termination if the other end closes the socket/pipe/...

Many guesses, little information.
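
To illustrate the descriptor-inheritance concern raised above, here is a minimal Python 3 sketch of starting a background process so that it shares no stdin/stdout/stderr and no other open descriptors with the client that starts it. The helper name `spawn_detached` is hypothetical and only standard `subprocess` options are used. Whether launching the daemons this way avoids the abort reported in this ticket is an open assumption, not something the thread confirms.

```python
import subprocess
import sys

def spawn_detached(argv):
    """Start argv as a background process that shares no stdio with us."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # do not inherit our terminal or pipes
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        close_fds=True,             # do not pass on other open descriptors
        start_new_session=True,     # new session, detached from our terminal
    )

# Harmless stand-in command; on the reporter's host argv would be
# ["/usr/lib/virtualbox/VBoxSVC", "--daemonize"] as quoted in the description.
demo = spawn_detached([sys.executable, "-c", "pass"])
demo.wait()
```

Because the child is given `/dev/null` for all three standard streams, closing the client's own console or pipe cannot by itself terminate the child through an inherited descriptor.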

by Régis Desgroppes, 13 years ago

Attachment: VBox-A.log added

logs of machine A (normal termination)

by Régis Desgroppes, 13 years ago

Attachment: VBox-B.log added

logs of machine B (abnormal termination)

comment:3 by Régis Desgroppes, 13 years ago

Reproducible with 4.0.10. Scenario with 2 VMs - let's call them A and B:

  1. [2011-07-06 09:20:36,492]
    A starts (triggered by a call to Python vboxapi: machine.launchVMProcess)
  2. [2011-07-06 09:24:28,001]
    B starts (idem)
  3. [2011-07-06 09:36:01,469]
    A stops (triggered by a call to Python vboxapi: session.console.powerButton)
  4. [2011-07-06 09:36:10,347]
    B unexpectedly aborts. Current call to Python vboxapi throws an xpcom.Exception: 0x80004004 (Operation aborted (NS_ERROR_ABORT))
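
The gaps between the four events can be worked out from the timestamps above. The short sketch below only does that arithmetic (the names `EVENTS` and `gaps_in_seconds` are made up for the illustration); it shows that B aborts roughly nine seconds after A is asked to stop.

```python
from datetime import datetime

# Timestamps copied from the scenario above (comma before the milliseconds).
FMT = "%Y-%m-%d %H:%M:%S,%f"
EVENTS = [
    ("A starts", "2011-07-06 09:20:36,492"),
    ("B starts", "2011-07-06 09:24:28,001"),
    ("A stops", "2011-07-06 09:36:01,469"),
    ("B aborts", "2011-07-06 09:36:10,347"),
]

def gaps_in_seconds(events):
    """Return (previous label, label, seconds elapsed) for consecutive events."""
    parsed = [(label, datetime.strptime(stamp, FMT)) for label, stamp in events]
    return [
        (prev_label, label, (when - prev_when).total_seconds())
        for (prev_label, prev_when), (label, when) in zip(parsed, parsed[1:])
    ]

for prev_label, label, seconds in gaps_in_seconds(EVENTS):
    print("%s -> %s: %.3f s" % (prev_label, label, seconds))
```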

comment:4 by Klaus Espenlaub, 13 years ago

Effectively you've proven that something (either the VM process or VBoxSVC) crashes. This is the cause of the error code you get.

comment:5 by Régis Desgroppes, 13 years ago

I don't know what I have proven. I'm just sure it occurs in the given scenario: A starts, B starts, A stops, B aborts. I assert that it's the shutdown of A that leads B to abort. What I don't know is whether it's a direct effect or a side effect. It is 100% reproducible in our environment: the host is a 64-bit Ubuntu Maverick with VirtualBox 4.0.10 and 24 CPUs in total. The guests are 32-bit Windows XP with 8 CPUs assigned.

comment:6 by Régis Desgroppes, 12 years ago

Still active with 4.1.6.

comment:7 by Régis Desgroppes, 12 years ago (in reply to comment:2)

Replying to klaus:

"Many guesses, little information."

What else do you need?

comment:8 by Frank Mehnert, 11 years ago

Description: modified
Resolution: fixed
Status: new → closed

Please reopen if still relevant with VBox 4.2.12.
