VirtualBox

Ticket #8582 (closed defect: fixed)

Opened 3 years ago

Last modified 12 months ago

multiple headless VMs: VBox daemons stop without waiting for the last VM to shut down

Reported by: rdesgroppes
Owned by:
Priority: major
Component: other
Version: VirtualBox 4.0.4
Keywords:
Cc:
Guest type: Windows
Host type: Linux

Description (last modified by frank) (diff)

Hi,
For continuous integration purposes, I'm running multiple concurrent headless VMs using the vboxapi Python bindings. VMs are started on demand and shut down as soon as possible to release resources.
What I observe:

  • everything goes fine when the first VM started is also the last to shut down.
  • but if the first VM started shuts down while other VMs are still running, all remaining sessions abort (vboxapi raises an xpcom.Exception: 0x80004004, Operation aborted, NS_ERROR_ABORT) without any sort of notification to the running VMs. Nothing shows up in any of the VBox.log files either (see the attached one).


My guess is that the first VM started "owns" the daemons, so as a workaround I currently launch one of the VBox daemons manually (/usr/lib/virtualbox/VBoxSVC --daemonize) in the hope that the daemons are then not owned by any particular VM. It is still working fine after a few days.
Is there a better/recommended way to handle this?
Is my guess right and, if so, would it be possible to replace the ownership with reference counting?
Many thanks,
Régis

Attachments

VBox.log.3 Download (46.5 KB) - added by rdesgroppes 3 years ago.
prematurely ended VM log
VBox-A.log Download (90.9 KB) - added by rdesgroppes 3 years ago.
logs of machine A (normal termination)
VBox-B.log Download (47.0 KB) - added by rdesgroppes 3 years ago.
logs of machine B (abnormal termination)

Change History

Changed 3 years ago by rdesgroppes

prematurely ended VM log

comment:1 Changed 3 years ago by frank

Your guess is not correct. If the VBoxSVC daemon terminates when the first VM shuts down, then it most probably crashes, and this is a bug. Could you check whether you can still reproduce this problem with VBox 4.0.6?

comment:2 follow-up: ↓ 7 Changed 3 years ago by klaus

Termination of VBoxSVC has always been controlled by reference counting, since VirtualBox 1.0. What you are seeing must be a really evil kind of crash: starting with VirtualBox 4.0, VBoxHeadless contains code which detects VBoxSVC crashes and terminates the VM cleanly, and when that happens there is a line in the log indicating that this kind of emergency termination was triggered.
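To picture the distinction between "owned by the first client" and reference counting, here is a toy model in Python. It is only an illustration of the general technique, not VirtualBox code; the class `RefCountedService` and its methods are hypothetical names invented for this sketch.

```python
class RefCountedService:
    """Toy model of a service that stops only when its last client detaches."""

    def __init__(self):
        self._clients = set()
        self.running = False

    def attach(self, client_id):
        # The first client starts the service; later clients only add a reference.
        if not self._clients:
            self.running = True
        self._clients.add(client_id)

    def detach(self, client_id):
        # Dropping a reference stops the service only if no clients remain.
        self._clients.discard(client_id)
        if not self._clients:
            self.running = False


svc = RefCountedService()
svc.attach("A")
svc.attach("B")
svc.detach("A")      # the first client to attach leaves first
print(svc.running)   # still running, because B holds a reference
svc.detach("B")
print(svc.running)   # stopped, because no client is left
```

With this scheme the order in which clients leave does not matter, which is why the behavior in the report points to a crash rather than to an ownership issue.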

No idea what's going wrong in your case, however it's certainly not expected behavior. When I tried what you described with the current state of the 4.0 branch it simply worked. I also double checked the correct behavior of the emergency termination by sending VBoxSVC SIGSEGV, and it worked flawlessly.

VBoxSVC should always be daemonized, i.e. be a subprocess of pid 1. I assume you start the VMs through IMachine::launchVMProcess, and in this case the VMs will be subprocesses of VBoxSVC.

From the symptoms it could also be a VBoxXPCOMIPCD crash, in the sense that a console file descriptor is "leaked" from the client starting it (same could happen for VBoxSVC actually), i.e. the process inherits a file descriptor which causes process termination if the other end closes the socket/pipe/...
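One way to rule out the inherited-descriptor theory from the client side is to start the daemon with its standard streams pointed away from the launching process. The following is a minimal sketch using Python's standard `subprocess` module; `launch_detached` is a hypothetical helper name, and whether this changes anything for VBoxSVC or VBoxXPCOMIPCD is an assumption that would need to be tested.

```python
import subprocess
import sys


def launch_detached(argv):
    """Start argv without sharing the caller's terminal, pipes, or session."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # no inherited console input
        stdout=subprocess.DEVNULL,  # no pipe that could break when the caller exits
        stderr=subprocess.DEVNULL,
        close_fds=True,             # do not pass other open descriptors to the child
        start_new_session=True,     # POSIX: detach from the caller's controlling terminal
    )


if __name__ == "__main__":
    # Stand-in command for illustration; the report used
    # /usr/lib/virtualbox/VBoxSVC --daemonize instead.
    child = launch_detached([sys.executable, "-c", "pass"])
    child.wait()
```

If the abort no longer occurs when the daemon is started this way, that would support the leaked-descriptor explanation; if it still occurs, the cause lies elsewhere.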

Many guesses, little information.

Changed 3 years ago by rdesgroppes

logs of machine A (normal termination)

Changed 3 years ago by rdesgroppes

logs of machine B (abnormal termination)

comment:3 Changed 3 years ago by rdesgroppes

Reproducible with 4.0.10. Scenario with 2 VMs - let's call them A and B:

  1. [2011-07-06 09:20:36,492]
    A starts (triggered by a call to Python vboxapi: machine.launchVMProcess)
  2. [2011-07-06 09:24:28,001]
    B starts (idem)
  3. [2011-07-06 09:36:01,469]
    A stops (triggered by a call to Python vboxapi: session.console.powerButton)
  4. [2011-07-06 09:36:10,347]
    B unexpectedly aborts. Current call to Python vboxapi throws an xpcom.Exception: 0x80004004 (Operation aborted (NS_ERROR_ABORT))
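The steps above can be sketched as a small script. This is an approximation of the scenario, not the reporter's actual code: `start_then_stop_first` is a hypothetical helper, the VirtualBox and session objects are passed in by the caller (with vboxapi they would typically come from a `VirtualBoxManager`), and the use of `findMachine` and `waitForCompletion` is an assumption about how the VMs were looked up and waited for.

```python
def start_then_stop_first(vbox, new_session, name_a, name_b):
    """Start A, start B, then press A's power button; return both sessions.

    vbox        -- an IVirtualBox-like object (assumed to offer findMachine)
    new_session -- callable returning a fresh ISession-like object
    """
    sessions = {}
    for name in (name_a, name_b):
        machine = vbox.findMachine(name)
        session = new_session()
        # Steps 1 and 2: launch headless and wait until the VM process is up.
        progress = machine.launchVMProcess(session, "headless", "")
        progress.waitForCompletion(-1)  # -1 means no timeout
        sessions[name] = session

    # Step 3: ask A to shut down through the ACPI power button.
    sessions[name_a].console.powerButton()

    # Step 4 is observed by the caller: in the report, the next vboxapi call
    # made for B raises xpcom.Exception 0x80004004 (NS_ERROR_ABORT).
    return sessions
```

After the call, polling B's session for a few seconds should show whether it survives the shutdown of A.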

comment:4 Changed 3 years ago by klaus

Effectively you've proven that something (either the VM process or VBoxSVC) crashes. This is the cause of the error code you get.

comment:5 Changed 3 years ago by rdesgroppes

I don't know what I have proven. I'm just sure it occurs in the given scenario: A starts, B starts, A stops, B aborts. I assert that it's the shutdown of A that leads B to abort. What I don't know is whether it's a direct effect or a side effect. It is 100% reproducible in our environment: the host is a 64-bit Ubuntu Maverick with VirtualBox 4.0.10 and 24 CPUs in total. The guests are 32-bit Windows XP machines with 8 CPUs assigned.

comment:6 Changed 2 years ago by rdesgroppes

Still active with 4.1.6.

comment:7 in reply to: ↑ 2 Changed 2 years ago by rdesgroppes

Replying to klaus:

Many guesses, little information.

What else do you need?

comment:8 Changed 12 months ago by frank

  • Status changed from new to closed
  • Resolution set to fixed
  • Description modified (diff)

Please reopen if still relevant with VBox 4.2.12.
