VirtualBox

Ticket #1125 (closed defect: fixed)

Opened 6 years ago

Last modified 3 years ago

VBoxManage and status of virtual machine are out of sync

Reported by: steroid_rambo
Owned by:
Priority: critical
Component: other
Version: VirtualBox 3.0.2
Keywords: vboxmanage
Cc:
Guest type: other
Host type: other

Description

I wrote a little program that monitors the virtual machines and restarts them if they are not running. This is done by capturing the console output of VBoxManage.

In the beginning vboxmanage reports the correct status, but after a few hours, vboxmanage reports 'aborted' or 'powered off', even when the machine is running correctly! Once in such a state, vboxmanage keeps on giving the wrong result.

See the attached screenshot.

A sample test program and its code are available on request.

Attachments

VBoxManage_InvalidStatus2.jpg (188.8 KB) - added by steroid_rambo 6 years ago.
VBoxLostState.log (33.9 KB) - added by ccaim 5 years ago.

Change History

Changed 6 years ago by steroid_rambo

comment:1 Changed 6 years ago by frank

So you are running VBoxManage in a loop, right?

comment:2 Changed 6 years ago by steroid_rambo

Yes, in a loop with a 30-second interval. I have 2 VMs to monitor.

comment:3 follow-up: ↓ 4 Changed 6 years ago by frank

Your host system please (is it Ubuntu?)

comment:4 in reply to: ↑ 3 Changed 6 years ago by steroid_rambo

Replying to frank:

Your host system please (is it Ubuntu?)

Yes, the host OS is Ubuntu 7.04, Feisty Fawn

The VMs are:

  • Windows 2003 (Windows Home Server edition)
  • Windows XP

comment:5 Changed 6 years ago by sandervl73

  • Status changed from new to closed
  • Resolution set to fixed

Try again with 1.6.x please and reopen if necessary.

comment:6 Changed 5 years ago by mjlucas

  • Status changed from closed to reopened
  • Resolution fixed deleted

This is still happening in 2.0.6. The machine is running happily, as you can access it through SSH, web, etc. However, VBoxManage lists the machine as aborted, and you cannot connect via the VBoxHeadless VRDP server: it connects and disconnects straight away, and the log files show the connection and disconnection but no other errors.

It is as if the COM service cannot talk to the virtual machine any more and therefore thinks that it has died. This occurs regularly for us (we have a similar monitoring script that checks the status of the machines).

Host is Red Hat EL5 update 2. Guests are either Windows XP or CentOS 5.

comment:7 Changed 5 years ago by frank

Please could you check version 2.2.2 as well?

comment:8 Changed 5 years ago by ccaim

We have got the same problem and it still persists with VirtualBox 2.2.4. Host system is CentOS 5.3 final (Kernel: 2.6.18-128.1.10.el5, x86_64).

Hardware: Fujitsu PRIMERGY RX330, 2x AMD Opteron 2344 HE, 16 GB DDR2-RAM ECC, 3x 500 GB SAS, 7,200 rpm, hardware RAID 5.

Guests include several machines: Windows XP, Windows Server 2003, CentOS 3.4, CentOS 4.6, CentOS 5.3, SLES 10.

It seems that VirtualBox reverts the respective machine to its last "shutdown" state (i.e. "powered off" or "saved"), although it is actually still running.

The problem occurs after running VirtualBox's command-line tools, presumably VBoxManage, but it is not reproducible.

comment:9 Changed 5 years ago by mjlucas

Apparently a race condition that could lead to VBoxSVC losing communication with the VirtualBox process was fixed in changeset 20282.

The fix is in VB3_Beta1 but not in 2.2.4. I'm testing it under heavy load (many virtual machines, with monitoring scripts running VBoxManage every 9 seconds) and all seems good so far. However, it sometimes takes weeks of this heavy load before I hit a snag.

comment:10 Changed 5 years ago by frank

  • Status changed from reopened to closed
  • Resolution set to fixed

Closing now, please reopen if the problem persists with version 3.0.2.

comment:11 Changed 5 years ago by ccaim

  • Status changed from closed to reopened
  • Resolution fixed deleted

The problem still persists in version 3.0.2r49928. The issue is possibly related to the number of virtual machines, as it recurred when running 10 to 15 machines in total and making use of the command-line tools.

comment:12 Changed 5 years ago by michael

  • Version changed from VirtualBox 1.5.4 to VirtualBox 3.0.2

Please give version 3.0.4 a try.

comment:13 Changed 5 years ago by ccaim

The problem still persists in version 3.0.4r50677. It is still not reproducible, and we still suspect the number of running machines to be related to it. Is there any chance of getting the issue fixed in the near future? It really interferes with production.

comment:14 Changed 5 years ago by ccaim

We are now losing the state of machines permanently. We are running 4 to 6 machines; overnight, 1 to 3 of them lost state and are shown as powered off or aborted (I think they fall back to the state they had before starting?). They are still running as VBoxHeadless processes in the background without problems, and they are visible in the internal network and fully usable. But because it is not possible to control them with any VBoxManage command, this prevents us from using VirtualBox in a production environment. It's an annoying bug and it still exists in many versions.

No VBoxManage commands are run overnight; the machines just lose state after some time.

I will add a VBOX.log from a machine whose state changed to aborted just last night.

If you need any additional information, just ask.

Changed 5 years ago by ccaim

comment:15 Changed 4 years ago by frank

  • Cc ntanghe@… removed

VBox 3.0.12 contains a fix which could be related to your problem. Please verify.

comment:16 Changed 4 years ago by frank

  • Status changed from reopened to closed
  • Resolution set to fixed

No response, closing. Before you consider reopening this ticket, make sure to test the latest VBox release first (currently VBox 3.2.6).

comment:17 Changed 3 years ago by eb-moll

  • Status changed from closed to reopened
  • Resolution fixed deleted

I'm having this problem too...

VirtualBox version: 3.2.10. Host: Debian testing amd64. Guest: Debian testing i686.

I started the virtual machine ca. 2 months ago and it is still running (accessible via ssh, tomcat responds normally) but 'vboxmanage showvminfo "Server" | grep State' yields 'State: powered off (since 2010-10-21T13:03:56.000000000)'
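For scripts that capture this output, the State line can be parsed directly rather than piped through grep. The following is only a minimal Python sketch, assuming the line has the form quoted above (State: <state> (since <timestamp>)); the helper name parse_state is hypothetical and not part of VirtualBox, and the "(since ...)" part is treated as optional by assumption.

```python
import re

# Matches a line such as:
#   State: powered off (since 2010-10-21T13:03:56.000000000)
# Assumption: the "(since ...)" suffix may be absent.
STATE_RE = re.compile(
    r"^State:[ \t]*(?P<state>.+?)[ \t]*(?:\(since[ \t]+(?P<since>[^)]+)\))?[ \t]*$",
    re.MULTILINE,
)


def parse_state(showvminfo_output):
    """Return (state, since) from showvminfo text, or None if no State line exists."""
    match = STATE_RE.search(showvminfo_output)
    if match is None:
        return None
    return match.group("state"), match.group("since")
```

Note that this only reports what VBoxManage prints; as this ticket shows, that value can disagree with what the guest is actually doing.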

I have only one virtual machine configured and running (since the number of machines was suspected as a possible cause).

Sorry for making a wild uninformed guess, but maybe it has something to do with aptitude scrambling VirtualBox files when updating? I updated the host (and guest) regularly, and probably did so on 2010-10-21T13:03 (according to auth.log, I was logged in as root at that time).

comment:18 follow-up: ↓ 19 Changed 3 years ago by codeslingercompsalot

If the goal is to find out whether your VMs are still running, it is hugely more efficient to do the following: VBoxManage list runningvms
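A monitoring script could consume that command roughly as follows. This is a sketch, not a tested monitoring tool: it assumes each output line has the form "VM name" {uuid}, that VBoxManage is on the PATH, and that the script runs as the user who owns the VMs. The function names are hypothetical.

```python
import re
import subprocess

# Assumed line format of 'VBoxManage list runningvms':  "VM name" {uuid}
LINE_RE = re.compile(r'^"(?P<name>.*)"\s+\{(?P<uuid>[0-9a-fA-F-]+)\}\s*$')


def parse_running_vms(output):
    """Return the VM names found in 'list runningvms' text."""
    names = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            names.append(match.group("name"))
    return names


def running_vms():
    """Run VBoxManage and return the names of the VMs it reports as running."""
    result = subprocess.run(
        ["VBoxManage", "list", "runningvms"],
        capture_output=True, text=True, check=True,
    )
    return parse_running_vms(result.stdout)
```

The list reflects only what VBoxSVC believes, so it may be affected by the same loss of state described in this ticket.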

As far as this bug goes, I have not encountered it in vbox 3.2.8, so I think it's probably safe to conclude that the problem is fixed?

If you think that your install of vbox is invalid due to mixed updates, then you ought to completely remove vbox (not your machines) and do a clean install. I am sure that you can appreciate that any bug reports filed against a "suspected to be bad" install are automatically bogus...

As a further thought: if a machine is already running headless and you request to start it headless, then no harm is done; the machine will continue to run, and the start request will return an ignorable error. So this problem should not prevent you from running your script, as long as you give it a suitable delay between checks. The other thing to make sure of is that your script is running as the correct user; this is critical. Maybe there is a problem with your script? I've got a similar program that I wrote in PHP, and it has run for months at a time without errors or memory leaks.

comment:19 in reply to: ↑ 18 Changed 3 years ago by eb-moll

As far as this bug goes, I have not encountered it in vbox 3.2.8, so I think it's probably safe to conclude that the problem is fixed?

It occurred in 3.2.10 for me, so I guess it's not fixed?

If you think that your install of vbox is invalid due to mixed updates, then you ought to completely remove vbox (not your machines) and do a clean install. I am sure that you can appreciate that any bug reports filed against a "suspected to be bad" install are automatically bogus...

I don't suspect my install to be bad (it's still in use here), but if apt doesn't stop VMs while updating VirtualBox, I could imagine "live" files, i.e. files that are in use by a virtual machine, going out of sync or something. (I don't know enough about VirtualBox to know if such files even exist, hence the "wild uninformed guess" disclaimer. I should've kept silent on that :|)

The other thing to make sure of is that your script is running as the correct user; this is critical. Maybe there is a problem with your script? I've got a similar program that I wrote in PHP, and it has run for months at a time without errors or memory leaks.

My "script" really does only this: 'vboxmanage showvminfo "Server" | grep State' (I run it by hand), and running it as the incorrect user gives a VirtualBox error message.

comment:20 Changed 3 years ago by codeslingercompsalot

So the question becomes: what is different between your computer, where it fails, and my computer, where it works? I'm running a similar script on an Ubuntu 10.04 host with Win2k guests on VirtualBox 3.2.8, and never saw this specific problem.

Have you tried: VBoxManage list runningvms? Does it show them as not running?

Can you try shutting down all of your VMs and doing a clean install of 4.0.4? Yes, you are right: if you upgrade a program that is in use, the original program continues to run, and you don't actually start using the new version of that program until you stop and restart all the VMs. Rebooting the host would not be such a bad idea either if you want to be certain of what your environment is. Doing an update on a running program on *nix is seldom a problem; it is much smarter about this than Windows is. Nonetheless, there are times when even *nix has to be rebooted.

For the purpose of getting your script usable, a workaround for this problem would be to test for a successful SSH login etc.
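A lightweight variant of that workaround is sketched below in Python. It checks only that a TCP connection to the guest's SSH port can be opened, which is weaker than a full SSH login, and it assumes the guest listens on port 22 and is reachable from the host. The function name guest_reachable is hypothetical.

```python
import socket


def guest_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Because this check does not depend on VBoxSVC at all, it keeps working even when VBoxManage reports a stale state.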

comment:21 follow-up: ↓ 22 Changed 3 years ago by codeslingercompsalot

One bug that I have encountered is a mismatch in behaviour between a VM run headless and a VM run through the VirtualBox console. For your script to work, be sure to launch the VM in headless mode.

comment:22 in reply to: ↑ 21 Changed 3 years ago by eb-moll

The problem hasn't reappeared since I filed the bug (in fact it occurred only once), so it's not a problem for production use for me. Sorry if my initial post was misleading. I just filed the bug report because the underlying bug still seems to be unfixed. I know that it's near impossible to track a bug that is not reproducible; nevertheless, the information might be valuable.

comment:23 Changed 3 years ago by mjlucas

This bug occurs when VBoxSVC crashes while VMs are running. The running VMs continue to run happily, but they no longer have an external interface, so you can't communicate with them through VBoxManage etc. New VMs will start a new VBoxSVC and will work as normal, but VBoxSVC doesn't reconnect to existing running VMs.
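A monitoring script can tell this situation apart from a real outage by combining what VBoxManage reports with an independent check of the guest (for example the SSH test suggested in comment 20). The sketch below is illustrative only; the function name and the four labels are hypothetical and not VirtualBox terminology.

```python
def classify(reported_running, guest_reachable):
    """Combine VBoxManage's view of a VM with an independent guest check."""
    if reported_running and guest_reachable:
        return "ok"
    if reported_running:
        # VBoxManage says running, but the guest does not answer.
        return "guest_unresponsive"
    if guest_reachable:
        # VBoxManage says stopped, yet the guest answers: the symptom in this ticket.
        return "out_of_sync"
    return "down"
```

Only the "down" case would call for restarting the VM; in the "out_of_sync" case the guest is still running and a restart request would not help.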

This ticket should be closed, as it doesn't refer to any one bug. As I said previously, my issue was fixed after changeset 20282 and also by various other fixes along the way. Specific bug reports should be opened for any reproducible crashes of VBoxSVC.

comment:24 Changed 3 years ago by frank

  • Status changed from reopened to closed
  • Resolution set to fixed

You are right, when VBoxSVC crashes the VM continues to run, at least for a while. There were some improvements done in the past but VBoxSVC should not crash of course. And I agree that specific tickets should be opened in that case.
