Ticket #4501 (closed defect: fixed)
Host hangs/freezes when zones are booted
Reported by: tjobbins
Owned by:
Version: VirtualBox 3.0.2
Keywords: host freeze
I'm experiencing a show-stopping problem with running VirtualBox 3.0.2 on Solaris hosts. VBox 3 is the latest in a line of releases I've failed to get working reliably on Solaris - various versions of 2 would also hang or panic my host, though with different symptoms to the problem below.
The issue I am finding is that with a VBox guest running, my host will freeze/hang if zones are running on the Solaris host. The problem exists on both Solaris 10 U5 and Solaris 10 U7, and I have tried on two different servers - one running new Intel Nehalem X5570 processors (dual processor, quad core), and another running Intel E7320 processors (quad processor, quad core.)
The symptoms of the freeze have varied slightly. My Solaris 10 U5 box (4 x E7320 processors) simply freezes completely, including at the console. On the Solaris 10 U7 box (X5570 processors), I lost remote network access but was sometimes still able to log in at the console; however, it would then freeze completely a couple of minutes later.
In all cases, the box will freeze and not panic. Nothing related to VBox is logged in /var/adm/messages or elsewhere, except for the following message:
Jul 12 14:20:49 host-us8 vboxdrv: [ID 937234 kern.notice] CPUMSetGuestCpuIdFeature: Disabled x2APIC
Note: The above message is always logged and is not related to the freeze, i.e. it also appears when the host does not freeze.
Nothing is logged in VirtualBox's log file after initial bootup. There are no log lines near the time of the host freeze/hang.
I've attached a sample VBox.log. Its last messages are from no later than 2 minutes after the VM booted; the VM caused a host freeze about 15 minutes later.
Guest: Windows XP SP3 with 1.25GB of RAM, 12MB video RAM, one network adapter in NAT mode, USB disabled, CD mounted from ISO, floppy disabled, Intel VT enabled. Installed on a fixed-size VDI disk of 20GB.
Host: Both Solaris 10 U5 and Solaris 10 U7. VirtualBox installed onto a ZFS filesystem. VirtualBox running in the global zone. VirtualBox running as root.
Solaris 10 U5 hardware config: 4 x Quad Core Intel Xeon E7320 processors (16 total cores). 32GB RAM. 2 x 500GB SATA disks in UFS root/boot mirror. 2 x 500GB SATA disks in ZFS filesystem (this is where VBox is installed). 2 x Intel NIC using igb driver.
Solaris 10 U7 hardware config: 2 x Quad Core Intel Xeon X5570 processors (8 real cores + HyperThreading = 16 virtual cores). 36GB RAM. 2 x 73GB SAS drives in ZFS root/boot mirror. 2 x 250GB SATA drives in ZFS filesystem (this is where VBox is installed). 1 x Intel NIC using e1000g driver.
Solaris 10 U5 using kernel 138889-03.
Solaris 10 U7 using kernel 139556-08.
The following tests/situations describe the problem:
- Solaris 10 U7: Installed VBox 3.0. Box has 61 zones running. Created a new VM, and got 90% of the way through installing it before the host froze.
- Solaris 10 U7: Rebooted, and then disabled all zones. Re-installed XP VM and used it successfully for 3 hours.
- Solaris 10 U7: Booted zones with XP VM still running. After about 30 zones were booted, the host hung again.
- Solaris 10 U5: Transferred VirtualBox config and XP VM to Solaris 10 U5 box. Installed VirtualBox 3.0.2. Box has 29 zones running. Booted XP VM in Headless mode. Box hung within 1 minute of XP VM starting up.
- Solaris 10 U5: Disabled all zones. Booted XP VM in Headless mode. Used VM successfully for 1 hour.
- Solaris 10 U5: With XP VM still running in Headless mode, I started booting zones with a 1 minute pause between each boot. Confirmed that box hung after 19 zones were booted.
- Solaris 10 U5: Rebooted, booted XP VM again in Headless mode. Repeated zone boot test with 1 minute delay. This time managed to boot all 29 zones. Box continued running for a further 5 minutes before freezing.
- Solaris 10 U5: Networking test: it occurred to me that one side effect of booting zones was the addition of new virtual network interfaces (e1000g0:0, e1000g0:1, etc). So to isolate this, I did the following test: Disabled all zones. Booted XP VM in Headless mode. Ran a script to create a new NIC interface every minute, with ifconfig igb0:1 plumb up 192.168.10.1 .. ifconfig igb0:2 .. etc. Ran the test until 50 interfaces were created, without any crash. So the box hang is not related to virtual network interfaces.
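For anyone wanting to repeat the networking test, the following is a minimal sketch of what such a script might look like, not the script that was actually used. The `ifconfig` argument order is copied from the command quoted above; the addressing pattern beyond 192.168.10.1, the variable names, and the `DRY_RUN` switch are assumptions added for illustration. It assumes a POSIX shell (on Solaris 10, for example `/usr/xpg4/bin/sh` or `ksh` rather than the legacy `/bin/sh`).

```shell
# Sketch: create one logical interface per minute on a physical NIC.
# NIC, COUNT, PAUSE and DRY_RUN are illustrative names, not from the ticket.
NIC=${NIC:-igb0}
COUNT=${COUNT:-50}      # the test in the ticket stopped at 50 interfaces
PAUSE=${PAUSE:-60}      # seconds between interfaces
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands; 0 = run them (needs root)

plumb_cmd() {
    # Print the ifconfig command for logical interface number $1.
    # Assumed addressing: interface N gets 192.168.10.N.
    echo "ifconfig ${NIC}:$1 plumb up 192.168.10.$1"
}

run_test() {
    i=1
    while [ "$i" -le "$COUNT" ]; do
        cmd=$(plumb_cmd "$i")
        echo "$cmd"
        if [ "$DRY_RUN" -eq 0 ]; then
            $cmd || return 1    # stop if ifconfig fails
            sleep "$PAUSE"
        fi
        i=$((i + 1))
    done
}

run_test
```

With the default `DRY_RUN=1` the script only prints the commands it would run, so the list can be reviewed first; setting `DRY_RUN=0` executes them with the pause between each.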
So I have seen that:
a) Problem exists equally and seemingly identically on both Solaris 10 U5 and Solaris 10 U7.
b) Problem occurs both when VBox is started when zones are already running, and if zones are booted after VBox is running.
c) The exact number of running zones required is not fixed; it has been between 19 and 30 in my tests.
d) I cannot say with certainty whether it is actually the process of booting/running a zone that causes the problem, or whether having zones booted causes some other activity that causes the problem. But there is a direct, reproducible connection between zones being booted and the host freezing.
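To make the zone boot test easier to repeat, here is a rough sketch of booting zones one at a time with a pause, as in the Solaris 10 U5 tests above. It is an illustration rather than the procedure actually used: the function names, the `ZONE_PAUSE` and `DRY_RUN` variables, and the timestamped output are assumptions. It relies on the colon-separated output of `zoneadm list -ip` (zone ID, zone name, state, path, ...) and again assumes a POSIX shell such as `/usr/xpg4/bin/sh` or `ksh`.

```shell
# Sketch: boot installed zones one by one with a pause between boots.
# ZONE_PAUSE and DRY_RUN are illustrative names, not from the ticket.
ZONE_PAUSE=${ZONE_PAUSE:-60}   # seconds between boots (1 minute in the ticket)
DRY_RUN=${DRY_RUN:-1}          # 1 = only report; 0 = really boot (needs root)

installed_zones() {
    # Reads "zoneadm list -ip" style lines (id:name:state:path:...) on stdin
    # and prints the names of non-global zones whose state is "installed",
    # i.e. zones that are not currently running.
    awk -F: '$3 == "installed" && $2 != "global" { print $2 }'
}

boot_zones() {
    # Reads zone names on stdin, one per line, and boots each in turn.
    n=0
    while read -r zone; do
        n=$((n + 1))
        echo "$(date '+%H:%M:%S') booting zone $n: $zone"
        if [ "$DRY_RUN" -eq 0 ]; then
            zoneadm -z "$zone" boot
            sleep "$ZONE_PAUSE"
        fi
    done
}

# Real use, as root in the global zone with the VM already running:
#   zoneadm list -ip | installed_zones | DRY_RUN=0 boot_zones
```

The running count and timestamp on each line are there so that the number of zones booted before a hang can be read off afterwards, provided the output is captured somewhere that survives the freeze (for example a serial console or another machine).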