VirtualBox

Opened 15 years ago

Closed 15 years ago

Last modified 14 years ago

#4486 closed defect (fixed)

VirtualBox 3.0.2 hard hangs Solaris 10u6 kernel when running HW assisted VMs -> fixed in SVN/3.0.6

Reported by: Paul
Owned by:
Component: VMM
Version: VirtualBox 3.0.2
Keywords:
Cc:
Guest type: other
Host type: Solaris

Description

After installing the 3.0.2 update (and rebooting to make sure driver structures were consistent), attempting to run previously built and running (under 2.x) virtual machines (Windows 2008, Linux Ubuntu 8.10, Centos) will eventually hard hang the Solaris 10 host kernel it is running upon.

Environment: S10u6 patched to 138889-08 on AMD Opteron 2376, 4Gb memory. VirtualBox 3.0.2 r49928 for SunOS.

Started Win2k8 and Ubuntu guests headless (with HW virtualization enabled). The system hard hangs after about 3 hours of runtime. I could not break from the console into MDB (the system was booted with the debugger enabled). The system neither echoes characters nor responds on the network (ping). No disk activity observed.

The same hang happened almost immediately after starting 3 guest VMs concurrently. All VMs had HVM enabled, with one CPU allocated to each guest (I/O APIC enabled).

This configuration had been running solidly under VB 2.2.2 for months uptime.

No entries were noted in either the VM logfile(s) or syslog. Guests could be resumed after the host was rebooted. Currently reverting to 2.4.x.

Change History (14)

comment:1 by Sander van Leeuwen, 15 years ago

priority: major → blocker

comment:2 by bauer40, 15 years ago

I had the same problem (hard hangs) when running 3.0.0 on S10_u6. To resolve, I upgraded to 3.0.2 and S10_u7 + most recent recommended patches, ran a 48h stress test and nothing failed.

I have now updated our production server (S10_u7 + patches) to 3.0.2 and am running production on it. I will report my experience in about a week.

in reply to:  2 comment:3 by bauer40, 15 years ago

Replying to bauer40:

I forgot to say that I'm using software VMM.

in reply to:  1 comment:4 by bauer40, 15 years ago

Replying to sandervl73:

I had a hard hang of the entire physical machine using VBox 3.0.2 on S10_u7 with the latest recommended patch cluster.

My system is now running with the kernel debugger loaded, so if it hangs again I can hopefully generate a kernel crash dump.

in reply to:  1 comment:5 by bauer40, 15 years ago

Replying to sandervl73:

OK, I had another hard hang of my physical server. But the hang was so hard that even F1-A (drop into the kernel debugger) did not work.

So I'm unable to deliver a crash dump of the hanging system, sorry. I downgraded to 2.2.4 to make my production stable again.

comment:6 by herf, 15 years ago

Having the same issue with Linux (guest) under Solaris snv118 (host). Crashed hard twice during heavy I/O to Linux guest (heavy I/O over samba and NFS).

Turned off hardware virtualization support and will see if this helps.
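
For anyone wanting to repeat this test from the command line, here is a minimal sketch that only builds the VBoxManage call for switching hardware virtualization off. It assumes the 3.x-style `--hwvirtex on|off` option of `VBoxManage modifyvm`, that the VM is powered off before the setting is changed, and a placeholder VM name ("linux-guest") that does not come from this ticket. The helper name is hypothetical and nothing is executed.

```python
import shlex
from typing import List


def hwvirtex_command(vm_name: str, enabled: bool) -> List[str]:
    """Build (but do not run) a VBoxManage call that toggles hardware virtualization."""
    state = "on" if enabled else "off"
    return ["VBoxManage", "modifyvm", vm_name, "--hwvirtex", state]


if __name__ == "__main__":
    # "linux-guest" is a placeholder VM name, not one taken from this ticket.
    print(shlex.join(hwvirtex_command("linux-guest", enabled=False)))
```

The printed line can be checked by eye before running it on the host.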

in reply to:  6 comment:7 by bauer40, 15 years ago

Replying to herf:

> Having the same issue with Linux (guest) under Solaris snv118 (host). Crashed hard twice during heavy I/O to Linux guest (heavy I/O over samba and NFS).
>
> Turned off hardware virtualization support and will see if this helps.

My tests with 3.0.2 on S10_u7 were without hardware virtualisation, and it hung hard. You will probably see the same again.

comment:8 by Ramshankar Venkataraman, 15 years ago

Could you provide info on what the guest was doing or roughly how long before you get a hang? It would help us in reproducing the problem.

in reply to:  8 comment:9 by bauer40, 15 years ago

Replying to ramshankar:

> Could you provide info on what the guest was doing or roughly how long before you get a hang? It would help us in reproducing the problem.

My system was up for about four to five days, running the following VBoxes:

  • Debian 5.0, web Proxy, net and disk activity
  • Debian 5.0, Mail server, net and disk activity
  • Debian 5.0, plain OS, idle
  • Debian 5.0, ultra-low-volume Webserver, idle
  • Solaris 10, plain OS, idle
  • Windows XP, Database server, mostly idle (two users)

I had the same uptime with 3.0.0. I upgraded to 3.0.0 on a Saturday, and on Wednesday the system froze, three or four times on that very day.

The same happened when I installed 3.0.2: installation on Saturday, first freeze on Thursday, second freeze on Friday, then I downgraded to the latest 2.x version.

I assume Ticket 4618 is the same, so you might want to combine them.

Peter

in reply to:  description comment:10 by Pete Durst, 15 years ago

Replying to tallpaul:

> After installing the 3.0.2 update (and rebooting to make sure driver structures were consistent), attempting to run previously built and running (under 2.x) virtual machines (Windows 2008, Linux Ubuntu 8.10, Centos) will eventually hard hang the Solaris 10 host kernel it is running upon.
>
> Environment: S10u6 patched to 138889-08 on AMD Opteron 2376, 4Gb memory. VirtualBox 3.0.2 r49928 for SunOS.

I've had some similar experiences with 3.0.0 through 3.0.4, where the system hard locks on me. I have tried this with both Solaris 10u6 and 10u7 hosts on an Ultra 20 (8G RAM) and a Dell p380 (6G RAM). In all cases, the problem occurs when trying to jumpstart a Solaris guest. I create a guest VM using the CLI (via a script) and then start it after logging into the GUI (JDS). After about 340M of the flash archive (S10u6 or S10u7) has downloaded, the system locks up hard. I haven't been able to force a kmdb panic, as the systems are remote and I have to rely on other people to reset them for me. The problem is consistent and always happens this way. Both systems worked fine with 2.2.4, so I suspect it has something to do with the 3.0.x versions.

Today, I made another discovery. I had ganged 2 switches together for the test lab I was working in, and although the u20 wasn't attached to it, it seemed to be affected by it. Once I removed the switch, the S10 jumpstarts for the guest systems finished without problem. These were an 8-port GB switch and a 24-port GB switch. I moved the u20 to another lab and connected it to a 24-port 100baseT switch, which also had a couple of 8-port 100baseT switches attached to it. The same problem occurred on this network. I moved the system to another lab with a clean 100baseT switch, and it works fine there. At this point I can't say how long the guests will stay alive, but for now at least they are working. To my thinking, this would seem to indicate a network layer issue with VBox.

Hope that helps...

Pete

comment:11 by Sander van Leeuwen, 15 years ago

Summary: changed from "VirtualBox 3.0.2 hard hangs Solaris 10u6 kernel when running HW assisted VMs" to "VirtualBox 3.0.2 hard hangs Solaris 10u6 kernel when running HW assisted VMs -> fixed in SVN/3.0.6"

Try again with the 3.0.6 beta released yesterday.

comment:12 by Paul, 15 years ago

First impression of 3.0.6Beta1: (on snv121 - Nevada; Intel Q9650 8Gb)

It does not crash Nevada when running, but it adversely affects scheduling of other Solaris processes: a Java applet running in the Firefox 3.5.1 browser became unresponsive and required killing Firefox to unwedge, and other processes such as the GNOME window manager were very sluggish while the guest was running. Everything went back to normal after pausing the guest (WinXP, 1Gb mem).

Will try S10u7 tonight.

comment:13 by Paul, 15 years ago

Solaris 10 u7 still hangs under stress test with 3.0.6b1.

3 non-global zones, 3 VB 3.0.6b1 VMs in one NGZ (ubuntu, centos, win2k3), one VB 3.0.6b1 VM in global zone (Win7). All active I/O and CPU.

The hang is different in that interrupt handling now appears to be working (character echo still works and the host still answers ICMP pings). I was still unable to break into the kernel debugger, though, and so could not obtain a crash dump.

comment:14 by Frank Mehnert, 15 years ago

Resolution: fixed
Status: new → closed