VirtualBox

Ticket #8628 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

Linux guests hang

Reported by: syntheticpp
Owned by:
Priority: critical
Component: other
Version: VirtualBox 4.0.4
Keywords: hangs reset react
Cc:
Guest type: Linux
Host type: Windows

Description

Host : Windows 64/32 bit, Core i7 / Xeon

Guest : several Linux versions (Ubuntu 32/64, Opensuse 32/64)

Usb additions : installed and not installed

Host Disk cache: enabled and disabled

Mainboard supports USB3.

Multiple cores are enabled in the VM.

NVIDIA low-end and Quadro

For all combinations above, the Linux guest hangs after an undefined time. It starts with the keyboard no longer responding; a few mouse clicks are still possible after that, but then the guest hangs completely and only a hard reset (Host+R) helps. Waiting does not help.

This bug was reproduced on different systems, and it makes VirtualBox unusable for production systems.

Attachments

VBox.log (110.7 KB) - added by syntheticpp 3 years ago.
VBox_freeze_01.log (54.4 KB) - added by syntheticpp 3 years ago.
VBox_freeze_02.log (93.3 KB) - added by syntheticpp 3 years ago.
VBox_freeze_03.log (93.3 KB) - added by syntheticpp 3 years ago.
VBox_freeze_04.log (93.8 KB) - added by syntheticpp 3 years ago.
VBox_4.0.7-71576-Win_freeze_01.log (93.5 KB) - added by syntheticpp 3 years ago.
VBox_4.0.7-71576-Win_freeze_02.log (93.4 KB) - added by syntheticpp 3 years ago.
VBox_4.0.7-71576-Win_freeze_03.log (96.4 KB) - added by syntheticpp 3 years ago.
VBox_4.0.7-71576-Win_freeze_04.log (93.3 KB) - added by syntheticpp 3 years ago.
systeminfo_on_freece.txt (8.2 KB) - added by syntheticpp 3 years ago.
make.txt (83.4 KB) - added by syntheticpp 3 years ago.

Change History

comment:1 Changed 3 years ago by syntheticpp

3D acceleration was enabled.

comment:2 Changed 3 years ago by syntheticpp

Happened with Linux 2.6.35 and 2.6.37.

I also found these bug reports: #7803, #7514, #7858.

comment:3 Changed 3 years ago by syntheticpp

It still freezes with VirtualBox 4.0.6.

Maybe you could reproduce it this way:

  • Enable 8 processors (i7: 4 real and 4 HT).
  • Dynamically growing disk, host cache enabled.
  • Leave everything else at the defaults.
  • Install Ubuntu 11.04 64-bit from the CD ISO.

It then freezes during the installation.

Another VM was running at the same time, downloading heavily.
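
The VM settings from the recipe above could be applied from a script roughly like this. It is only a sketch: the VM name "Ubuntu test" is borrowed from a later comment in this ticket, the storage controller name "SATA Controller" is a guess, and `repro_commands` is a hypothetical helper. It builds the `VBoxManage` command lines and prints them without running anything.

```python
import shlex

def repro_commands(vm="Ubuntu test", cpus=8, controller="SATA Controller"):
    """Return VBoxManage argument lists for the settings in the recipe.

    The VM and controller names are assumptions; adjust them to the real VM.
    """
    return [
        # 8 virtual processors (i7: 4 real cores and 4 hyper-threads)
        ["VBoxManage", "modifyvm", vm, "--cpus", str(cpus)],
        # host I/O cache enabled on the controller the disk is attached to
        ["VBoxManage", "storagectl", vm, "--name", controller,
         "--hostiocache", "on"],
    ]

# Print the commands, shell-quoted, so they can be reviewed before use.
for argv in repro_commands():
    print(shlex.join(argv))
```

The disk itself (dynamically growing) and the ISO attachment are left to the GUI defaults here, as in the recipe.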

comment:4 Changed 3 years ago by frank

This could be a duplicate of #8773 and #8826. Could you install the test build of #8826 and check if it fixes your issue?

comment:5 Changed 3 years ago by syntheticpp

Before I try the test build of #8826, here is some "good" news: I could reproduce it. All VBox_freeze_0* logs were produced this way:

Install Ubuntu (as described above), and when clicking the "Install Now" button, start a make process in another virtual Linux that uses all cores (here: make -j8). The installation freezes every time.

comment:6 Changed 3 years ago by syntheticpp

Test build of #8826 (VBox_4.0.7-71576-Win) does not fix this bug. Every attempt to install Ubuntu fails. See the VBox_4.0.7-71576-Win logs.

comment:7 Changed 3 years ago by buffyg

Naive question from a fellow user: your test case seems to be that you are trying to install a VM while you're generating load on an already running guest sufficient to consume all available CPU resources, and the installation process for the second guest is not responsive. What's the expected behaviour?

I may be misunderstanding what you're reporting because some details aren't clear. Could you be precise about exactly what Windows you're running (Windows 7? 2008 Server R2?)? What resources does the host have and how are those resources allocated across the set of running guests? How does resource utilisation look from the host perspective when the guest undergoing install hangs?

comment:8 Changed 3 years ago by syntheticpp

Even when I remove the load the installation doesn't proceed. With the test build I even had a freeze without any load.

I have this bug on Windows 7 Prof. 32-bit and Windows 7 Home 64-bit as host, and tested it with several guest Linux distributions on W7Home/64.

The _freeze_ logs were produced on W7Home/64 with an i7-870 and 16 GB. One time the freezing box had 12 GB; all the other test cases were with 2 GB RAM. The details can be found in the logs.

And there were always enough resources left on the host (several GB of unused RAM). I stopped any load when I thought it had hung again and then waited a bit.

comment:9 Changed 3 years ago by buffyg

I'm not sure that this is a problem in VirtualBox per se. Installers aren't generally written with an expectation that they'll have to deal with contending load. If you really need to create new OS images on a system that's already carrying load, it would seem to make far better sense to clone an existing VDI and use DHCP to give it its basic personality. I think you should keep two things apart: on one side, you're encountering a problem that isn't fully characterised and that may be a VirtualBox problem; on the other side, you're saying this is holding you back from production because you want to be able to install VMs while another VM is hammering the host. If this is for production, a better approach to installation would seem to be not having to work out whether it's the virtualisation or the installation software that has problems holding up under fire: image the OSes you need to install and provision their contents as a VBox storage operation.

I'm still a little unclear on resource allocation from the host to the guests: the problem seems more likely to be CPU contention, but the details you provided are memory.

comment:10 Changed 3 years ago by syntheticpp

There is enough memory and there is no load (meaning the CPU is idle), but the installation doesn't proceed.

Installing under load is NOT my production scenario; it is just the only way I could reproduce the freeze!

I describe it here because the freeze during installation may have the same cause as the later freezes. Those later freezes make it impossible to use VBox in a production system.

comment:11 follow-up: ↓ 12 Changed 3 years ago by buffyg

Ok, something is a bit unclear here: you say that you've got no load (where?), but your scenario to reproduce is to fire off a compiler run that would be parallelised across all the CPUs in the other running guest? Perhaps you could attach output with metrics data for all of the running VMs and the host [e.g. "VBoxManage metrics collect --period 15 --list '*'"]?

You've mentioned the install under load as a way to reproduce the problem, but what's the original scenario?
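
The metrics request above could be captured to a file along these lines. This is a sketch only: it reuses the exact `VBoxManage metrics collect` invocation quoted in this comment, while `metrics_command`, `capture`, and the output path are hypothetical names chosen for illustration.

```python
import subprocess
import time

def metrics_command(period=15, target="*"):
    """Build the metrics invocation quoted in the comment above."""
    return ["VBoxManage", "metrics", "collect",
            "--period", str(period), "--list", target]

def capture(path, seconds, period=15):
    """Run the collector for `seconds` and write its output to `path`."""
    with open(path, "w") as out:
        proc = subprocess.Popen(metrics_command(period),
                                stdout=out, stderr=subprocess.STDOUT)
        try:
            time.sleep(seconds)
        finally:
            # The collector keeps running until stopped, so end it explicitly.
            proc.terminate()
            proc.wait()
```

Calling something like `capture("metrics.txt", 300)` before triggering the freeze would cover several 15-second intervals in one attachment.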

comment:12 in reply to: ↑ 11 Changed 3 years ago by syntheticpp

You've mentioned the install under load as a way to reproduce the problem, but what's the original scenario?

make -j8

comment:13 follow-ups: ↓ 14 ↓ 16 Changed 3 years ago by buffyg

Could you attach metrics, as per the above suggestion, and, as a separate attachment, the output of make -j8 -n? (An invocation of make isn't readily analysable without some clarity as to what operations the Makefile generates.) What happens if you use less parallelism than the number of virtual CPUs (say -j4 or -j6)?
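
The parallelism sweep suggested here could be written down as follows. This is only an illustration: `make_commands` is a hypothetical helper that builds the `make` command lines for each job count, with `-n` added for the dry run, and leaves running them to the reporter.

```python
def make_commands(job_counts=(4, 6, 8), dry_run=False):
    """Return one make argument list per job count.

    With dry_run=True, -n is appended so make only prints what it would do.
    """
    commands = []
    for jobs in job_counts:
        argv = ["make", f"-j{jobs}"]
        if dry_run:
            argv.append("-n")
        commands.append(argv)
    return commands

# Dry run for the original scenario, then the reduced-parallelism variants.
for argv in make_commands((8,), dry_run=True) + make_commands((4, 6)):
    print(" ".join(argv))
```

If the freeze only appears at -j8 and not at -j4 or -j6, that would point towards contention between the guest's virtual CPUs and the host threads VBox needs for its own work.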

comment:14 in reply to: ↑ 13 Changed 3 years ago by syntheticpp

Replying to buffyg:

Could you attach metrics, as per the above suggestion and, as a separate attachment, the output of make -j8 -n (an invocation of make isn't readily analysable without some clarity as to what operations the Makefile generates). What happens if you use less parallelism than the number of virtual CPUs (say -j4 or -j6)?

See the .txt. Do you believe me now? "Ubuntu test" is dead.

Couldn't you reproduce it with the install recipe?

comment:15 Changed 3 years ago by syntheticpp

"Ubuntu test" is configured with 2048 MB.

comment:16 in reply to: ↑ 13 Changed 3 years ago by syntheticpp

Replying to buffyg:

Could you attach metrics, as per the above suggestion and, as a separate attachment, the output of make -j8 -n (an invocation of make isn't readily analysable without some clarity as to what operations the Makefile generates). What happens if you use less parallelism than the number of virtual CPUs (say -j4 or -j6)?

It's a makefile in userspace, nothing special.

comment:17 Changed 3 years ago by buffyg

It's not that I don't believe you're having a problem, it's just that I'm not sure the installer problem reproduces the original problem. There's only a single sample in the metrics. What do the metrics look like before, during, and after? I'd expect to have at least three intervals in the sample.

comment:18 Changed 3 years ago by syntheticpp

Have you tried to reproduce it on your own?

comment:19 Changed 3 years ago by buffyg

Here's what I get from your logs at this point. You've got a Windows host with 16 GB of memory and 8 logical CPUs. You start up one Linux guest with 2 GB of memory and 8 virtual CPUs. Then you start up another Linux guest with 12 GB of memory and 8 CPUs. I don't see it as reasonable to expect the system to hold up with that level of resource oversubscription. Even with one guest set to use eight virtual CPUs out of 8 physical, you're going to have VBox fighting itself to get other threads scheduled for VM overhead, like doing I/O. I'm not saying that there's not a bug there, just that I wouldn't expect it to respond well even if there were no bugs.
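
To put numbers on the oversubscription described above, here is a small arithmetic sketch. The figures are the ones read from the logs in this comment; the `oversubscription` helper and the guest labels are made up for illustration.

```python
def oversubscription(host_cpus, host_ram_gb, guests):
    """Return (vCPU ratio, RAM ratio) of all guests relative to the host.

    `guests` is a list of (label, vcpus, ram_gb) tuples.
    """
    total_vcpus = sum(vcpus for _, vcpus, _ in guests)
    total_ram = sum(ram for _, _, ram in guests)
    return total_vcpus / host_cpus, total_ram / host_ram_gb

# Host: 8 logical CPUs, 16 GB. Guests as listed in the comment above.
guests = [("guest 1", 8, 2), ("guest 2", 8, 12)]
cpu_ratio, ram_ratio = oversubscription(8, 16, guests)
print(f"vCPU ratio: {cpu_ratio:.2f}, RAM ratio: {ram_ratio:.3f}")
# prints "vCPU ratio: 2.00, RAM ratio: 0.875"
```

So the two guests together ask for twice the host's logical CPUs and leave 2 GB of RAM for Windows and VBox itself.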

My advice would be to find a more stripped-down test case than this, particularly when it comes to CPU provisioning. I'm just another user, but looking through the logs, this isn't something I would try to reproduce. If you don't like that feedback, see what you get from the VBox team.

comment:20 Changed 3 years ago by syntheticpp

There is no load any more when it hangs; the info file shows this. It is irrelevant how big the load was before the freeze.

comment:21 Changed 3 years ago by syntheticpp

Reproducible with 4.0.8.

comment:22 Changed 3 years ago by lefticus

I have exactly the same scenario for the sequence of events that you describe.

First printable characters fail to type, then key sequences like alt-tab to switch applications, then the mouse, then the Linux guest needs to be rebooted.

I have:

Phenom II X4, 8 GB of system RAM, 2 GB of guest RAM, 4 virtual cores on the guest. Host: Windows Ultimate 64-bit. Guest: Ubuntu Linux > 10.04.

This is the key: I have never once experienced this problem on Ubuntu 10.04, but have experienced it consistently on every version of Ubuntu after this.

I am able to install the OS and run it for a time, but it is basically unusable with this problem.

My hunch is that it is some interaction between guest additions and the kernel version, based on what I've seen. As to why it fails only for the 2 of us, I have no idea.

Our hardware platforms are about as completely different as they can be (I have AMD graphics).

Every other guest I run works perfectly, host CPU usage does not seem to be a factor.

comment:23 Changed 3 years ago by lefticus

I have now removed the guest additions kernel modules, but left the X11 driver installed. The problem still happens.

On an interesting note: it seems to occur much faster if the VirtualBox configuration is set for "enable absolute pointing device".

I will keep investigating to see if I can narrow down which option fixes the problem.

comment:24 Changed 3 years ago by syntheticpp

Interesting. I had speculated that it might depend on the CPU virtualization, but seeing it on AMD as well disproves this.

Last time it froze during heavy network/disk access: a 'git svn fetch' that was supposed to clone an svn repository completely.

comment:25 follow-up: ↓ 26 Changed 3 years ago by codeslingercompsalot

see bug #8511

comment:26 in reply to: ↑ 25 Changed 3 years ago by lefticus

Replying to codeslingercompsalot:

see bug #8511

Reading through the other bug report, I don't think this one is the same. This one has a very characteristic degradation of responsiveness of the system. It is very bizarre. First one application in the guest will start ignoring keyboard input, but alt-tab will let you change apps, and a second application will still accept it. After another dozen or so keypresses / clicks the applications all refuse to respond, but the VM seems to still be running.

I've now narrowed it down specifically to the X11 drivers that ship with the guest additions. I've changed every aspect of the system except for performance related ones (that is: audio, lan, 'absolute pointing device', RAM). Finally, I removed the kernel modules but left the X11 drivers in.

Finally, after removing the X11 drivers, the system is fully stable.

I'm going to reinstall guest additions and this time only remove the X11 input driver. I'm 95% sure that that is where the error lies. That should allow us to at least narrow down this bug, and I'll cross-post the results to your bug, to see if that solution works for you as well.

comment:27 Changed 3 years ago by frank

Please retry with VBox 4.0.10. It contains some timer fixes which should affect especially Linux SMP guests.

comment:28 Changed 3 years ago by syntheticpp

It seems to be much better with 4.0.10!

But I will do more tests.

comment:29 Changed 3 years ago by frank

  • Status changed from new to closed
  • Resolution set to fixed

Please reopen if appropriate.
