VirtualBox

Opened 14 years ago

Closed 12 years ago

#6936 closed defect (fixed)

High CPU consumption for multi-processor Windows guests

Reported by: Max Zinal Owned by:
Component: guest smp Version: VirtualBox 3.2.4
Keywords: Cc:
Guest type: Windows Host type: Linux

Description

Our install is:

  • 8-core 2-processor x86-based server with Xeon E5335 CPUs, 32 Gbytes of RAM.
  • Debian/GNU Linux 5 (Lenny) as a host system
  • Windows XP Pro SP3 as a guest system

We tried the following VirtualBox releases, with the same results: 3.1.8, 3.2.2, 3.2.4

We get high CPU consumption on the host (up to 80-120%, measured by top) even when our guest system is idle. The effect goes away when we switch the guest to a single CPU and turn off IO APIC (and move to a single-processor kernel, of course).

We do not see the same effect with exactly the same virtual machine on a notebook with an Intel Core i7 processor, running Windows 7. So this problem might be processor- or even operating system-specific.

One of the interesting side effects is that even when CPU usage on the guest is near maximum (we use a 4-core guest, and all cores are pretty busy at boot time), the CPU usage on the host is only about 170-180%.

The whole problem makes it pretty hard to use VirtualBox on that server, so we are abandoning our plans to buy a VBox license for that host (at least until we can find a workaround).

Attachments (5)

VBox.log (63.4 KB ) - added by Max Zinal 14 years ago.
VirtualBox logfile
cpuinfo (5.2 KB ) - added by Max Zinal 14 years ago.
CPU information for the server
1-2010-06-09-16-16-09.log (64.0 KB ) - added by Max Zinal 14 years ago.
VirtualBox log from Windows host system (on the same server)
CpuTests.zip (37.2 KB ) - added by Max Zinal 14 years ago.
VirtualBox logfiles from 3 test runs (1,2 and 4 cpus)
VBox-3.2.4-debug.log (56.0 KB ) - added by Max Zinal 14 years ago.
VBox.log from VirtualBox OSE 3.2.4 built in debug mode


Change History (24)

by Max Zinal, 14 years ago

Attachment: VBox.log added

VirtualBox logfile

by Max Zinal, 14 years ago

Attachment: cpuinfo added

CPU information for the server

comment:1 by Max Zinal, 14 years ago

This seems pretty much like tickets #6928, #6583, #6814, #6204, and, specifically, #4392.

It seems pretty obvious that there is some sort of design or implementation defect in VirtualBox multi-core guest support.

I should also add that we tried both 64-bit and 32-bit host systems (the latter using a kernel compiled with large memory support), with no difference at all.

comment:2 by Sander van Leeuwen, 14 years ago

Does it happen with a 1 CPU Windows XP guest with IO-APIC turned on?

Your CPU supports the extension required for properly dealing with IO-APIC overhead, so that can't be the problem (rules out #4392).

comment:3 by ToddAndMargo, 14 years ago

Just adding myself to the Cc: list

comment:4 by Sander van Leeuwen, 14 years ago

Component: VMM → guest smp

in reply to:  2 comment:5 by Max Zinal, 14 years ago

Replying to sandervl73:

> Does it happen with a 1 CPU Windows XP guest with IO-APIC turned on?
>
> Your CPU supports the extension required for properly dealing with IO-APIC overhead, so that can't be the problem (rules out #4392).

I will check that on Friday. I promise :)

For now I can add that VirtualBox 3.2.4 works just fine on that server when we installed Windows 2003 x64 Server Standard Edition on it as a host system (temporarily, just for a test). We see no load at all on host system when our guest system is idle (as expected, of course).

At the same time there are some strange benchmarking results: we have a relatively large RAR archive, and we tried to unpack it in several configurations:

  • inside the VM on a notebook with Core i7 and Windows 7
  • inside the VM on that server
  • inside the host system on a notebook
  • inside the host system on the server (Linux OS)
  • inside the host system on the server (Windows OS)

Here are the approximate timings, which look pretty strange to me:

  • VM / notebook: 95 seconds
  • VM / server: 81 seconds
  • notebook: 63 seconds
  • server (Linux): 16 seconds
  • server (Windows): 15 seconds

Perhaps this performance difference for archive unpack operation on a server and on a VM inside that server is somehow connected with high resource usage on Linux host?
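
To put the timings above in proportion, here is a rough sketch that computes the slowdown of each VM run relative to its own host. The numbers are copied from the list; the dictionary labels and the `slowdown` helper are only illustrative names.

```python
# Approximate unpack timings in seconds, copied from the list above.
timings = {
    "VM / notebook": 95,
    "VM / server": 81,
    "notebook": 63,
    "server (Linux)": 16,
    "server (Windows)": 15,
}

def slowdown(label, baseline):
    """Return how many times slower `label` is than `baseline`."""
    return timings[label] / timings[baseline]

if __name__ == "__main__":
    pairs = [
        ("VM / notebook", "notebook"),
        ("VM / server", "server (Linux)"),
    ]
    for vm, host in pairs:
        print(f"{vm} vs {host}: {slowdown(vm, host):.1f}x slower")
```

On these figures the VM on the notebook is about 1.5 times slower than its host, while the VM on the server is about 5 times slower than the Linux host.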

comment:6 by Sander van Leeuwen, 14 years ago

I very much doubt that. I hope you realize benchmarking in a VM is a lot more complicated than on a real machine. Did you run the unrar once or several times? Keep in mind that the dynamic disk image of the VM might be expanded during heavy file io. File expansion can be very expensive.

SATA vs IDE can make a difference as well as host cache on/off.

comment:7 by now, 14 years ago

The same happens with a 64-bit Ubuntu host and a 32-bit XP guest. Running strace on the VirtualBox process shows an endless loop of:

    read(17, 0x116a0a4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
    read(17, 0x116a0a4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
    read(30, 0x11fdbb4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
    poll([{fd=18, events=POLLIN}, {fd=25, events=POLLIN|POLLPRI}, {fd=27, events=POLLIN|POLLPRI}, {fd=28, events=POLLIN|POLLPRI}, {fd=29, events=POLLIN|POLLPRI}, {fd=30, events=POLLIN}, {fd=31, events=POLLIN}, {fd=32, events=POLLIN}, {fd=33, events=POLLIN}, {fd=17, events=POLLIN}, {fd=34, events=POLLIN}], 11, 0) = 0 (Timeout)
    read(17, 0x116a0a4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
    read(17, 0x116a0a4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
    read(30, 0x11fdbb4, 4096) = -1 EAGAIN (Resource temporarily unavailable)

Maybe this helps.
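
To see which descriptors are being polled in a tight loop, a captured trace like the one above can be summarized with a small script. This is only a sketch: it assumes the strace output has been saved as plain text with one call per line, and the function name and regular expression are illustrative, not part of any tool.

```python
import re
from collections import Counter

# Matches lines such as:
#   read(17, 0x116a0a4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
EAGAIN_READ = re.compile(r"^\s*read\((\d+),.*=\s*-1\s+EAGAIN\b")

def count_eagain_reads(lines):
    """Count read() calls that failed with EAGAIN, per file descriptor."""
    counts = Counter()
    for line in lines:
        match = EAGAIN_READ.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts

# Sample taken from the excerpt above (the poll() line is shortened).
sample_trace = """\
read(17, 0x116a0a4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
read(17, 0x116a0a4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
read(30, 0x11fdbb4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=18, events=POLLIN}], 11, 0) = 0 (Timeout)
read(17, 0x116a0a4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
read(17, 0x116a0a4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
read(30, 0x11fdbb4, 4096) = -1 EAGAIN (Resource temporarily unavailable)
"""

if __name__ == "__main__":
    for fd, n in count_eagain_reads(sample_trace.splitlines()).most_common():
        print(f"fd {fd}: {n} failed reads")
```

In the excerpt, descriptors 17 and 30 are the ones read repeatedly without data, and the poll() call uses a timeout of 0, so the loop never blocks.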

in reply to:  6 comment:8 by Max Zinal, 14 years ago

Replying to sandervl73:

> I very much doubt that. I hope you realize benchmarking in a VM is a lot more complicated than on a real machine.

I know that.

> Did you run the unrar once or several times? Keep in mind that the dynamic disk image of the VM might be expanded during heavy file io. File expansion can be very expensive.

I know that. Unpacking was done multiple times with approximately the same results. Disk image size was stable during tests.

> SATA vs IDE can make a difference as well as host cache on/off.

Host cache is turned on.

We use IDE drives for compatibility with other virtualization systems. If SATA/IDE matters so much in that case (a powerful host running a single relatively small guest), then I vote for a defect in IDE emulation.

by Max Zinal, 14 years ago

Attachment: 1-2010-06-09-16-16-09.log added

VirtualBox log from Windows host system (on the same server)

in reply to:  2 comment:9 by Max Zinal, 14 years ago

Replying to sandervl73:

> Does it happen with a 1 CPU Windows XP guest with IO-APIC turned on?

Here are the results from 3 tests with the same virtual machine:

  • with one virtual CPU
  • with two virtual CPUs
  • with four virtual CPUs

IO APIC was turned on for all tests.

All tests were performed while the guest system was near idle (1-2% CPU load according to Windows XP Task Manager). I changed the CPU count in the VM settings, restarted the VM, waited several minutes for things to settle down, and then captured the CPU usage counters.

Host CPU usage was measured using the 'top' utility, for the corresponding VBoxHeadless process, with a 5-second update interval.
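
For anyone wanting to reproduce the measurement without top, here is a minimal sketch in Python, assuming a Linux host where /proc/<pid>/stat is available. It samples the utime and stime counters of one process twice and converts the difference into a percentage of one core, which is roughly comparable to what top reports per process (it is not top's actual implementation). The helper names and the 5-second default are illustrative.

```python
import os
import sys
import time

def parse_stat_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is in parentheses and may contain spaces,
    # so split on the last ')' before tokenizing the remaining fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])

def cpu_percent(ticks_before, ticks_after, interval_s, clk_tck):
    """CPU usage over the interval, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / clk_tck / interval_s

def sample(pid, interval_s=5.0):
    """Measure the CPU usage of process `pid` over `interval_s` seconds."""
    clk_tck = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = parse_stat_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_stat_ticks(f.read())
    return cpu_percent(before, after, interval_s, clk_tck)

# Usage: python cpu_sample.py <pid of the VBoxHeadless process>
if __name__ == "__main__" and len(sys.argv) == 2:
    print(f"{sample(int(sys.argv[1])):.1f}%")
```

Values above 100% mean the process kept more than one host core busy during the interval, which is how figures such as 120% arise.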

In general, host CPU usage fluctuated around an average, with rare, relatively high peaks.

VCPU count   Host CPU usage, %   Host CPU peaks, %
1            4-8                 30
2            10-20               35
4            55-95               120

I don't know how to properly interpret these numbers, but I know very well that I can't start more than 3-5 *idle* 4-CPU VMs on that server. It makes me pretty sad :(

by Max Zinal, 14 years ago

Attachment: CpuTests.zip added

VirtualBox logfiles from 3 test runs (1,2 and 4 cpus)

comment:10 by Sander van Leeuwen, 14 years ago

Looks like the VT-x feature to reduce APIC overhead isn't working for some reason. The capability bit is set, but your measurements suggest it doesn't have the expected result. Quite strange. Do you have the latest BIOS installed for your server?

in reply to:  10 comment:11 by Max Zinal, 14 years ago

Replying to sandervl73:

> Looks like the VT-x feature to reduce APIC overhead isn't working for some reason. The capability bit is set, but your measurements suggest it doesn't have the expected result. Quite strange. Do you have the latest BIOS installed for your server?

We installed the latest firmware update package from Intel (2010-03-06). Still the same picture: idle guest, busy host.

Just for the record, here is the system description:

  • Motherboard: Intel S5000PAL
  • System: Intel SR2500 based
  • CPUs: 2 x Intel Xeon E5335, 2.0 GHz
  • Firmware revisions: BIOS: 99, BMC: 66, FRUSDR: 48

comment:12 by Max Zinal, 14 years ago

Perhaps I can collect some additional data for the analysis?

For now we have installed VMware Server 2.0.2, and its SMP support just works (although limited to two virtual CPUs).

comment:13 by Sander van Leeuwen, 14 years ago

I can send you a debug build (.run installer only though) or you could build OSE yourself. The debug statistics will show the cause of the performance problem.

by Max Zinal, 14 years ago

Attachment: VBox-3.2.4-debug.log added

VBox.log from VirtualBox OSE 3.2.4 built in debug mode

comment:14 by Max Zinal, 14 years ago

We have built a debug version of VirtualBox OSE 3.2.4, and I attached a log file from its run. I'm not sure it's useful for anything; IMHO there's nothing new in it compared to the old (non-debug) log.

It seems that high CPU usage on the host system is mostly caused by even a small amount of IO inside the virtual machine. Pure CPU-bound programs (like SuperPI or other benchmarks, both integer and floating point) seem to perform well, with almost no slowdown.

On the other hand, if some program in the virtual machine attempts to perform IO (e.g. WinRAR, file copy operations, etc.), then we see very slow performance inside the guest (file copy speed of about 2-3 MBytes/sec) and pretty high CPU load on the host.

Pretty strange.

comment:15 by ToddAndMargo, 14 years ago

I am not seeing anything in 3.2.8 that addresses this. Am I correct?

in reply to:  15 comment:16 by Max Zinal, 14 years ago

Replying to ToddAndMargo:

> I am not seeing anything in 3.2.8 that addresses this. Am I correct?

Yes, we see the same picture with 3.2.8.

comment:17 by Sander van Leeuwen, 13 years ago

You've added the wrong log file. The debug build creates another file in the directory where you launch the VM. That file contains the statistics that can shed some light on your problem.

comment:18 by Frank Mehnert, 12 years ago

Still relevant with VBox 4.1.6?

comment:19 by Frank Mehnert, 12 years ago

Resolution: fixed
Status: new → closed

No response, closing.
