VirtualBox

Opened 8 years ago

Last modified 8 years ago

#15557 new defect

Solaris + SMP: Performance decreases as you add processors to guest

Reported by: KSJohn
Owned by:
Component: guest smp
Version: VirtualBox 5.0.22
Keywords: smp performance slow ioctl vboxdrv device
Cc:
Guest type: Windows
Host type: Solaris

Description

With VirtualBox 5.0.x, a VM with 1-2 processors assigned performs well, but the guest becomes slower with every processor added. I've gone up to 16 processors, at which point the guest is very slow to boot and extremely laggy after logging in (even Task Manager stopped responding).

For a controlled test, I used the same host and guest and varied only the VirtualBox version and the number of processors assigned to the guest.

The host:

  • O/S: Solaris 11.2.15.4.0
  • Hardware: Oracle X4-2L
  • CPUs: 2 x E5-2690 v2, each with 10 cores and 20 threads (total of 20 cores, 40 threads)
  • RAM: 512GB

The guest:

Below are the boot times for the guest VM, in minutes:seconds (boot is considered complete when the Start button appears):

VirtualBox 4.3.32:

  • 4 CPU: 0:24
  • 10 CPU: 0:30
  • 16 CPU: 1:06

VirtualBox 4.3.38:

  • 4 CPU: 0:23
  • 10 CPU: 0:30
  • 16 CPU: 0:48

VirtualBox 5.0.4:

  • 4 CPU: 0:22
  • 10 CPU: 1:44
  • 16 CPU: 3:36

VirtualBox 5.0.22:

  • 4 CPU: 0:21
  • 10 CPU: 1:40
  • 16 CPU: 4:15
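To make the scaling easier to compare across versions, the timings above can be converted to seconds and expressed relative to each version's own 4-CPU boot time. This is only a sketch for reading the numbers; the helper names `to_seconds` and `slowdown_vs_4cpu` are made up for illustration, and the data is copied from the lists above.

```python
# Boot times copied from the lists above (minutes:seconds), keyed by CPU count.
BOOT_TIMES = {
    "4.3.32": {4: "0:24", 10: "0:30", 16: "1:06"},
    "4.3.38": {4: "0:23", 10: "0:30", 16: "0:48"},
    "5.0.4": {4: "0:22", 10: "1:44", 16: "3:36"},
    "5.0.22": {4: "0:21", 10: "1:40", 16: "4:15"},
}


def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def slowdown_vs_4cpu(times: dict) -> dict:
    """Boot time at each CPU count divided by the 4-CPU boot time."""
    base = to_seconds(times[4])
    return {cpus: round(to_seconds(t) / base, 2) for cpus, t in times.items()}


if __name__ == "__main__":
    for version, times in BOOT_TIMES.items():
        ratios = slowdown_vs_4cpu(times)
        print(version, ", ".join(f"{c} CPU: {r}x" for c, r in ratios.items()))
```

By this measure the 4.3.x builds take roughly 2 to 3 times as long to boot with 16 CPUs as with 4, while 5.0.22 takes about 12 times as long.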

I'm not 100% sure that this is the cause of the issue, but using the "truss", "pstack" and "pfiles" commands, I found several threads performing a tremendous number of ioctl() system calls against the /dev/vboxdrv device. Here are some of the stack traces:

Stack Trace 1:

     ffff80ffbf51edba ioctl    (13, ffffffffc0185687, ffff80ffbe13ddc0)
     ffff80ffa385df0f SUPR3CallVMMR0Ex () + 30f
     ffff80ffa0accad3 _Z19vmR3HaltGlobal1HaltP6UVMCPUjm () + 1e3
     ffff80ffa0acd004 VMR3WaitHalted () + f4
     ffff80ffa0a62d66 EMR3ExecuteVM () + 656
     ffff80ffa0acbbcf _Z25vmR3EmulationThreadWithIdP11RTTHREADINTP6UVMCPUj () + 12f
     ffff80ffa37d8aac rtThreadMain () + 2c
     ffff80ffa38557c1 _Z18rtThreadNativeMainPv () + 51
     ffff80ffbf515ef5 _thrp_setup () + a5
     ffff80ffbf5161a0 _lwp_start ()

Stack Trace 2:

     ffff80ffbf51edba ioctl    (13, ffffffffc0185698, ffff80ffbcbceda0)
     ffff80ffa3862442 SUPSemEventWaitNoResume () + 82
     ffff80ff9fd3d440 _Z15ahciAsyncIOLoopP9PDMDEVINSP9PDMTHREAD () + 330
     ffff80ffa0a87aea _Z15pdmR3ThreadMainP11RTTHREADINTPv () + 6a
     ffff80ffa37d8aac rtThreadMain () + 2c
     ffff80ffa38557c1 _Z18rtThreadNativeMainPv () + 51
     ffff80ffbf515ef5 _thrp_setup () + a5
     ffff80ffbf5161a0 _lwp_start ()

Stack Trace 3:

     ffff80ffbf51edba ioctl    (13, ffffffffc0185698, ffff80ffbc2fcc60)
     ffff80ffa3862442 SUPSemEventWaitNoResume () + 82
     ffff80ffa0b45828 _Z29pdmR3R0CritSectEnterContendedP11PDMCRITSECTmPK15RTLOCKVALSRCPOS () + 98
     ffff80ffa0b45914 PDMCritSectEnter () + a4
     ffff80ffa0a98424 PGMR3PhysReadExternal () + 34
     ffff80ffa0a7a4e8 _Z20pdmR3DevHlp_PhysReadP9PDMDEVINSmPvm () + 78
     ffff80ff9fd0b4f4 _Z17ohciR3ThreadFrameP9PDMDEVINSP9PDMTHREAD () + 254
     ffff80ffa0a87aea _Z15pdmR3ThreadMainP11RTTHREADINTPv () + 6a
     ffff80ffa37d8aac rtThreadMain () + 2c
     ffff80ffa38557c1 _Z18rtThreadNativeMainPv () + 51
     ffff80ffbf515ef5 _thrp_setup () + a5
     ffff80ffbf5161a0 _lwp_start ()
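For anyone reading the traces above, here is a small sketch that pulls the plain function name out of the simpler C++ mangled symbols and splits the ioctl request argument into its fields. Two assumptions: the symbols use the Itanium `_Z<length><name>` form (nested names are returned unchanged), and the request word follows the field layout of the `_IOWRN` macro in the Solaris `<sys/ioccom.h>` header. The helper names are hypothetical, and no attempt is made to map the request numbers to specific VirtualBox operations.

```python
import re

# Matches simple Itanium-mangled symbols of the form _Z<length><name>...
_MANGLED = re.compile(r"^_Z(\d+)")


def function_name(symbol: str) -> str:
    """Return the bare function name of a simple mangled symbol.

    Symbols that are not of the _Z<length><name> form are returned unchanged.
    """
    match = _MANGLED.match(symbol)
    if not match:
        return symbol
    length = int(match.group(1))
    start = match.end()
    return symbol[start:start + length]


def decode_ioctl_request(raw: int) -> dict:
    """Split an ioctl request word into fields.

    Assumed layout (Solaris <sys/ioccom.h>): direction flags in the top bits,
    parameter size in bits 16-23, group character in bits 8-15, number in
    bits 0-7. pstack shows the 32-bit value sign-extended to 64 bits.
    """
    req = raw & 0xFFFFFFFF
    return {
        "in": bool(req & 0x80000000),
        "out": bool(req & 0x40000000),
        "param_size": (req >> 16) & 0xFF,
        "group": chr((req >> 8) & 0xFF),
        "number": req & 0xFF,
    }


if __name__ == "__main__":
    for sym in ("_Z19vmR3HaltGlobal1HaltP6UVMCPUjm",
                "_Z15ahciAsyncIOLoopP9PDMDEVINSP9PDMTHREAD",
                "_Z29pdmR3R0CritSectEnterContendedP11PDMCRITSECTmPK15RTLOCKVALSRCPOS"):
        print(function_name(sym))
    for request in (0xFFFFFFFFC0185687, 0xFFFFFFFFC0185698):
        print(hex(request), decode_ioctl_request(request))
```

Under that assumed layout, both request values decode to group 'V' with a 24-byte parameter and differ only in the low byte (0x87 in the first trace, 0x98 in the second and third), so the two semaphore waits use the same request while the halt path uses a different one.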

Currently, this issue prevents us from using 5.0.x at all.

Attachments (1)

VBox.log.gz (28.7 KB ) - added by KSJohn 8 years ago.
Log file of v 5.0.22, with 12 processors assigned to the guest


Change History (3)

by KSJohn, 8 years ago

Attachment: VBox.log.gz added

Log file of v 5.0.22, with 12 processors assigned to the guest

comment:1 by KSJohn, 8 years ago

I just saw that 5.0.24 came out yesterday. I ran the same test with 5.0.24 and got significantly better results, but above 10 processors it is still far slower than 4.3.38:

  • 4 CPU: 0:21
  • 10 CPU: 0:28
  • 12 CPU: 1:59
  • 16 CPU: 2:54

comment:2 by KSJohn, 8 years ago

Today, we downloaded 5.1.6, and ran the tests again. Below are the results:

  • 4 CPU: 0:21
  • 10 CPU: 0:57
  • 12 CPU: 1:36
  • 16 CPU: 7:22

Even at 10 CPUs, the guest VM seems really slow compared to a lower number of CPUs.
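To put the later results next to the 4.3.38 numbers from the description, the sketch below divides each newer version's boot time by the 4.3.38 boot time at the same CPU count. The 12-CPU runs are skipped because 4.3.38 was not timed at 12 CPUs. The function name `factor_vs_baseline` is made up for illustration; all timings are copied from this ticket.

```python
# 4.3.38 boot times from the description (minutes:seconds), keyed by CPU count.
BASELINE_4_3_38 = {4: "0:23", 10: "0:30", 16: "0:48"}

# 5.x boot times from the description and the two comments.
NEWER = {
    "5.0.22": {4: "0:21", 10: "1:40", 16: "4:15"},
    "5.0.24": {4: "0:21", 10: "0:28", 12: "1:59", 16: "2:54"},
    "5.1.6": {4: "0:21", 10: "0:57", 12: "1:36", 16: "7:22"},
}


def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def factor_vs_baseline(times: dict, baseline: dict) -> dict:
    """Boot time relative to the baseline, for CPU counts present in both."""
    return {
        cpus: round(to_seconds(t) / to_seconds(baseline[cpus]), 1)
        for cpus, t in times.items()
        if cpus in baseline
    }


if __name__ == "__main__":
    for version, times in NEWER.items():
        factors = factor_vs_baseline(times, BASELINE_4_3_38)
        print(version, ", ".join(f"{c} CPU: {f}x" for c, f in factors.items()))
```

At 16 CPUs this gives about 5.3x for 5.0.22, 3.6x for 5.0.24 and 9.2x for 5.1.6 relative to 4.3.38, while all three are slightly faster than 4.3.38 at 4 CPUs.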
