[vbox-dev] questions about number of CPUs and host scheduling behaviour

Nikolay Igotti nikolay.igotti at oracle.com
Tue May 10 09:39:36 GMT 2011

Hi Bayard,

On 06.05.2011 19:56, Bayard Bell wrote:
> I've got 8 cores on my system, so I can hand it all over to the guests without sweating it. I'm looking at top, and I don't see any indication that other system load is contending. When I stop other apps running, it's only the amount of CPU idle time in the host that goes down, while the guest maintains the same level of CPU utilisation.
CPUs can't be "given" to the VM as easily as, say, RAM pages.
VirtualBox internally needs to run a few threads doing disk/network IO,
and the host OS is in the same situation, so essentially
some experimentation is the best way to figure out how many vCPUs it is
reasonable to give the guest for best performance.
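Such an experiment is easy to script. A minimal sketch (the VM name, the ssh-based build command, and the headless workflow are all assumptions for illustration, not anything VirtualBox prescribes):

```python
import subprocess
import time

def benchmark_vcpus(vm_name, cpu_counts, build_cmd, dry_run=False):
    """Time one guest build per vCPU count by reconfiguring the VM.

    build_cmd is assumed to run the build inside the guest, e.g. via ssh.
    With dry_run=True, just return the command lists instead of running them.
    """
    results = {}
    for cpus in cpu_counts:
        cmds = [
            ["VBoxManage", "modifyvm", vm_name, "--cpus", str(cpus)],
            ["VBoxManage", "startvm", vm_name, "--type", "headless"],
            list(build_cmd),
            ["VBoxManage", "controlvm", vm_name, "acpipowerbutton"],
        ]
        if dry_run:
            results[cpus] = cmds
            continue
        subprocess.run(cmds[0], check=True)   # VM must be powered off here
        subprocess.run(cmds[1], check=True)
        start = time.time()
        subprocess.run(cmds[2], check=True)   # the actual timed build
        results[cpus] = time.time() - start
        subprocess.run(cmds[3], check=True)
    return results
```

Running it over, say, 1/2/3/4 vCPUs and comparing the wall-clock times gives exactly the data this thread is speculating about.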

> The load I'm running is compilation. There shouldn't be a lot of system time, but the build system I'm using schedules a higher level of parallel jobs than there is CPU, using both CPU count and memory size to determine the maximum number of jobs. What nevertheless seems odd is that when the Solaris guest thinks it's got 3 or 4 threads on CPU, utilisation is half what I'd expect.
With compilation, especially if you compile a lot of small files, a
significant part of the load is fork/exec performance (and thus the VMM
in the guest), and of course
IO matters too.
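The fork/exec cost is easy to measure directly; a quick sketch, using /bin/true as a stand-in for the kind of short-lived process a build spawns per file:

```python
import subprocess
import time

def spawn_cost(n=100, cmd=("/bin/true",)):
    """Average wall-clock seconds per fork+exec of a trivial process."""
    start = time.time()
    for _ in range(n):
        subprocess.run(list(cmd), check=True)
    return (time.time() - start) / n

# Comparing this number measured natively against the same measurement
# inside the guest shows how much the VMM's page-table setup/teardown
# work inflates per-process overhead under SMP.
```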

> Now, I can imagine a variety of reasons for this, plenty of which I don't properly or at all understand, but looking at CPUPalette.app (I'm not aware of anything on OS X that approximates the functionality of mpstat), it looks like the load on the system is being spread evenly across CPUs.
That's pretty much expected.
>   My very naive reaction to this is that this isn't quite right, that VirtualBox should be trying to maintain processor affinity and pushing the CPU flat-out and not itself being subject to unnecessary additional SMP overhead, which is cumulative with the overhead of the guest.
It's up to the host OS scheduler to maintain (soft) affinity of threads
in whatever way it considers most reasonable. SMP overhead, such as the
need for TLB shootdowns, can't be cured by
forcing affinity; affinity only helps with reuse of CPU cache entries,
and only if some form of address-space ID is used (or if switches happen
inside the same address space).
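For comparison, on a Linux host explicit pinning looks like this (a sketch; os.sched_setaffinity is Linux-only, and OS X has no direct equivalent):

```python
import os

def pin_to_cpu(cpu):
    """Pin the calling process to a single CPU; return the resulting mask.

    This buys cache-entry reuse on that CPU, but does nothing about TLB
    shootdowns, which still interrupt every CPU that has mapped the
    affected address space.
    """
    os.sched_setaffinity(0, {cpu})   # 0 means the current process
    return os.sched_getaffinity(0)
```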

>   (My understanding is that the ability to create CPU affinity in OS X is a bit weak compared to Linux or Solaris [i.e. affinity is between threads and is meant to be defined by applications based on hw.cacheconfig and friends, whereas in Linux and Solaris it can be defined more strictly in terms of processors and processes].)
I don't think you really need that. As VBox doesn't do explicit gang
scheduling, some assistance from the host scheduler on that would be
helpful, rather than explicit assignment of
CPU affinity. In theory, a good scheduler should gang-schedule threads
that share the same address space even without additional hints, as this
will likely increase performance.
Not sure whether OS X does that, though.


> On 6 May 2011, at 11:17, Nikolay Igotti wrote:
>>    Hi Bayard,
>> The question is how you generate load in the guest, and what the real bottlenecks are. Generally, guest SMP maps to multiple
>> threads of execution for guest code, but mind that page table synchronization, device access locks, and other factors add
>> overhead that in the SMP case is sometimes more severe than it would be on the real box.
>> Also, if your box has just 4 CPUs, I wouldn't recommend assigning all of them to the guest.
>>   Thanks,
>>      Nikolay
>> Bayard Bell wrote:
>>> Anyone?
>>> On 23 Apr 2011, at 11:50, Bayard Bell wrote:
>>>> I've got an OpenSolaris guest that I'm using as a compile server, with Mac OS X Server as the host. I've assigned 4 CPUs to the guest, and the guest in fact sees 4 CPUs. From the host perspective, however, what I see is that the guest never ranges substantially above 200% (or 2 CPU) utilisation, even when the run queue is backed up and 4 processes appear to be on the CPU. I'm comparing the compile times to reference against other configurations, and what I'm seeing in VirtualBox leads me to believe that I'm being presented 4 CPUs but can't actually consume more than 2. I haven't made any apples-to-apples comparison yet, but this nevertheless seems to be able to keep the system running under load that can't be sustained with only 2 CPUs assigned, which seems to indicate that the benefits of assigning more than 2 CPUs may be more about reducing context switching and CPU migration overhead on the guest than providing the full benefit of increased compute resources (or, in other words, the benefit seems equivalent to providing hyperthreaded virtual CPUs rather than cores).
>>>> Is this expected behaviour? I've looked through the documentation and wasn't able to find any information on this. I'm running 4.0.6 and also saw this behaviour on 4.0.4.
