[vbox-dev] questions about number of CPUs and host scheduling behaviour

Nikolay Igotti igotti at gmail.com
Fri May 27 09:55:28 GMT 2011


    Hi Bayard,

22.05.2011 15:13, Bayard Bell wrote:
> What I've got at this point is: if I try to do the compilation with 2 
> virtual CPUs, the system falls over. If I try to do the compilation 
> with 4-6 virtual CPUs, I never get significantly above 200% CPU 
> utilisation. After 6 CPUs or so, I start to see stability problems in 
> the host, possibly because Solaris tries to sync clocks across CPUs, 
> which shows up more clearly if you build a debug kernel off the below. 
> Here's the info on the workload I've been trying most frequently:
>
> https://www.illumos.org/projects/illumos-gate/wiki/How_To_Build_Illumos
>
> Talking about it with other developers, the feedback is that VBox has 
> problems where VMWare doesn't. Could you give a go and let me know 
> what kinds of result you see?
  Sorry, but I couldn't do much with profiling/debugging, as I no longer 
work on VBox full-time. My general thinking, though, is that guest SMP may 
have some performance issues, some of them definitely bound to host 
scheduler behaviour. One trick I can offer is to share a folder among 
multiple 1-vCPU Solaris VMs and deploy some form of build distributed 
across the VMs. That could work around some of the performance issues.
Another approach is to try a different host OS.
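A minimal sketch of that distributed-build trick: each 1-vCPU guest mounts the same VirtualBox shared folder and builds one subtree. The VM hostnames (vm1..vm3), the /build mount point, and the directory split are illustrative assumptions, not details from this thread, so the script only prints the commands it would run:

```shell
#!/bin/sh
# Hypothetical fan-out of a build across several 1-vCPU guests that all
# mount the same VirtualBox shared folder at /build.  VM names and the
# subtree split are placeholders; commands are printed, not executed.
cmds=""
for pair in vm1:lib vm2:cmd vm3:uts; do
  vm=${pair%%:*}                 # guest to run on
  dir=${pair#*:}                 # subtree it builds
  cmd="ssh $vm 'cd /build/$dir && make'"
  cmds="$cmds$cmd
"
  echo "$cmd"                    # drop the echo (or pipe to sh) to run
done
```

Backgrounding each ssh would run the three makes concurrently, giving a crude distributed build without relying on guest SMP at all.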

    Thanks,
      Nikolay


> On 11 May 2011, at 12:29, Nikolay Igotti wrote:
>
>> Bayard Bell wrote:
>>>> CPU isn't as easily "given" to the VM as, for example, RAM pages. 
>>>> VirtualBox internally needs to run a few threads doing disk/network 
>>>> IO, and the same goes for the host OS, so essentially some 
>>>> experimentation is the best way to figure out how many vCPUs it is 
>>>> reasonable to give the guest for best performance.
>>>>
>>>
>>> Any suggestions as to how to go about that methodically?
>> Well, just try some representative subset (10-20 minutes of 
>> compilation) with 1, 2, 3, 4... vCPUs and compare the results :).
>> This could easily be automated with vboxshell and the guest command 
>> execution facility.
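That sweep can also be scripted with plain VBoxManage instead of vboxshell. The VM name "solaris-build" and the list of vCPU counts below are assumptions, and the plan is printed rather than executed:

```shell
#!/bin/sh
# Print the command plan for timing a short build slice at each vCPU
# count.  "solaris-build" is a placeholder VM name; pipe the output
# through sh to actually run it.
sweep() {
  vm=$1
  for n in 1 2 3 4 6 8; do
    echo "VBoxManage modifyvm $vm --cpus $n"
    echo "VBoxManage startvm $vm --type headless"
    echo "# time a 10-20 minute representative build subset in the guest here"
    echo "VBoxManage controlvm $vm poweroff"
  done
}
sweep solaris-build
```

Recording the elapsed time of the same build slice at each setting shows directly where adding vCPUs stops paying off.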
>>
>>
>>> What I know is that the run queue seems to back up to the point of 
>>> crushing the host if I provide only two vCPUs, while with 4 vCPUs, I 
>>> only seem to get consumption of 2 actual CPUs. I've got a slight 
>>> further wrinkle, insofar as the default behaviour of the build 
>>> environment is to look at the number of CPUs and amount of memory 
>>> and decide for itself what the appropriate level of parallelism is, 
>>> although I can work around this by setting a fixed value before 
>>> experimenting with CPU count. Just to give this a bottom line, if I 
>>> haven't mentioned this previously: I've got a compile job that 
>>> normally takes at most few hours on comparable bare metal, and it's 
>>> taking several days under VBox. Resolving this is the difference 
>>> between being able to get acceptably slower performance under VBox 
>>> and needing to sort myself out with a separate system.
>>>
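Before sweeping vCPU counts, the auto-detected parallelism described above can be pinned so every run issues the same number of jobs. DMAKE_MAX_JOBS is the knob the illumos nightly environment files use; if your build reads a different variable, substitute it (the value 4 is an arbitrary example):

```shell
#!/bin/sh
# Pin build parallelism so it no longer tracks the guest's vCPU count.
# DMAKE_MAX_JOBS matches the illumos env-file convention; the value 4
# is an arbitrary example, not a recommendation.
DMAKE_MAX_JOBS=4
export DMAKE_MAX_JOBS
echo "build parallelism pinned at $DMAKE_MAX_JOBS jobs"
```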
>> Is the project you're compiling open source? That could make analysis 
>> simpler.
>>
>>>>
>>>> With compilation, especially if you compile a lot of small files, 
>>>> significant part of load is fork/exec performance (and so, VMM in 
>>>> the guest), and of course, IO does matter too.
>>>>
>>>
>>> The I/O is trivial, but what I'm gathering is that the CPU overhead 
>>> of the system calls is increased considerably. I don't see a lot of 
>>> fork and exec load, but what I'm wondering is whether time spent in 
>>> the kernel would actually be relatively longer, such that relatively 
>>> lightweight system calls on a normal host would add up to a 
>>> considerably higher percentage of CPU time in a virtual environment.
>>>
>> Syscalls per se aren't affected much by virtualization, but the 
>> privileged operations they perform sometimes are.
>> Generally, this needs deeper analysis, and you may want to try running 
>> the same guest on a different host OS (ideally on the same hardware) 
>> to see whether any host specifics show up.
>>
>> Also, I'm not sure OS X is the best OS for running SMP loads in general.
>>
>>>> I don't think you really need that. Since VBox doesn't do explicit 
>>>> gang scheduling, some assistance from the host scheduler would be 
>>>> helpful, rather than explicit assignment of CPU affinity. In theory, 
>>>> a good scheduler should gang-schedule threads sharing the same 
>>>> address space even without additional hints, as that will likely 
>>>> increase performance. I'm not sure whether OS X does that, though.
>>>>
>>>
>>> Thanks for that info. I'll see if there's any documentation or 
>>> source to satisfy my curiosity on this point. It might also be 
>>> useful to see what DTrace can tell me. Does VBox have its own DTrace 
>>> probes to help with these kinds of problems?
>>>
>>>
>>  I don't think VBox has many probes of its own, but even OS-level 
>> traces could be sufficiently useful.
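For example, a couple of host-side DTrace one-liners can show where the VM process burns CPU and how its threads move across host CPUs. The profile and sched providers are standard, but the VirtualBox process name used here is a guess (verify it with ps), and dtrace needs root, so the commands are only printed:

```shell
#!/bin/sh
# Host-side DTrace sketches.  Providers (profile, sched) are standard;
# the VirtualBox process name is an assumption -- check it with ps first.
D_ONCPU='profile-997 /execname == "VirtualBoxVM"/ { @[tid] = count(); }'
D_OFFCPU='sched:::off-cpu /execname == "VirtualBoxVM"/ { @[cpu] = count(); }'
for d in "$D_ONCPU" "$D_OFFCPU"; do
  echo "dtrace -n \"$d\""       # run as root, without the echo, to trace
done
```

The first aggregates on-CPU samples per VM thread; the second counts how often VM threads leave each host CPU, which hints at migration and preemption behaviour.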
>>
>>
>>  Nikolay
>>
>
>
> _______________________________________________
> vbox-dev mailing list
> vbox-dev at virtualbox.org
> http://vbox.innotek.de/mailman/listinfo/vbox-dev


