[vbox-dev] questions about number of CPUs and host scheduling behaviour
Nikolay Igotti
igotti at gmail.com
Fri May 27 02:55:28 PDT 2011
Hi Bayard,
22.05.2011 15:13, Bayard Bell wrote:
> What I've got at this point is: if I try to do the compilation with 2
> virtual CPUs, the system falls over. If I try to do the compilation
> with 4-6 virtual CPUs, I never get significantly above 200% CPU
> utilisation. After 6 CPUs or so, I start to see stability problems in
> the host, possibly because Solaris tries to sync clocks across CPUs,
> which shows up more clearly if you build a debug kernel off the below.
> Here's the info on the workload I've been trying most frequently:
>
> https://www.illumos.org/projects/illumos-gate/wiki/How_To_Build_Illumos
>
> Talking about it with other developers, the feedback is that VBox has
> problems where VMware doesn't. Could you give it a go and let me know
> what kind of results you see?
Sorry, but I can't do much profiling or debugging, as I no longer work
on VBox full-time. The general thinking, though, is that guest SMP may
have some performance issues, and some of them are definitely bound to
host scheduler behaviour. One trick I could offer is to have a shared
folder shared among multiple 1-vCPU Solaris VMs and deploy some form of
build distributed across those VMs. This could work around some of the
performance issues.
Another approach is to try a different host OS.
Thanks,
Nikolay
> On 11 May 2011, at 12:29, Nikolay Igotti wrote:
>
>> Bayard Bell wrote:
>>>> CPU isn't as easily "given" to the VM as RAM pages are, for example.
>>>> VirtualBox internally needs to run a few threads doing disk/network
>>>> IO. The host OS is in the same situation, so essentially some
>>>> experimentation is the best way to figure out how many vCPUs it is
>>>> reasonable to give the guest to get the best performance.
>>>>
>>>
>>> Any suggestions as to how to go about that methodically?
>> Well, just try some representative subset (10-20 mins of compilation)
>> with 1,2,3,4... vCPUs and see the result :).
>> This could easily be automated with vboxshell and the guest command
>> execution facility.
>>
>>
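The sweep suggested above could be sketched roughly as follows. This is
only a minimal sketch under stated assumptions, not a tested benchmark
harness: it drives the VBoxManage command line rather than vboxshell,
and `sweep_vcpus`, `run_cmd` and the `start_build` hook are hypothetical
names introduced here for illustration. The hook is assumed to wait for
the guest to come up and to run the representative build subset (over
ssh or guest control, for instance) with the build's parallelism pinned,
so that the build environment does not pick its own job count.

```python
import subprocess
import time
from typing import Callable, Dict, Iterable, List


def run_cmd(cmd: List[str]) -> None:
    """Run a host command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


def sweep_vcpus(
    vm: str,
    counts: Iterable[int],
    start_build: Callable[[int], None],
    runner: Callable[[List[str]], None] = run_cmd,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[int, float]:
    """Time the same build once per vCPU count; returns {vcpus: seconds}.

    start_build(n) is a caller-supplied (hypothetical) hook that blocks
    until the build inside the guest has finished. Note that the timing
    therefore includes whatever guest boot wait the hook performs.
    """
    results: Dict[int, float] = {}
    for n in counts:
        # The vCPU count can only be changed while the VM is powered off.
        runner(["VBoxManage", "modifyvm", vm, "--cpus", str(n)])
        runner(["VBoxManage", "startvm", vm, "--type", "headless"])
        try:
            t0 = clock()
            start_build(n)
            results[n] = clock() - t0
        finally:
            # Hard power-off so the next iteration can reconfigure the VM.
            # A real script may need to wait here until the VM session has
            # fully closed before calling modifyvm again.
            runner(["VBoxManage", "controlvm", vm, "poweroff"])
    return results
```

Calling something like `sweep_vcpus("solaris-build", [1, 2, 3, 4], my_build)`
(both the VM name and `my_build` are placeholders) would then give one
wall-clock figure per vCPU count to compare.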
>>> What I know is that the run queue seems to back up to the point of
>>> crushing the host if I provide only two vCPUs, while with 4 vCPUs, I
>>> only seem to get consumption of 2 actual CPUs. I've got a slight
>>> further wrinkle, insofar as the default behaviour of the build
>>> environment is to look at the number of CPUs and amount of memory
>>> and decide for itself what the appropriate level of parallelism is,
>>> although I can work around this by setting a fixed value before
>>> experimenting with CPU count. Just to give this a bottom line, if I
>>> haven't mentioned this previously: I've got a compile job that
>>> normally takes at most a few hours on comparable bare metal, and it's
>>> taking several days under VBox. Resolving this is the difference
>>> between being able to get acceptably slower performance under VBox
>>> and needing to sort myself out with a separate system.
>>>
>> Is the project you're compiling open source? This could make analysis
>> simpler.
>>
>>>>
>>>> With compilation, especially if you compile a lot of small files,
>>>> a significant part of the load is fork/exec performance (and so,
>>>> VMM in the guest), and of course, IO does matter too.
>>>>
>>>
>>> The I/O is trivial, but what I'm gathering is that the CPU overhead
>>> of the system calls is increased considerably. I don't see a lot of
>>> fork and exec load, but what I'm wondering is whether time spent in
>>> the kernel would actually be relatively longer, such that relatively
>>> lightweight system calls on a normal host would add up to a
>>> considerably higher percentage of CPU time in a virtual environment.
>>>
>> Syscalls per se aren't affected much by virtualization, but the
>> privileged operations they perform sometimes are.
>> Generally, this needs deeper analysis, and you may want to try running
>> the same guest on a different host OS (ideally on the same hardware)
>> to see whether some host-specific behaviour is involved.
>>
>> Also, I'm not sure OSX is the best OS for running SMP loads in general.
>>
>>>> Don't think you really need that. As VBox doesn't do explicit gang
>>>> scheduling, some assistance from the host scheduler on that would
>>>> be helpful, rather than explicit assignment of CPU affinity. In
>>>> theory, a good scheduler should gang-schedule threads with the same
>>>> address space even without additional hints, as this will likely
>>>> increase performance. Not sure if OSX does that, though.
>>>>
>>>
>>> Thanks for that info. I'll see if there's any documentation or
>>> source to satisfy my curiosity on this point. It might also be
>>> useful to see what DTrace can tell me. Does VBox have its own DTrace
>>> probes to help with these kinds of problems?
>>>
>>>
>> Don't think VBox has many probes of its own, but even OS traces
>> could be sufficiently useful.
>>
>>
>> Nikolay
>>