[vbox-dev] HGCM calls slow on multicore guest (linux)
michal.necasek at oracle.com
Tue Jun 9 16:38:07 UTC 2015
In general, some slowdown is expected. It's normal that scheduling in
SMP systems has higher latency. The short story is that
context-switching back and forth on a single CPU/core is faster than
waking up another CPU, going to sleep, and being woken up again when the
work is done.
What exactly is going on, I don't know. It might be interesting to
know whether the delay is introduced between initiating the request on
the guest side and executing it on the host, or between executing it on
the host and returning control back to the guest again.
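A rough way to start narrowing this down on the guest side is to collect min/mean/max round-trip times over many calls rather than looking at single samples. The following is only a sketch under assumptions: the names lat_stats, lat_add, lat_measure and call_fn are made up for illustration and are not part of VBoxGuestLib, and the actual HGCM call has to be supplied by the caller. It uses clock_gettime(CLOCK_MONOTONIC), as in the original measurements.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical accumulator for round-trip latencies, in nanoseconds. */
typedef struct {
    uint64_t min_ns;
    uint64_t max_ns;
    uint64_t total_ns;
    unsigned count;
} lat_stats;

/* Hypothetical callback type: wraps whatever call is being timed. */
typedef int (*call_fn)(void *ctx);

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static void lat_init(lat_stats *s)
{
    s->min_ns = UINT64_MAX;
    s->max_ns = 0;
    s->total_ns = 0;
    s->count = 0;
}

/* Record one sample. */
static void lat_add(lat_stats *s, uint64_t ns)
{
    if (ns < s->min_ns) s->min_ns = ns;
    if (ns > s->max_ns) s->max_ns = ns;
    s->total_ns += ns;
    s->count++;
}

/* Mean of all samples, or 0 if none were recorded. */
static uint64_t lat_mean_ns(const lat_stats *s)
{
    return s->count ? s->total_ns / s->count : 0;
}

/* Time `iterations` invocations of fn(ctx) and accumulate the results. */
static void lat_measure(lat_stats *s, call_fn fn, void *ctx, unsigned iterations)
{
    assert(fn != NULL);
    for (unsigned i = 0; i < iterations; i++) {
        uint64_t t0 = now_ns();
        (void)fn(ctx);
        lat_add(s, now_ns() - t0);
    }
}

static void lat_print(const lat_stats *s, const char *label)
{
    printf("%s: n=%u min=%.1f us mean=%.1f us max=%.1f us\n",
           label, s->count,
           s->count ? s->min_ns / 1000.0 : 0.0,
           lat_mean_ns(s) / 1000.0,
           s->max_ns / 1000.0);
}
```

To use it, wrap the guest-side HGCM call in a function matching call_fn, call lat_init() once, then lat_measure() and lat_print() for each VCPU count. If the host service also reports how long svcCall took, subtracting that from the guest round trip gives the combined transition overhead. Splitting that overhead into the guest-to-host and host-to-guest halves needs a time base shared by host and guest, which this sketch does not provide.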
As for copying vs. page lists... that's something you'd have to
benchmark on specific hardware. 1K is small enough that there's probably
no clear-cut answer.
And another note: Trying to run real-time workloads in a VM is
treading on very thin ice. There are no latency guarantees and the VM is
subject to host scheduling like any other application. Though I assume
you're aware of that already.
On 6/8/2015 4:15 PM, Höhn Thomas wrote:
> I wrote a VBox host-service for my company (in use for 2 years now) using the HGCM API to exchange data (1K-64K in size) between host and guest. We chose HGCM since it's the only host/guest API provided so far and we don't want to have to make changes for later VBox versions. (Note: data exchange via network, e.g. a host-only interface, is not an option for our use case.)
> The host service on the Linux guest is written as a library using VBoxGuestLib calls. For the Windows 7 host I built an ExtensionPack (to load the HGCM module at runtime, since VBox 4.3.0). I also managed DLL signing on the Windows host (separate story), which runs fine for VBox > 4.3.14. The Linux guest runs a PREEMPT RT kernel.
> The problem for us now is that we will need near "real-time" host/guest communication in the future due to a customer request.
> With only one core for the Linux VM, the guest-side host calls have a reasonable (monotonic) execution time of 40-160 us when measured with clock_gettime.
> But if I assign >= 2 cores to the VM, the HGCM calls on the guest take significantly more time, e.g. for 2 cores the guest-side host call time rises to 200 us - 1 ms. For 4 or 6 cores (on a true 6-core CPU) it's even worse. This degrades our host/guest communication speed badly (running in a < 20 ms task cycle).
> On the host side, the host calls (measured in svcCall) show a fixed execution time, which makes sense, as they do not depend on the number of guest VM cores. So the timing issue is on the guest side only.
> I made ftrace measurements on the guest to record the syscalls. I can see the sys_ioctl start and end and what happens in between, but it doesn't tell me why it takes so long.
> For the HGCM data (mostly ~1K size) I use type VMMDevHGCMParmType_LinAddr_In and LinAddr_Out. In HGCMInternal.cpp there is a lot of copying for this data type. Should I use another, e.g. PageList (with less copy overhead)? What is LinAddr_Lock_In/Out for?
> If anyone has an idea why HGCM calls get slower with more cores and possibly how to fix that or how to make the HGCM calls faster I would be grateful.
> Thanks for advice,
> Thomas Höhn
> DR. JOHANNES HEIDENHAIN GmbH
> 83301 Traunreut, Deutschland
> Registergericht: Traunstein / Registry Court: HRB 275 - Sitz / Head Office: Traunreut
> Aufsichtsratsvorsitzender / Chairman of Supervisory Board: Rainer Burkhard
> Geschäftsführung / Management Board: Thomas Sesselmann (Vorsitzender / Chairman),
> Michael Grimm, Hubert Ermer
> E-Mail Haftungsausschluss / E-Mail Disclaimer: http://www.heidenhain.de/disclaimer