Opened 12 years ago
Last modified 9 years ago
#10157 new defect
Adding vCPUs degrades network performance. — at Initial Version
| Reported by: | james little | Owned by: | |
|---|---|---|---|
| Component: | network | Version: | VirtualBox 4.1.8 |
| Keywords: | | Cc: | |
| Guest type: | other | Host type: | Linux |
Description
Observed on both Ubuntu 11.04 (2.6 kernel) and Mac OS X 10.6 hosts, both with quad core Intel i7 CPUs. Below is pasted from my forum post: https://forums.virtualbox.org/viewtopic.php?f=7&t=47136.
I decided to run some network throughput benchmarks (mainly interested in max packets per second), and have noticed a large discrepancy when assigning one CPU to a VM vs. 2 or more CPUs. My setup (all 64-bit):

- Host: Ubuntu 11.04, 2.6.38-13-generic #53-Ubuntu SMP, 3 GHz Intel i7 950 (4 physical cores + HT), 12 GB RAM, Ethernet: Intel 82574L Gigabit
- Guest: Ubuntu 11.10, 3.0.0-12-server kernel, 1 GB RAM assigned
- VirtualBox version: 4.1.8
Following are the results of the basic testing I performed using netserver/netperf against a bridged network interface (bridged to the Intel device above). The following commands were run on the guest against its local interface (not the loopback):

- netserver -4 (starts an IPv4 TCP/UDP server)
- netperf -H <IP_address_of_eth0> -t TCP_CRR (runs a TCP connect/request/response transaction benchmark)
- Single-CPU VM: ~17-18k transactions per second (TPS)
- 2-CPU VM: ~5k TPS
- 2-CPU VM with eth0 interrupts and netserver/netperf all pinned to the same core: ~7k TPS. Confirmed very low scheduling interrupts during the benchmark (watching /proc/interrupts).
- 2-CPU VM with the 2nd core disabled via hotplugging: ~8.5k TPS. Disabled the second CPU with echo '0' > /sys/devices/system/cpu/cpu1/online and confirmed via /proc/interrupts and other system tools.
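For anyone wanting to reproduce this, the guest-side steps above can be collected into a rough script. This is a sketch, not exactly what I ran: the IP address and IRQ number are placeholders you must fill in for your system, and the pinning and hotplug steps need root.

```shell
#!/bin/sh
# Sketch of the guest-side benchmark procedure described above.
# Assumptions (hypothetical defaults, adjust for your system):
#   ETH0_IP  - the guest's eth0 address
#   ETH0_IRQ - eth0's IRQ number, taken from /proc/interrupts
ETH0_IP="${ETH0_IP:-192.168.56.10}"
ETH0_IRQ="${ETH0_IRQ:-19}"

if command -v netperf >/dev/null 2>&1; then
    netserver -4                                  # start the IPv4 server

    # Baseline: TCP connect/request/response transactions.
    netperf -H "$ETH0_IP" -t TCP_CRR

    # Variant: pin eth0's interrupt and the benchmark to CPU 0 (needs root).
    echo 1 > "/proc/irq/$ETH0_IRQ/smp_affinity"   # mask 0x1 = CPU 0
    taskset -c 0 netperf -H "$ETH0_IP" -t TCP_CRR

    # Variant: hot-unplug the second vCPU and re-run (needs root).
    echo 0 > /sys/devices/system/cpu/cpu1/online
    netperf -H "$ETH0_IP" -t TCP_CRR
    RESULT=ran
else
    echo "netperf not installed; skipping"
    RESULT=skipped
fi
```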
Also worth noting that on the host system, the same test yields around 26k TPS. netfilter/conntrack is disabled on both host and guest.
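For reference, one way to confirm conntrack is out of the picture is to check for the module (a sketch, not necessarily how I verified it here; module names vary by kernel version):

```shell
#!/bin/sh
# Check whether a connection-tracking module is loaded (run on host and guest).
# nf_conntrack on 2.6.38/3.0-era kernels; ip_conntrack on much older ones.
if lsmod 2>/dev/null | grep -Eq '^(nf|ip)_conntrack'; then
    CONNTRACK="loaded"
else
    CONNTRACK="not loaded"
fi
echo "conntrack: $CONNTRACK"
```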
So even with the second CPU disabled I'm seeing around a 50% performance degradation vs. the single-CPU VM. The results with more than 2 CPUs were very similar to the 2-CPU scenario.
I would like to understand why this is the case (I'm certainly no virtualization expert); are additional extensions/emulations loaded when starting a multicore guest?
Update: I repeated the same test on an OS X 10.6 host on similar architecture (quad-core Intel i7), also on VBox 4.1.8, and the results were the same.
I decided to extend the test to something CPU-bound and ran a Linpack benchmark (single thread), but the results are unaffected by the number of vCPUs (which is good). I also ran a disk read benchmark using hdparm, and this was likewise unaffected, so this seems to be confined to network performance for now.
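The disk control test was along these lines (a sketch; the device name is an assumption, substitute the guest's actual disk, and hdparm needs root):

```shell
#!/bin/sh
# Sketch of the guest-side disk-read control benchmark.
# /dev/sda is an assumed device name; adjust for your guest.
DISK="${DISK:-/dev/sda}"
if command -v hdparm >/dev/null 2>&1 && [ -b "$DISK" ]; then
    hdparm -t "$DISK"    # -t: timed device reads (benchmark)
    DISK_TEST=ran
else
    echo "hdparm or $DISK not available; skipping"
    DISK_TEST=skipped
fi
```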
Attachment: VBox.log with 2 vCPUs configured.