VirtualBox

Ticket #9756 (new defect)

Opened 3 years ago

Last modified 6 months ago

delay loop calibration failed for 2.6.32 guest kernels on Intel Atom CPU 330

Reported by: rschmied Owned by:
Priority: major Component: other
Version: VirtualBox 4.1.4 Keywords:
Cc: Guest type: Linux
Host type: Linux

Description (last modified by frank)

Upgrading the guest from CentOS 5 to CentOS 6 changes the kernel from 2.6.18 to 2.6.32, which breaks the timer calibration and makes the system almost unusable. Other distros based on 2.6.32 or newer are affected as well.

There's no VT-x on the Atom, so HPET is turned off. However, I can also observe the effect on a Mac Mini w/ VirtualBox 4. The impact is less dramatic there because the Core2Duo in the Mac provides more performance to begin with, but the delay loop calibration on the Mac is also off (it does not show the value that 2.6.18 shows).

I'm running VirtualBox 4.1.4 on 64-bit Ubuntu 10.04 w/ ~4 GB of DRAM and an Intel Atom 330 dual-core CPU w/ HT.

Please see the additional log files with cpuinfo etc on the virtual and the physical systems.

On 2.6.32 for the guest:

Fast TSC calibration failed
TSC: Unable to calibrate against PIT
TSC: HPET/PMTIMER calibration failed.
Marking TSC unstable due to could not calculate TSC khz
Calibrating delay loop... 293.88 BogoMIPS (lpj=146944)
Initializing cgroup subsys cpuacctmce: CPU supports 0 MCE banks
Performance Events: unsupported p6 CPU model 28 no PMU driver, software events only.
weird, boot CPU (#0) not listed by the BIOS.
APIC calibration not consistent with PM-Timer: 247ms instead of 100ms
APIC delta adjusted to PM-Timer: 6254153 (15508853)
Total of 1 processors activated (293.88 BogoMIPS).
Switching to clocksource jiffies
Switching to clocksource acpi_pm
hrtimer: interrupt took 3406704 ns

On 2.6.18 for the guest:

Calibrating delay loop (skipped), value calculated using timer frequency.. 3200.05 BogoMIPS (lpj=1600025)
Intel machine check reporting enabled on CPU#0.
CPU0: Intel(R) Atom(TM) CPU  330   @ 1.60GHz stepping 02
Using local APIC timer interrupts.
WARNING calibrate_APIC_clock: the APIC timer calibration may be wrong.
Brought up 1 CPUs
ACPI: (supports S0 S1 S4 S5<6>Time: acpi_pm clocksource has been installed.
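
For reference, the BogoMIPS figures in both logs follow directly from the printed lpj (loops per jiffy) values. The sketch below redoes that arithmetic. It assumes the kernel prints lpj/(500000/HZ) with two decimals and that both guest kernels use HZ=1000; HZ is not shown in the logs and is only inferred from the numbers. The function name bogomips_x100 is made up for this illustration.

```c
/*
 * Sketch: recompute the BogoMIPS value that the kernel prints next to
 * "lpj=" in the boot log. Returns the value in hundredths so that no
 * floating point is needed (29388 means 293.88 BogoMIPS).
 *
 * Assumption: hz is the guest kernel's CONFIG_HZ and divides 5000 evenly
 * (true for the common values 100, 250 and 1000).
 */
static unsigned long long bogomips_x100(unsigned long long lpj, unsigned int hz)
{
    return lpj / (5000ULL / hz);
}
```

Under these assumptions, bogomips_x100(146944, 1000) gives 29388 (293.88) and bogomips_x100(1600025, 1000) gives 320005 (3200.05), so the value calibrated by the 2.6.32 guest is lower by a factor of roughly 11 than the value 2.6.18 derives from the timer frequency.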

Thanks for looking into this! -ralph

Attachments

logs.zip (20.8 KB) - added by rschmied 3 years ago.
VBox.log, configuration of VM (2.6.32), log files of virtual hosts and physical hosts
info.tar (110.0 KB) - added by rschmied 20 months ago.
w/ current VBox version (4.1.20)
info-c2d.tar (30.0 KB) - added by rschmied 20 months ago.
w/ current VBox version (4.1.20) on different hardware (Mac Mini w/ Intel C2D)

Change History

Changed 3 years ago by rschmied

VBox.log, configuration of VM (2.6.32), log files of virtual hosts and physical hosts

comment:1 Changed 3 years ago by rschmied

Running Linux on VMware (Fusion, on a different machine in this case) produces the following log messages in dmesg during CPU initialization at kernel start:

TSC freq read from hypervisor : 2653.482 MHz
Calibrating delay loop (skipped) preset value.. 5306.96 BogoMIPS (lpj=10613928)

The kernel evidently has some mechanism to detect that it's running on virtualized h/w and pulls additional info through a hypervisor API.

Here's a link to the relevant post on LKML. It's quite dated, but VirtualBox could use the same or a similar mechanism. As the original poster mentioned: "This patch also adds a hypervisor_get_tsc_freq function, instead of calibrating the frequency which can be error prone in virtualized environment, we ask the hypervisor for it. We get the frequency from the hypervisor by accessing the backdoor port if we are running on VMware. Other hypervisors too can add code to get frequency on their platform to this routine."
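
To illustrate why asking the hypervisor is attractive: once the TSC frequency is known, the delay loop value can simply be computed instead of measured. The sketch below is not the kernel's actual code, just the arithmetic as I understand it. It assumes lpj = TSC frequency in Hz divided by HZ, and HZ=250 for the VMware guest above; HZ is inferred from the log, not stated in it. The function name lpj_from_tsc_khz is hypothetical.

```c
/*
 * Sketch: derive loops-per-jiffy from a hypervisor-reported TSC frequency,
 * which is what allows "Calibrating delay loop (skipped)" in the log.
 *
 * Assumptions: the delay loop is TSC based, so one "loop" equals one TSC
 * tick, and hz is the guest kernel's CONFIG_HZ.
 */
static unsigned long long lpj_from_tsc_khz(unsigned long long tsc_khz,
                                           unsigned int hz)
{
    return tsc_khz * 1000ULL / hz;
}
```

With the 2653.482 MHz reported in the VMware log, lpj_from_tsc_khz(2653482, 250) gives 10613928, which matches the lpj printed there.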

Last edited 20 months ago by rschmied

comment:2 Changed 2 years ago by rschmied

Upgraded everything to latest and greatest

  • Host OS Linux asrock 2.6.32-36-server #79-Ubuntu SMP Tue Nov 8 22:44:38 UTC 2011 x86_64 GNU/Linux
  • VirtualBox 4.1.6r74727
  • Guest OS Linux centos6 2.6.32-131.21.1.el6.i686 #1 SMP Tue Nov 22 18:21:07 GMT 2011 i686 i686 i386 GNU/Linux

and it's still the same phenomenon. It can also be observed on a Mac Mini w/ a Core2Duo, which has a lot more horsepower than the Atom 330 (so it's less obvious there) but it can still be seen. Is this going to be addressed eventually?

Changed 20 months ago by rschmied

w/ current VBox version (4.1.20)

Changed 20 months ago by rschmied

w/ current VBox version (4.1.20) on different hardware (Mac Mini w/ Intel C2D)

comment:3 Changed 20 months ago by rschmied

Upgraded host, guest install ISO and VBox to the latest available versions. With a vanilla CentOS 6.3 install ISO and the config in info.tar, the following boot error can be seen:

------------[ cut here ]------------
WARNING: at arch/x86/kernel/apic/apic.c:1304 setup_local_APIC+0x236/0x355() (Not tainted)
Hardware name: VirtualBox
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.32-279.el6.i686 #1
Call Trace:
 [<c0455c11>] ? warn_slowpath_common+0x81/0xc0
 [<c0838bae>] ? setup_local_APIC+0x236/0x355
 [<c0838bae>] ? setup_local_APIC+0x236/0x355
 [<c0455c6b>] ? warn_slowpath_null+0x1b/0x20
 [<c0838bae>] ? setup_local_APIC+0x236/0x355
 [<c0ab9c40>] ? APIC_init_uniprocessor+0x9b/0x103
 [<c0ab75d9>] ? disable_smp+0x63/0x83
 [<c0ab776e>] ? native_smp_prepare_cpus+0x175/0x346
 [<c0aa72b9>] ? kernel_init+0x69/0x23a
 [<c0aa7250>] ? kernel_init+0x0/0x23a
 [<c0409fff>] ? kernel_thread_helper+0x7/0x10
---[ end trace a7919e7f17c0a725 ]---
APIC calibration not consistent with PM-Timer: 256ms instead of 100ms
APIC delta adjusted to PM-Timer: 6251117 (16060050)


This warning is not seen with the same configuration on a different host (VBox 4.1.20 on a Mac Mini with an Intel Core2 Duo CPU). But the timer calibration seems to be way off on that platform as well, even though that box has HardwareVirtEx (and the Atom box does not).

I would really like to see at least some confirmation here, since incorrect timer calibration potentially affects a lot more users...

Thanks, -ralph

Last edited 20 months ago by rschmied

comment:4 Changed 20 months ago by rschmied

Please also note my edit to comment #1 above, which adds a link to existing Linux code for timer calibration on the VMware hypervisor.

comment:5 Changed 20 months ago by frank

  • Description modified

You are right, this patch adds some hypervisor-specific code to the Linux kernel to make the detection of the TSC frequency more reliable, and this is actually the only way to fix this problem. On the other hand, the code you mention is specific to VMware. Emulating this functionality would mean making VirtualBox appear a bit like VMware, i.e. implementing at least parts of the VMware backdoor. This can have negative consequences for other software, which could then assume that it runs directly on VMware and try to use backdoor features that are not implemented.

I think we have to go this way sooner or later, but this needs a lot of testing.

comment:6 Changed 20 months ago by frank

Looking at the current code more closely, I think we cannot go this way, as VirtualBox would have to identify itself as VMware; see the function vmware_platform().

comment:7 Changed 20 months ago by rschmied

Well, agreed that VirtualBox should not identify itself as VMware (even though that would help with older kernels, which already have this feature and would automatically do 'the right thing'). Also note that the comment in the code explicitly mentions 'implementing compatibility modes for other hypervisors'. But really, there should be a 'virtualbox.c' with VirtualBox-specific detection and TSC frequency fetching code. Looking at hypervisor.c, VirtualBox is missing from the party as of today...

/*
 * Hypervisor detect order.  This is specified explicitly here because
 * some hypervisors might implement compatibility modes for other
 * hypervisors and therefore need to be detected in specific sequence.
 */
static const __initconst struct hypervisor_x86 * const hypervisors[] =
{
#ifdef CONFIG_XEN_PVHVM
        &x86_hyper_xen_hvm,
#endif
        &x86_hyper_vmware,
        &x86_hyper_ms_hyperv,
};

const struct hypervisor_x86 *x86_hyper;
EXPORT_SYMBOL(x86_hyper);

static inline void __init
detect_hypervisor_vendor(void)
{
        const struct hypervisor_x86 *h, * const *p;

        for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) {
                h = *p;
                if (h->detect()) {
                        x86_hyper = h;
                        printk(KERN_INFO "Hypervisor detected: %s\n", h->name);
                        break;
                }
        }
}

void __cpuinit init_hypervisor(struct cpuinfo_x86 *c)
{
        if (x86_hyper && x86_hyper->set_cpu_features)
                x86_hyper->set_cpu_features(c);
}

void __init init_hypervisor_platform(void)
{

        detect_hypervisor_vendor();

        if (!x86_hyper)
                return;

        init_hypervisor(&boot_cpu_data);

        if (x86_hyper->init_platform)
                x86_hyper->init_platform();
}
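
To make the idea concrete, here is a small user-space model of the detect-order table above with a VirtualBox entry added. It is only a sketch and not kernel code. It assumes detection by the 12-byte vendor signature that hypervisors commonly return in EBX/ECX/EDX of CPUID leaf 0x40000000, and that the VirtualBox signature is "VBoxVBoxVBox"; whether VirtualBox 4.1 exposes that leaf in every configuration is not established in this ticket. The names hv_entry, hv_table, hv_signature and hv_detect are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One row of the (modelled) detect-order table. */
struct hv_entry {
    const char *name;       /* human-readable name, illustrative only */
    const char *signature;  /* 12-byte CPUID 0x40000000 vendor string */
};

/* Order matters, as in the kernel table: first match wins. */
static const struct hv_entry hv_table[] = {
    { "VMware",            "VMwareVMware" },
    { "Microsoft Hyper-V", "Microsoft Hv" },
    { "VirtualBox",        "VBoxVBoxVBox" },  /* hypothetical new entry */
};

/*
 * Pack the three CPUID result registers into a NUL-terminated string.
 * Assumes a little-endian host, which holds on x86.
 */
static void hv_signature(unsigned int ebx, unsigned int ecx,
                         unsigned int edx, char out[13])
{
    memcpy(out,     &ebx, 4);
    memcpy(out + 4, &ecx, 4);
    memcpy(out + 8, &edx, 4);
    out[12] = '\0';
}

/* Return the name of the first matching table entry, or NULL if none. */
static const char *hv_detect(const char *sig)
{
    size_t i;

    for (i = 0; i < sizeof(hv_table) / sizeof(hv_table[0]); i++) {
        if (strncmp(sig, hv_table[i].signature, 12) == 0)
            return hv_table[i].name;
    }
    return NULL;
}
```

A real 'virtualbox.c' would additionally need a VirtualBox-provided way to report the TSC frequency to the guest, which this model leaves out entirely.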

comment:8 Changed 6 months ago by Gabe

I still see this same issue and can reproduce it reliably on many of our dev laptops. @rschmied, did you ever figure out a workaround? I would love to be able to work around this without needing to swap out our .vbox base images.
