VirtualBox

Ticket #10766 (closed defect: fixed)

Opened 21 months ago

Last modified 3 months ago

Guest real-time clock drifts by 5% in "Mac on Mac" setups using VirtualBox's EFI

Reported by: james.watkins
Owned by:
Priority: major
Component: EFI
Version: VirtualBox 4.1.18
Keywords:
Cc:
Guest type: OSX Server
Host type: Mac OS X

Description

I have a Snow Leopard Server guest running on a Lion host.

I measure a constant real-time drift of approximately 5% (that is, time inside the virtual machine flows ~5% slower than time on the host). This drift is not affected by the system workload (on either the guest or the host), and my host CPU uses a constant TSC rate. Similarly, the CPU speed reported in "About this Mac" is 5% higher in the guest (that is, in my setup, 2.52 GHz instead of 2.4 GHz).

From searching on Google, it appears that other people have also reported this ~5% drift. It seems however that no one using an alternative EFI has mentioned this issue, suggesting that this defect is most certainly related to VirtualBox's EFI.

Setting the VirtualBox property "VBoxInternal/TM/WarpDrivePercentage" to "104" or "105" helps reduce the drift rate, but both values still leave an error of more than ten seconds per hour. Also, ntpd can't compensate for such drift, since it exceeds the maximum slew rate of 500 parts per million, and running "ntpd -q" from cron causes frequent, large time steps that might impair service processes (after all, why would someone use Snow Leopard Server if it wasn't to run services?).

From my investigation, I gather that the XNU kernel (the kernel of Mac OS) evaluates real time by dividing the TSC register by a determined TSC frequency. That frequency is computed from the FSB frequency, which is obtained from an EFI property node. VirtualBox estimates the FSB frequency by taking the CPU speed (in MHz), converting it to Hz (by multiplying it twice by 1024) and then dividing by 4 (now, don't ask about this constant, it seems just fine). But I think that herein lies the problem: hertz are not counted in powers of two, so going from megahertz to hertz should be done by multiplying twice by 1000, not 1024. So we'd get (1024*1024) / (1000*1000) = 104.8576%, which would match the reported ~5% time drift, as well as the incorrectly reported CPU speed.
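The arithmetic above can be checked with a short sketch. The function names are hypothetical and only mirror the formula as described in this ticket; this is not the actual VirtualBox code. The 2400 MHz input is the host CPU speed mentioned in the description.

```python
# Sketch of the FSB frequency estimate described in this ticket.
# Function names are hypothetical; this is not the VirtualBox source.


def fsb_hz_binary(cpu_mhz: int) -> int:
    """MHz -> Hz by multiplying twice by 1024, then dividing by 4."""
    return cpu_mhz * 1024 * 1024 // 4


def fsb_hz_decimal(cpu_mhz: int) -> int:
    """MHz -> Hz by multiplying twice by 1000, then dividing by 4."""
    return cpu_mhz * 1000 * 1000 // 4


cpu_mhz = 2400  # host CPU speed from the description (2.4 GHz)

# How much the binary conversion overestimates the frequency.
ratio = fsb_hz_binary(cpu_mhz) / fsb_hz_decimal(cpu_mhz)

# CPU speed the guest would report if it trusted the overestimate.
reported_ghz = cpu_mhz / 1000 * ratio

print(f"overestimate factor: {ratio:.6f}")
print(f"reported CPU speed:  {reported_ghz:.2f} GHz")
```

For a 2400 MHz CPU the overestimate factor is 1.048576, and 2.4 GHz times that factor is about 2.52 GHz, which matches the figure shown in "About this Mac".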

I have attached a patch file changing this formula. I have unfortunately been unable to confirm that this patch actually fixes the time drift issue, due to numerous build issues and incompatibilities. I'm pretty confident that this patch should at least considerably reduce the drift. However, for some unexplained reason, the TSC frequency on my host (which can be read in a terminal with the command 'sysctl machdep.tsc.frequency') is 2394138834, instead of the 2400000000 that I would expect from the reported FSB frequency, so I'm afraid there could be a remaining 0.2448% drift, which would still exceed NTP's maximum slew rate.
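The remaining drift can be estimated from the two frequencies quoted above. This is only a sketch: the function and constant names are made up for illustration, and the drift is taken relative to the measured frequency, which is an assumption that happens to reproduce the 0.2448% figure.

```python
# Estimate the residual drift left after the patch, using the two
# frequencies quoted in this ticket. Names are illustrative only.

MEASURED_TSC_HZ = 2_394_138_834   # from 'sysctl machdep.tsc.frequency'
EXPECTED_TSC_HZ = 2_400_000_000   # what the reported FSB frequency implies
NTP_MAX_SLEW_PPM = 500            # NTP's maximum slew rate


def residual_drift_ppm(assumed_hz: int, actual_hz: int) -> float:
    """Relative error of the assumed frequency, in parts per million.

    Assumption: the error is expressed relative to the actual frequency.
    """
    return (assumed_hz - actual_hz) / actual_hz * 1e6


drift_ppm = residual_drift_ppm(EXPECTED_TSC_HZ, MEASURED_TSC_HZ)

print(f"residual drift: {drift_ppm:.0f} ppm ({drift_ppm / 1e4:.4f}%)")
print(f"within NTP slew limit: {drift_ppm <= NTP_MAX_SLEW_PPM}")
```

The result is roughly 2448 ppm, about five times the 500 ppm limit, so ntpd could not absorb it by slewing alone.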

Attachments

DevEFI.patch (888 bytes) - added by james.watkins 21 months ago.
Changed TSC Frequency estimation formula

Change History

Changed 21 months ago by james.watkins

Changed TSC Frequency estimation formula

comment:1 Changed 21 months ago by Hachiman

Thank you for the report and the deep analysis. I'll look at this and get back to you later.

comment:2 Changed 21 months ago by Schafroth

I am seeing big time skew as well in my Linux guests running on OS X on a Mac mini. But would this bug and patch only be relevant when running with EFI enabled?

Strangely enough, I have two Mac minis (different models). Both had this issue in the beginning, but one is now nicely synced using NTP (with a -500.00 drift). But I don't recall what I did to get it to sync.

comment:3 Changed 21 months ago by james.watkins

Schafroth,

1) Different operating systems have different strategies to implement their real-time clock, and there have been various "real-time clock" issues reported in VirtualBox over time, affecting one or several operating systems. The FSB frequency EFI property is specific to Mac OS (the code doing this can be seen here: http://www.opensource.apple.com/source/xnu/xnu-1504.15.3/osfmk/i386/tsc.c). As far as I know, the other operating systems that rely on the TSC register use a timing loop (using another clock source) to count how much the TSC increases in a given time span. So yes, this bug would be specific to Mac OS guests.

By the way, the Linux kernel supports several real-time clock strategies, which can be selected using the boot argument "clocksource"; see http://redsymbol.net/linux_boot_parameters/ for the most common values for this argument. Setting this argument to a different value will most likely fix the drift you describe on your Linux guest. Also, I think that installing the VirtualBox Guest Additions in a Linux guest installs a component that keeps the guest's clock synchronized.
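Before changing the boot argument, it can help to check which clock source a Linux guest is using and which alternatives the kernel offers. The sketch below reads the kernel's sysfs files; the function name is hypothetical, and it only works inside a Linux guest.

```python
# Read the current and available clock sources on a Linux system.
# The function name is made up for this example; the sysfs files are
# provided by the Linux kernel.
from pathlib import Path

CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base: Path = CLOCKSOURCE_DIR) -> tuple[str, list[str]]:
    """Return (current clock source, list of available clock sources)."""
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


if __name__ == "__main__":
    if CLOCKSOURCE_DIR.is_dir():
        current, available = read_clocksources()
        print(f"current:   {current}")
        print(f"available: {', '.join(available)}")
    else:
        print("No clocksource sysfs directory found (not a Linux system?)")
```

If the current source is "tsc" and another one such as "acpi_pm" is listed as available, booting the guest with "clocksource=acpi_pm" is one way to try a different strategy.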

2) -500 and +500 are the minimum and maximum drift values supported by NTP; when NTP reports these drift values, it almost certainly means that NTP has detected a much larger drift, but has constrained its adjustment to +/-500 parts per million (that is, a maximum adjustment of 43 seconds per day)...

Now, I'm not sure I understand what you are saying about those two Mac minis... Are you saying that your _physical_ Mac minis have that much drift? It would be alarming for physical machines to be so far off (before ntpd adjustments). If you are instead saying that virtual machines on these Mac minis have significant time drift, are they Mac guests or Linux guests?
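The 43 seconds per day figure follows directly from the 500 ppm limit. A small sketch of the conversion (the helper name is hypothetical), with the ~5% drift from this ticket shown for comparison:

```python
# Convert a drift or slew rate in parts per million into seconds of
# clock error over a given interval. The helper name is illustrative.

SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_interval(ppm: float, interval_seconds: float) -> float:
    """Clock error accumulated over interval_seconds at the given ppm rate."""
    return ppm * interval_seconds / 1e6


ntp_limit = seconds_per_interval(500, SECONDS_PER_DAY)       # NTP's +/-500 ppm
five_percent = seconds_per_interval(50_000, SECONDS_PER_DAY)  # 5% = 50,000 ppm

print(f"500 ppm  -> {ntp_limit:.1f} s/day")
print(f"5% drift -> {five_percent:.0f} s/day")
```

At 500 ppm the correction is capped at 43.2 seconds per day, while a 5% drift accumulates 4320 seconds (72 minutes) per day, about a hundred times more than NTP can slew away.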

comment:4 Changed 17 months ago by Schafroth

James,

Thanks for the in-depth explanation.

Sorry about not following up. For some reason the NTP clock became (somewhat) stable again. Until today, when it started to drift more than NTP can handle.

I have two Mac minis, both running 10.8.2 and VB 4.2.4. One is an Intel Core 2 Duo (2010 model) and the other is a Sandy Bridge (2011). Both machines run an NTP client on the host and are synced with a euro.appl. ntp server.

The clock of the Debian guest on the Sandy Bridge machine is stable. It has a drift of around -10.

The drift of the Debian guest on the C2D is 500, and it seems to balance on the edge of staying in sync or falling out of it.

Both Debian guests were using tsc as the clock source. I have now switched the C2D guest to the acpi_pm option, and that seems to have stabilized the clock.

Thanks for the pointers.


comment:5 Changed 3 months ago by frank

  • Status changed from new to closed
  • Resolution set to fixed

Fixed in VBox 4.3.
