Ticket #10766 (closed defect: fixed)
Guest real time clock drift by 5% in "Mac on Mac" setups using VirtualBox's EFI
Reported by: james.watkins
Guest type: OSX Server
Host type: Mac OS X
I have a Snow Leopard Server guest running on a Lion host.
I measure a constant real-time drift of approximately 5% (that is, time inside the virtual machine flows ~5% slower than time on the host). The drift is not affected by system load on either the guest or the host, and my host CPU has a constant TSC rate. Similarly, the CPU speed reported in "About This Mac" is 5% higher in the guest (in my setup, 2.52 GHz instead of 2.4 GHz).
From searching on Google, it appears that other people have also reported this ~5% drift. However, no one using an alternative EFI seems to have mentioned the issue, which suggests that the defect is related to VirtualBox's EFI.
Setting the VirtualBox property "VBoxInternal/TM/WarpDrivePercentage" to "104" or "105" helps reduce the drift rate, but both values still leave an error of more than ten seconds per hour. ntpd cannot compensate for such a drift either, since it exceeds the maximum slew rate of 500 parts per million, and running "ntpd -q" from cron causes frequent, large time steps that might impair service processes (after all, why would someone use Snow Leopard Server if not to run services?).
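To put the numbers above side by side, here is a minimal sketch that converts between "seconds per hour" and parts per million. The 500 ppm limit and the ten seconds per hour come from this report; the function and variable names are made up for illustration.

```python
# Hypothetical helper names; only the 500 ppm limit and the
# "ten seconds per hour" residual are taken from the report.
NTP_MAX_SLEW_PPM = 500


def ppm_to_seconds_per_hour(ppm: float) -> float:
    """Clock error accumulated in one hour at the given rate in ppm."""
    return ppm * 3600 / 1_000_000


def seconds_per_hour_to_ppm(seconds_per_hour: float) -> float:
    """Rate in ppm that corresponds to the given error per hour."""
    return seconds_per_hour / 3600 * 1_000_000


# Largest drift that slewing at 500 ppm can absorb in an hour.
slew_budget_s = ppm_to_seconds_per_hour(NTP_MAX_SLEW_PPM)

# Residual left after the WarpDrivePercentage workaround (lower bound).
residual_ppm = seconds_per_hour_to_ppm(10)

print(f"slew budget: {slew_budget_s:.1f} s/hour")
print(f"residual after workaround: at least {residual_ppm:.0f} ppm")
```

A 500 ppm slew absorbs only 1.8 seconds per hour, so a residual of ten seconds per hour is several times beyond what slewing alone can correct.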
From my investigation, I gather that the XNU kernel (the kernel of Mac OS) evaluates real time by dividing the TSC register value by a computed TSC frequency. That frequency is derived from the FSB frequency, which is obtained from an EFI property node. VirtualBox estimates the FSB frequency by taking the CPU speed (in MHz), converting it to Hz (by multiplying it twice by 1024) and then dividing by 4 (don't ask about this constant; it seems just fine). I think the problem lies here: hertz multiples are not powers of two, so going from megahertz to hertz should be done by multiplying twice by 1000, not 1024. The resulting error factor is (1024*1024) / (1000*1000) = 104.8576%, which would match the reported ~5% time drift as well as the incorrectly reported CPU speed.
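The arithmetic can be checked with a short sketch. It is not VirtualBox's code: the function names are hypothetical, and the formula (CPU speed in MHz, scaled to Hz, divided by 4) is only what this report describes.

```python
# Hypothetical reconstruction of the FSB estimate described in this
# report; the function names are not VirtualBox identifiers.
def fsb_hz_binary(cpu_mhz: int) -> int:
    """Estimate as reported: MHz scaled by 1024 * 1024, then divided by 4."""
    return cpu_mhz * 1024 * 1024 // 4


def fsb_hz_decimal(cpu_mhz: int) -> int:
    """Proposed estimate: MHz scaled by 1000 * 1000, then divided by 4."""
    return cpu_mhz * 1000 * 1000 // 4


cpu_mhz = 2400  # the 2.4 GHz host from this report

# Factor by which the guest overestimates the FSB (and TSC) frequency.
ratio = fsb_hz_binary(cpu_mhz) / fsb_hz_decimal(cpu_mhz)

# CPU speed the guest would display, and how fast guest time advances
# relative to the host when ticks are divided by the inflated frequency.
reported_ghz = cpu_mhz / 1000 * ratio
guest_time_rate = 1 / ratio

print(f"overestimate factor: {ratio:.6f}")
print(f"CPU speed seen by guest: {reported_ghz:.2f} GHz")
print(f"guest seconds per host second: {guest_time_rate:.4f}")
```

With these inputs the factor is 1.048576 and the displayed speed rounds to 2.52 GHz, consistent with what "About This Mac" shows in the guest.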
I have attached a patch file changing this formula. Unfortunately, I have been unable to confirm that the patch actually fixes the time drift, due to numerous build issues and incompatibilities. I am fairly confident that it should at least considerably reduce the drift. However, for some unexplained reason, the TSC frequency on my host (which can be read in a terminal with the command 'sysctl machdep.tsc.frequency') is 2394138834 instead of the 2400000000 that I would expect from the reported FSB frequency, so I'm afraid a drift of 0.2448% could remain, which would still exceed NTP's maximum slew rate.
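As a rough check of that remaining error, the sketch below compares the two frequencies quoted above. It assumes the guest would use the nominal 2400000000 Hz while the hardware actually ticks at the measured rate; the variable names are illustrative only.

```python
# Both frequencies are quoted in this report; the variable names are
# illustrative and not taken from any VirtualBox or XNU source.
measured_tsc_hz = 2_394_138_834  # from 'sysctl machdep.tsc.frequency'
nominal_tsc_hz = 2_400_000_000   # expected from the reported FSB frequency

# Relative error if the guest assumes the nominal frequency.
residual = nominal_tsc_hz / measured_tsc_hz - 1
residual_ppm = residual * 1_000_000
residual_s_per_hour = residual * 3600

print(f"residual drift: {residual * 100:.4f} %")
print(f"              = {residual_ppm:.0f} ppm")
print(f"              = {residual_s_per_hour:.1f} s/hour")
```

Under that assumption the residual comes to roughly 2448 ppm, or nearly nine seconds per hour, which is still well above a 500 ppm slew limit.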