VirtualBox

Ticket #7915 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

Microsecond timer returns value from the past

Reported by: sethbollinger Owned by:
Priority: major Component: other
Version: VirtualBox 4.0.0 Keywords:
Cc: Guest type: other
Host type: other

Description

We're running an Ubuntu 10.04 (2.6.32-27-generic) guest on an Ubuntu 10.10 (2.6.35-22-server) server using VirtualBox 4.0 (or 3.2.8). If I enable 2 virtual CPUs on the guest, every once in a while the timer will tick backwards by ~4-60 microseconds. If I allocate 1 virtual CPU instead of 2, the problem happens much less often. I have tried a number of things involving timer migration, disabling tickless support, and specifying the timer to use on the kernel command line. The only thing that works is using jiffies for the counter, which limits you to HZ resolution (which would be a good reason for a microsecond-resolution problem to go away :)).

I've looked through the timer get code in VMM/VMMAll/TMAllCpu.cpp, but it looks like returning a TSC value in the past is protected against so I don't understand how it's happening. My only guess is that somehow the timer value is set to a time in the past so the last seen protection on the get isn't working, but I don't have any proof.

I will try to gather some data using system tap next week. If there's any other data I can gather, please let me know. I see that it looks like some stats are logged when these time conditions occur, but I don't know how to capture them. I would be happy to capture them if someone would point me to a webpage that explains how.

Here's Python and C code to recreate the issue on the guest:

#######################################
import time

print("Start time => %s" % time.ctime())
while True:
    t = time.time()
    c = time.time()

    if c < t:
        print("ERROR:\nCURR => %f\nLAST => %f" % (c, t))
        break

print("End time => %s" % time.ctime())
#######################################
#include <stdio.h>
#include <sys/time.h>

int main()
{
  struct timeval last, curr;

  while(1)
  {
    gettimeofday(&last, 0);
    gettimeofday(&curr, 0);

    if ((last.tv_sec == curr.tv_sec) && (curr.tv_usec < last.tv_usec))
    {
      printf("ERROR:\nCURR => %ld\nLAST => %ld\n", curr.tv_usec, last.tv_usec);
      break;
    }
  }
}
#######################################

Attachments

VBox.log Download (41.8 KB) - added by sethbollinger 3 years ago.
VBox log

Change History

Changed 3 years ago by sethbollinger

VBox log

comment:1 in reply to: ↑ description ; follow-up: ↓ 2 Changed 3 years ago by mTi3Ad9vN

Replying to sethbollinger: It appears that gettimeofday is not guaranteed to return a nondecreasing result. For one thing, NTP resynchronization can make it go backward. See "gettimeofday() should never be used to measure time" at  http://blog.habets.pp.se/2010/09/gettimeofday-should-never-be-used-to-measure-time for more details and a workaround.

comment:2 in reply to: ↑ 1 Changed 3 years ago by sethbollinger

Replying to mTi3Ad9vN:

Replying to sethbollinger: It appears that gettimeofday is not guaranteed to return a nondecreasing result. For one thing, NTP resynchronization can make it go backward. See "gettimeofday() should never be used to measure time" at  http://blog.habets.pp.se/2010/09/gettimeofday-should-never-be-used-to-measure-time for more details and a workaround.

Thanks for the reply!

First thing: I have disabled NTP and the Guest Additions, so nothing should be trying to adjust the time in the guest.

I also assume that the MONOTONIC clock shouldn't tick backwards (that's the clock used by the function listed on the web site you linked in your reply).

I coded a quick test and it fails in the same way. Here's the code.

#include <stdio.h>
#include <time.h>

int main()
{
  struct timespec last;
  struct timespec cur;

  while(1)
  {
    clock_gettime(CLOCK_MONOTONIC, &last);
    clock_gettime(CLOCK_MONOTONIC, &cur);

    if ((last.tv_sec == cur.tv_sec) && (cur.tv_nsec < last.tv_nsec))
    {
      printf("ERROR:\nCURR => %ld\nLAST => %ld\nDIFF => %ld\n", cur.tv_nsec, last.tv_nsec, (last.tv_nsec - cur.tv_nsec));
      break;
    }
  }
}

I haven't had the time to trace this down through the kernel, but my guess is it gets to pretty much the same place.

Here's what the failure looks like:

satest@guido2:~$ ./a.out 
ERROR:
CURR => 81080241
LAST => 87553997
DIFF => 6473756
satest@guido2:~$ ./a.out 
ERROR:
CURR => 75009855
LAST => 81499882
DIFF => 6490027
satest@guido2:~$ ./a.out 
ERROR:
CURR => 984995402
LAST => 991477962
DIFF => 6482560
satest@guido2:~$ ./a.out 
ERROR:
CURR => 41654433
LAST => 48140498
DIFF => 6486065
satest@guido2:~$ ./a.out 
ERROR:
CURR => 567593935
LAST => 573974849
DIFF => 6380914
satest@guido2:~$ ./a.out 
ERROR:
CURR => 903128008
LAST => 909618320
DIFF => 6490312

It's interesting to me that the difference always starts with 64, which happens to be the amount added in tmCpuTickGetInternal when a negative timer tick is detected. I pasted the code below.

/**
 * Read the current CPU timstamp counter.
 *
 * @returns Gets the CPU tsc.
 * @param   pVCpu       The VMCPU to operate on.
 */
DECLINLINE(uint64_t) tmCpuTickGetInternal(PVMCPU pVCpu, bool fCheckTimers)
{
    uint64_t u64;

    if (RT_LIKELY(pVCpu->tm.s.fTSCTicking))
    {
        PVM pVM = pVCpu->CTX_SUFF(pVM);
        if (pVM->tm.s.fTSCVirtualized)
        {
            if (pVM->tm.s.fTSCUseRealTSC)
                u64 = ASMReadTSC();
            else
                u64 = tmCpuTickGetRawVirtual(pVM, fCheckTimers);
            u64 -= pVCpu->tm.s.offTSCRawSrc;
        }
        else
            u64 = ASMReadTSC();

        /* Never return a value lower than what the guest has already seen. */
        if (u64 < pVCpu->tm.s.u64TSCLastSeen)
        {
            STAM_COUNTER_INC(&pVM->tm.s.StatTSCUnderflow);
            pVCpu->tm.s.u64TSCLastSeen += 64;   /* @todo choose a good increment here */
            u64 = pVCpu->tm.s.u64TSCLastSeen;
        }
    }
    else
        u64 = pVCpu->tm.s.u64TSC;
    return u64;
}

Seth

comment:3 follow-up: ↓ 4 Changed 3 years ago by mTi3Ad9vN

This bug seems not uncommon on non-virtualized platforms -- see, e.g.,  http://stackoverflow.com/questions/3657289/linux-clock-gettimeclock-monotonic-strange-non-monotonic-behavior (Ubuntu 10.04, no virtualization);  http://lists.debian.org/debian-glibc/2010/08/msg00079.html (Debian, virtualization state not specified) -- so it's not clear whether VBox is the problem.

comment:4 in reply to: ↑ 3 Changed 3 years ago by sethbollinger

Replying to mTi3Ad9vN:

This bug seems not uncommon on non-virtualized platforms -- see, e.g.,  http://stackoverflow.com/questions/3657289/linux-clock-gettimeclock-monotonic-strange-non-monotonic-behavior (Ubuntu 10.04, no virtualization);  http://lists.debian.org/debian-glibc/2010/08/msg00079.html (Debian, virtualization state not specified) -- so it's not clear whether VBox is the problem.

I've traced down through the system call to the point where the kernel reads the hardware timer register, and I didn't see anything strange going on there. Then I traced that into the VirtualBox code where those registers (the ACPI PM timer and the TSC) are virtualized. While I can't say with 100% certainty that it's not a guest kernel bug, from the research I've done it looks like VirtualBox is handing us a value in the past. I suppose this could be caused by a host kernel bug, but it's still manifesting itself through VirtualBox.

Also, when I run the test on the host hardware, it works fine.

Seth

comment:5 follow-ups: ↓ 6 ↓ 7 Changed 3 years ago by mTi3Ad9vN

A few suggestions:

  1. The function tmCpuTickGetInternal you quoted updates u64TSCLastSeen only when current TSC < u64TSCLastSeen. But that's incorrect; that function (or some other function in the same handler chain) needs to update u64TSCLastSeen every time TSC is read. I don't have time to check the rest of the handler chain, but if you do, this might help you find your problem's cause.
  1. This handler chain might run into trouble if the host OS switches its thread between physical cores/CPUs (which can have different TSC register contents) while it's executing. Also, if this handler chain is multithreaded (is it?) it must synchronize access to u64TSCLastSeen, etc., with some kind of mutual exclusion primitives (e.g., spinlock, mutex).
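A sketch of what the synchronization in suggestion 2 could look like (illustrative names only, not VirtualBox code): instead of an unsynchronized read-modify-write of the last-seen value, a lock-free compare-and-swap loop that only ever advances it, so concurrent readers can never hand out a value that goes backwards.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Shared "last seen" counter value; starts at zero. */
_Atomic uint64_t g_lastSeenTsc;

/* Return a strictly increasing value derived from a raw TSC reading.
 * If the raw reading is behind what was already handed out, return
 * last-seen + 1 instead; either way, publish the new maximum with CAS. */
uint64_t monotonic_tsc_read(uint64_t rawTsc)
{
    uint64_t last = atomic_load(&g_lastSeenTsc);
    for (;;) {
        uint64_t next = rawTsc > last ? rawTsc : last + 1;
        if (atomic_compare_exchange_weak(&g_lastSeenTsc, &last, next))
            return next;
        /* CAS failure reloads 'last' with the newer value; retry. */
    }
}
```

This is only a sketch of the general pattern; whether it fits VirtualBox's actual handler chain depends on where u64TSCLastSeen is read and written, per the discussion below.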

comment:6 in reply to: ↑ 5 Changed 3 years ago by sethbollinger

I finally had a chance to do some building and rebuilding, and here are my results. I was able to recreate the issue on my FreeBSD 8.1 box at home running VirtualBox 3.2.12, so I think we can put to rest the idea that this is a host-specific bug. It would be strange for the exact same bug to manifest on two entirely different operating systems.

I altered the function you suggested in item 1 above to set last seen each time the guest requests time. This in itself didn't fix the issue.

If I enabled VBoxInternal/TM/UseRealTSC and disabled VBoxInternal/TM/TSCVirtualized, those coupled with my change seem to make it work. However, I don't believe it's fixed, I believe it simply shrank the race window to such a small period that I'm not able to get the bug to manifest anymore.
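For anyone wanting to try the same configuration: these TM settings are CFGM keys, normally set via VBoxManage setextradata before starting the VM ("MyVM" is a placeholder name):

```shell
# Pass the host TSC through instead of the virtualized one
VBoxManage setextradata "MyVM" "VBoxInternal/TM/UseRealTSC" 1
VBoxManage setextradata "MyVM" "VBoxInternal/TM/TSCVirtualized" 0
```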

If I comment out the change you suggested in item 1 above, the problem will occur again. I definitely think it's a race somewhere.

I will attempt the fix on the IBM system at work and I hope that the problem will go away.

I don't plan on hunting down the race, as this is a bigger bite than I wanted to chew. I'll leave that to someone who knows the code better than I do. :)

I believe it should be easy to reproduce for anyone with one of the latest intel core processors. I'm running a core i7 920.

comment:7 in reply to: ↑ 5 Changed 3 years ago by hwti

Replying to mTi3Ad9vN:

A few suggestions:

  1. The function tmCpuTickGetInternal you quoted updates u64TSCLastSeen only when current TSC < u64TSCLastSeen. But that's incorrect; that function (or some other function in the same handler chain) needs to update u64TSCLastSeen every time TSC is read. I don't have time to check the rest of the handler chain, but if you do, this might help you find your problem's cause.

The code handling the TSC is complex, but:

  • tmCpuTickGetInternal is only called when RDTSC is emulated (the VMX_VMCS_CTRL_PROC_EXEC_CONTROLS_RDTSC_EXIT bit is set)
  • u64TSCLastSeen is only updated by TMCpuTickSetLastSeen (at the end of guest execution, when RDTSC is handled by VT-x) or TMCpuTickSet (on a write to the TSC MSR)

So the u64TSCLastSeen check only covers the VT-x -> emulated transition case (when the real TSC runs faster than tmCpuTickGetRawVirtual).
It shouldn't be a problem unless tmCpuTickGetInternal doesn't always increase.

  1. This handler chain might run into trouble if the host OS switches its thread between physical cores/CPUs (which can have different TSC register contents) while it's executing. Also, if this handler chain is multithreaded (is it?) it must synchronize access to u64TSCLastSeen, etc., with some kind of mutual exclusion primitives (e.g., spinlock, mutex).

Given the uses of ASMReadTSC() in the VT-x handling code, I assume the host can't switch CPUs in the middle of a single VT-x execution; otherwise it would be completely wrong on CPUs whose TSCs differ between cores. Since my Linux host doesn't report TSC differences between cores, I think the TSC is the same on all of them.

I modified the test code to use CLOCK_MONOTONIC_RAW and added sched_setaffinity to pin it to a single core. It runs correctly on the Linux host, but still fails in the guest.

It's really strange that the guest Linux reports a delta between the TSCs of different cores when that's not the case on the host.
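A hedged reconstruction of that modified test (not hwti's exact code; names are illustrative): pin the process to CPU 0 so per-core TSC differences can't matter, then count how often two consecutive CLOCK_MONOTONIC_RAW readings go backwards over a fixed number of iterations.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <time.h>

/* Returns the number of backwards readings observed in 'iters' pairs of
 * consecutive CLOCK_MONOTONIC_RAW samples, with the process pinned to
 * CPU 0 via sched_setaffinity (pid 0 = calling process). */
long count_backwards(long iters)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    sched_setaffinity(0, sizeof(set), &set);

    struct timespec last, cur;
    long backwards = 0;
    for (long i = 0; i < iters; i++) {
        clock_gettime(CLOCK_MONOTONIC_RAW, &last);
        clock_gettime(CLOCK_MONOTONIC_RAW, &cur);
        if (cur.tv_sec < last.tv_sec ||
            (cur.tv_sec == last.tv_sec && cur.tv_nsec < last.tv_nsec))
            backwards++;
    }
    return backwards;
}
```

On a well-behaved host this should return 0; per hwti's report, the same test still produces non-zero counts inside the affected guest.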

comment:8 Changed 3 years ago by hwti

Fixed in 4.0.10 for me

comment:9 Changed 3 years ago by frank

  • Status changed from new to closed
  • Resolution set to fixed

Thanks. Please reopen if still relevant.

