[vbox-dev] vgdrvIoCtl_CancelAllWaitEvents inconsistent behavior ?

Hans de Goede hdegoede at redhat.com
Wed Jul 5 13:55:55 UTC 2017


On 05-07-17 12:14, Michael Thayer wrote:
> Hello Hans,
> 05.07.2017 11:18, Hans de Goede wrote:
> [Discussion of inconsistent wait event cancelling code in VBoxGuest.]
>> I've been looking at the userspace code myself and this bit:
>> src/VBox/Additions/x11/VBoxClient/seamless.cpp:
>> int SeamlessMain::selfTest()
>> {
>>      int rc = VERR_INTERNAL_ERROR;
>>      const char *pcszStage;
>>      LogRelFlowFunc(("\n"));
>>      do {
>>          pcszStage = "Testing event loop cancellation";
>>          VbglR3InterruptEventWaits();
>>          if (RT_FAILURE(VbglR3WaitEvent(VMMDEV_EVENT_VALID_EVENT_MASK, 0,
>> NULL)))
>>              break;
>>          if (   VbglR3WaitEvent(VMMDEV_EVENT_VALID_EVENT_MASK, 0, NULL)
>>              != VERR_TIMEOUT)
>>              break;
>>          rc = VINF_SUCCESS;
>>      } while(0);
>>      if (RT_FAILURE(rc))
>>          LogRel(("VBoxClient (seamless): self test failed.  Stage:
>> \"%s\"\n",
>>                  pcszStage));
>>      return rc;
>> }
>> Relies on the current behavior that a cancel while there are no waiters
>> results in the next wait succeeding (without reporting any events), so
>> I will mimick that behavior when moving this code over to linux
>> waitqueues.
> Actually it seems to me that it relies on the fact that the 0ms time out
> is checked before the interruption.

No, it does a cancel while there are no waiters, then a VbglR3WaitEvent
which it expects to succeed, not timeout. Which matches the kernel code
which does an initial check for pSession->fPendingCancelWaitEvents
before checking the timeout.

The kernel code does:

1) check for pSession->fPendingCancelWaitEvents (and for pending events)
2) check for timeout 0, return VERR_TIMEOUT if timeout == 0
3) Alloc WaitList item + add to list
4) Recheck for pSession->fPendingCancelWaitEvents (and for pending events)
5) Wait

And the selftest relies on 1 and 2 basically.

> At some point we (I) used
> VbglR3InterruptEventWaits() in the seamless code, but I removed it when
> I cleaned up the code as it is not nice.  I realised while searching for
> uses of INTERRUPT_WAITEVENTS that I had forgotten to remove this pretty
> pointless test.  I plan to deprecate calling WAITEVENT in a session
> after INTERRUPT_WAITEVENTS has been called to make the matter moot and
> to change WAITEVENT to enforce this.
> Which brings us back to the question of ABI stability.  If something
> like this were to pop up after you had released your code, I would
> expect to handle it by announcing that it is deprecated and changing the
> ABI later, say in the next major release.  Does that seem reasonable to
> you, or too short?

Since we're talking kernel ABI here now, you would need to align it
to the kernel cycle, not vbox major releases. I would expect a deprecation
like this to take at least a year, so put a note on the deprecation in:
linux/Documentation/ABI/obsolete/ioctl-vboxguest and add a deprecation date,
then once that date is past you can merge a patch upstream removing the
backward compat stuff / enforcing new behavior and that will then get
merged and released in the next kernel cycle, assuming there are no




I've finished converting the event WaitList to a standard Linux
wait_queue, the cleanup is not that big (yet) because I still need
to convert the HGCM WaitList also, after that all the WaitList
code can be removed which should be a nice cleanup.

If you want to check my work see:

Note it is probably easier to use the diff to see where the changes are
and then just look at the new file, as I've basically rewritten a couple
of functions.

More information about the vbox-dev mailing list