[vbox-dev] vgdrvIoCtl_CancelAllWaitEvents inconsistent behavior ?

Michael Thayer michael.thayer at oracle.com
Tue Jul 4 14:57:08 GMT 2017


Hello Hans,

04.07.2017 14:24, Hans de Goede wrote:
> Hi,
> 
> On 04-07-17 11:46, Michael Thayer wrote:
>> Hello Hans,
>>
>> 04.07.2017 08:55, Hans de Goede wrote:
>> [Discussion of inconsistent wait event cancelling code in VBoxGuest.]
>>> So some remarks about this, I was thinking that *maybe* the old
>>> behavior is on purpose, imagine the following:
>>>
>>> 1) cancel, no waiters, pSession->fPendingCancelWaitEvents get set
>>> 2) the thread for which the cancel was intended stops, without ever
>>>     calling into vgdrvIoCtl_WaitEvent again and so does not clear
>>>     the fPendingCancelWaitEvents flag
>>> 3) (much) later a new thread in the same session starts waiting
>>>
>>> old behavior: new thread sees an empty event, goes back to
>>> waiting
>>>
>>> new behavior: new thread sees a VERR_INTERRUPTED -> and stops
>>> ? (note I've not looked at the userspace code).
>>
>> I think you are reading too much intelligent design into the interface.
>> I was the guilty person who created it, long ago, and I don't think I
>> thought it through sufficiently well at the time.  That said, I don't
>> think I wrote the current implementation code.  I looked at the
>> user-space code I mentioned again, and actually realised that it does
>> handle the old (and the new) code.  Testing seems to confirm this.
> 
> Ok, so you're saying that the best thing todo for a rewrite is for
> all waiters to always exit with VBOXGUEST_WAITEVENT_INTERRUPTED ? On
> vgdrvIoCtl_CancelAllWaitEvents  ?
>
> Is the fPendingCancelWaitEvents flag really still necessary if there
> are no waiters ? And won't this cause issues for new waiters? Or will
> new waiters always have a new session ?
[...]

We talked about this a bit over lunch and decided that CANCEL_WAITEVENT
is too clumsy an interface to be used for anything other than session
termination.  The caller should set a flag first, and no further calls
to WAITEVENT should be be made after that.  That makes the problems you
brought up less relevant, and also means that you could in theory make
the changes you said and still have things work.

Of course, in practice there is always the risk of accidental
dependencies on current behaviour in all parts of the kernel-user
interface, but in the end it was your choice to try to duplicate an
interface created by several people independently, often under time
pressure, which has evolved more or less in parallel to the matching
user-space and host parts.  I would still recommend using our code as it
is in Fedora without submitting it upstream, and investing any
additional time you have for the task in helping improve our code.

Regards
Michael
-- 
Michael Thayer | VirtualBox engineer
ORACLE Deutschland B.V. & Co. KG | Werkstr. 24 | D-71384 Weinstadt

ORACLE Deutschland B.V. & Co. KG
Hauptverwaltung: Riesstraße 25, D-80992 München
Registergericht: Amtsgericht München, HRA 95603

Komplementärin: ORACLE Deutschland Verwaltung B.V.
Hertogswetering 163/167, 3543 AS Utrecht, Niederlande Handelsregister
der Handelskammer Midden-Nederland, Nr. 30143697
Geschäftsführer: Alexander van der Ven, Jan Schultheiss, Val Maher



More information about the vbox-dev mailing list