VirtualBox

Opened 7 years ago

Closed 3 years ago

#16562 closed defect (obsolete)

mouse locks up in (some) VMs after some uptime when using vbox after 5.1.10

Reported by: Harry M
Owned by:
Component: other
Version: VirtualBox 5.1.16
Keywords:
Cc:
Guest type: Linux
Host type: Linux

Description

With VirtualBox 5.1.12 or later on a CentOS 6.8 host, certain VMs stop responding to mouse clicks (both left and right) after some amount of uptime (usually a day or so). This was also observed on 5.0.30 and later. Mouse integration (MI) is enabled, and ALL VMs are configured to use the tablet device.

The guest most likely to have the problem is an ArchLinux variant called obarun (obarun.org for more info).

Guest Additions (GAs) are up to date on all guests. Guests are generally kept up to date on system software as well.

The workaround has been to downgrade to 5.1.10 (or 5.0.28), where the problem does not occur.

Attachments (4)

Devuan-2017-03-29-10-40-28.log (136.7 KB) - added by Harry M 7 years ago.
Devuan-2017-04-02-19-56-38.log (85.6 KB) - added by Harry M 7 years ago.
PCLinuxOS-2017-04-08-02-50-40.log (271.8 KB) - added by Harry M 7 years ago.
Ubuntu-2017-04-12-05-07-46.log.bz2 (116.8 KB) - added by Harry M 7 years ago.

Change History (55)

comment:1 by Harry M, 7 years ago

Note that mouse tracking works, and the keyboard continues to work. In fact, alt-tab to a command shell allows normal interaction, and I am able to kill the offending VM (obarun is the one having the problem 99% of the time, but I've seen it when using Ubuntu also).

comment:2 by Michael Thayer, 7 years ago

To confirm: when the mouse pointer is locked up, you are unable to click on any host application until the VM process is stopped, but the pointer changes shape as normal when you move it?

Could you please attach a log file from a VM where this occurs? And did you try disabling keyboard capturing as discussed on IRC? And are you able to reproduce this without Additions installed?

comment:3 by Harry M, 7 years ago

Just got notification of the update and upgraded to 5.1.16.

Unchecking the keyboard capture option seems to be global, across all VMs and the GUI. I could not figure out how to disable it only on the one problem VM.

Keep in mind also that I generally run my VMs in full screen mode, and that's the situation when the bug occurs. Even if I switch to windowed mode, the same behavior ensues. Now that I have disabled keyboard capture, I will try switching to windowed mode when the bug arises.

One other thing: as for the mouse pointer lock-up, I cannot claim that I've seen the cursor change while in this state (but I will note it more carefully the next time it occurs). The mouse pointer does continue to track correctly, however. It's just that there is no reaction to mouse clicks, left or right.

comment:4 by Michael Thayer, 7 years ago

That is right, this can only be disabled globally.

comment:5 by Harry M, 7 years ago

Had to reboot the VM due to some other work I'm doing on the host (switching from 32 to 64 bit CentOS, but for reasons not related to this bug). I hope that does not soil this test too much; I can always retreat back to 32 bit as it still lives in another partition.

comment:6 by Harry M, 7 years ago

Uptime is over 2 days, and no sign of the mouse issue. But it could take more time, still. I need to reboot soon, for some configuration reasons, so I may wait another day and then reboot. It usually happened within a day or so.

comment:7 by Harry M, 7 years ago

It's been 4+ days... I'm wondering if I should try re-enabling the keyboard capture to see if that coaxes the bug back to me?

comment:8 by Michael Thayer, 7 years ago

That is actually what I expected (since I wrote the capturing code): that disabling keyboard capturing would make the problem go away. I would also expect it to be possible to reproduce it using manual capturing (i.e. using the host key without enabling auto-capture globally). But of course it might be harder to reproduce it that way.

comment:9 by Harry M, 7 years ago

With keyboard capture still disabled on the system, I just used HOST+F to switch back to windowed mode and was able to switch between the VM window and the host windows just by moving the mouse to gain focus. All I had to do was click somewhere in a window, and I didn't need to use the HOST key at all. I've switched it back to full screen using HOST+F again.

Is this expected behavior when keyboard capture is disabled? And I don't understand why the keyboard capture option has this effect on the mouse behavior I reported.

comment:10 by Michael Thayer, 7 years ago

When we capture the keyboard we also capture mouse clicks. This is because of the way GNOME and various window managers behave: they expect to be able to capture the keyboard whenever the user clicks in a menu or on a window title bar, so we have to be able to release the capture when the user clicks outside of our area before the capture attempt happens.

When automatic capture is disabled, we do not try to capture the keyboard (and by extension mouse clicks) when we get the focus. This is the intention of the feature. Perhaps I misunderstood your question, or was not clear myself previously?

comment:11 by Harry M, 7 years ago

I suspected that might be the reason, but why isn't the feature called something like "mouse and keyboard capture" then? The name is what threw me, but then again, computer software nomenclature rarely seems to make a whole lot of sense to me. But, moving on here...

I rebooted late yesterday, and then 2 more times, thanks to a power outage in our area. Since then, the system has been up and I've not noticed any errant behavior. However, do you want me to experiment with re-enabling what VB is calling "keyboard capture" just to try to confirm that that is, in fact, the source of the issue I was having?

Otherwise, I am guessing we can say you have nailed it (thanks). Does this call for a code enhancement/bug fix, or is this just operator configuration? I never had this problem prior to 5.1.12, but maybe it defaulted the option to disabled back then. I suppose I am asking if this is truly a bug.

comment:12 by Michael Thayer, 7 years ago

The answer to your naming question is that the mouse button capture (actually an X11 grab) is not supposed to be visible to the user. We do not do it at all on Windows and Mac systems, and we only do it on X11 systems to work around other applications which expect to be able to grab the keyboard whenever they are clicked on - by intercepting the click we can release our own keyboard grab before they see it.

From my point of view this is not nailed, as I do not know why you are reporting this but other people are not. It is most likely (though not certainly) something to do with your CentOS set-up. Of course, if you say you would rather disable automatic keyboard capture than find out the reason then we will leave things as they are until someone else encounters this. Otherwise it would be interesting to see if you can better isolate what causes this on your system.

comment:13 by Harry M, 7 years ago

When you say it has something to do with my CentOS setup, then why is it that it has been one VM that has the problem? CentOS is my host OS, and the VM in question is an Arch distro.

When I said nailed, I only meant that you have pinpointed the feature that seems to be causing the problem for the one guest VM. I did not mean to imply that the problem was solved. I asked you if you want me to try re-enabling keyboard capture, and I'd be happy to do so if you feel it would reveal more information.

comment:14 by Michael Thayer, 7 years ago

That sounds like a good plan then. My guess that it has something to do with your host system is due to the fact that this is code which executes on the host and should be the same for all guests. It doesn't seem unreasonable to assume that something in the guest system might also (or instead) be related.

comment:15 by Michael Thayer, 7 years ago

I am actually slightly surprised that 5.0.30 is affected, since that version has completely different code in that area, and should not have any changes in the relevant code compared to 5.0.28 at all.

comment:16 by Michael Thayer, 7 years ago

Could you also please attach a log file of a virtual machine for your host system? I wonder whether there could be any changes to the screen configuration when the problem occurs?

comment:17 by Michael Thayer, 7 years ago

And please give test build revision 114230<1> a try. I tried fixing one place in the code where I thought this might happen. It would not explain the problem with 5.0.30 though. For now I will remove the change again in future test builds, so please make sure you use exactly that revision (and download the test build reasonably soon before it is replaced).

in reply to:  18 comment:19 by Harry M, 7 years ago

Replying to michael:

  1. https://www.virtualbox.org/wiki/Testbuilds

Downloaded all the files needed many hours ago, as soon as I read your last message. I have the host software, GA, and extensions from that location.

I have not installed them yet, however, as I am still testing with the capture option enabled on 5.1.18

comment:20 by Michael Thayer, 7 years ago

Excellent. Once you have reproduced it again with 5.1.18, if my guess is right then you should not be able to reproduce the problem with capture enabled on the test build. However, it is still a long shot that my guess is right.

And please, the log file!

comment:21 by Harry M, 7 years ago

I just noticed that I might not have the right versions of the test build. It appears I have version 114236 of the main package and GA, and version 114149 of extensions, the currently posted version in test. I should have checked more carefully at the time I downloaded them, which was apparently on or about Mar 28 12:37pm PT.

I also note that the current test build seems to have a different version of the extensions for 5.1, but the same version for 5.0. I tend to notice stuff like this when checking things out more carefully.

At any rate, I am still waiting for the defect to reappear while running the latest production version of 5.1, as before, with the capture feature enabled.

comment:22 by Michael Thayer, 7 years ago

No worries, I was slow at removing the change again: all revisions up to 114267 (excluded) are good. The main package is the important one here.
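
The revision window described in this exchange can be written down as a small check. This is only a sketch: the lower bound 114230 is taken from comment:17, the excluded upper bound 114267 from this comment, and the `_<revision>_` pattern in the package file name is an assumption based on how the el6 test-build RPMs are named.

```python
import re

FIRST_WITH_CHANGE = 114230     # revision named in comment:17
FIRST_WITHOUT_CHANGE = 114267  # excluded upper bound from comment:22


def revision_from_filename(name):
    """Extract the build revision from a test-build file name, or None."""
    match = re.search(r"_(\d+)_", name)
    return int(match.group(1)) if match else None


def has_test_change(revision):
    """True if the revision falls inside the window that carries the change."""
    return FIRST_WITH_CHANGE <= revision < FIRST_WITHOUT_CHANGE


if __name__ == "__main__":
    rev = revision_from_filename("VirtualBox-5.1-5.1.19_114236_el6-1.x86_64.rpm")
    print(rev, has_test_change(rev))
```

Only the main package revision matters for this check; the Guest Additions and Extension Pack revisions can differ.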

comment:23 by Harry M, 7 years ago

Finally, it broke. Attached is the log. Note that this time, the fault was in a different VM, so you can ignore my previous comments about the problem being unique to a particular VM; I thought I had seen a problem in a different VM previously but was not 100% certain, and now I am.

I will wait for you to instruct me further, pending your analysis of the log file.

comment:24 by Michael Thayer, 7 years ago

The log is from 5.1.18. Have you tried this out with the test build? The test builds page<1> still has builds with my change, but in case someone updates it, this<2> is the link to the 64-bit Enterprise Linux 6 build.

  1. https://www.virtualbox.org/wiki/Testbuilds
  2. https://www.virtualbox.org/download/testcase/VirtualBox-5.1-5.1.19_114236_el6-1.x86_64.rpm

comment:25 by Harry M, 7 years ago

Now running 5.1.19 (114236) with keyboard capture enabled.

comment:26 by Harry M, 7 years ago

So, it finally happened.

I was working in (yet) another VM when the mouse suddenly locked up. I was able to use alt-tab to select a command shell in the guest. I tried running a ps to find the firefox session I had been running and using a moment before and ps reported nothing about it. Then when I tried alt-tab again, I was getting the host windows, not the guest windows, so I selected a command shell on the host (the mouse was not working on the host, either), and requested to kill the running VM. Once the VM was dead, mouse worked normally again.

Attached is the log of the aborted session. Of course, this was with the keyboard capture enabled this time.

comment:27 by Michael Thayer, 7 years ago

I made another small change and added some logging. Could you please reproduce with this build and attach the log file?

https://www.virtualbox.org/download/testcase/VirtualBox-5.1-5.1.19_114449_el6-1.x86_64.rpm

comment:28 by Harry M, 7 years ago

Running 114449 with keyboard capture enabled. Will update if/when same lockup occurs.

comment:29 by Harry M, 7 years ago

Got a mouse lockup; accidentally hit host-h, but that did the trick (I meant to do host-home to get the menu so I could check things out a bit first, but oh well) and brought the VM down cleanly this time. Log is attached.

comment:30 by Michael Thayer, 7 years ago

Thank you, I will take a look at that on Monday. Since you have been looking at this for a while, do you have a feeling yet about how to reproduce this reliably?

comment:31 by Harry M, 7 years ago

Other than enabling the capture feature and letting the VMs run... no. What I mean is that I do not notice any particular program or event or such that seems to be triggering the problem.

Sorry, I really wish I could be much, much more helpful this way.

comment:32 by Michael Thayer, 7 years ago

I'm afraid this may get drawn out, but here is another one. The logging I added to the last build gave me more of a clue as to what was going wrong; in this one I added logging in even more paths.

https://www.virtualbox.org/download/testcase/VirtualBox-5.1-5.1.19_114492_el6-1.x86_64.rpm

comment:33 by Harry M, 7 years ago

Now running 114492 with capture keyboard enabled. Note that the extensions are from 114149.

Will post a log upon next failure.

comment:34 by Michael Thayer, 7 years ago

That is great. The exact version of the Guest Additions and the Extension Pack is not important for this.

comment:35 by Harry M, 7 years ago

Here is the log for the mouse lockup on an Ubuntu VM after failure. Sorry, I forgot and killed the VM hard this time rather than finding a command line in the VM and shutting down gracefully.

The log was over the upload limit for this site, so it is bzip'd with -9.

comment:36 by Michael Thayer, 7 years ago

Thank you. I was able to see at least one thing which looked different close to the end of the log file. I presume that you terminated the machine very shortly after the problem happened. I will be away for a week and a half, so I will not be able to provide more feedback during that time. Would you be able, though, to provide a couple more logs for comparison? Terminating the machine as you did, as soon as possible after you notice the problem, is actually helpful for looking at the log file.

comment:37 by Michael Thayer, 7 years ago

Oh dear. I have just hit this myself. Except that it was not in a virtual machine, but in Unity on the host system. The mouse buttons were locked until I killed compiz. Looks like this is a bug in X.Org, not in VirtualBox.

in reply to:  36 comment:38 by Harry M, 7 years ago

I can keep running with the kb capture enabled and try to grab more logs -- how many more? So I should continue to bring the VM down hard then?

comment:39 by Michael Thayer, 7 years ago

Might be worth opening a bug on freedesktop.org, or at least asking about it on an X.Org mailing list.

comment:40 by Harry M, 7 years ago

So this is not a VBox bug then?

Is it only compiz, or could other software be affected also, being that it is an x.org bug? Given the variety of desktops and distros I am running in VMs, I'm wondering if it might have additional implications.

Also, if only some VMs are running compiz (the host here is not), and killing the VM releases the keyboard again, then I'm wondering how this can be the sole problem. But I'm not very familiar with x.org's internals.

I've just now checked to see if compiz is running anywhere, and it is not running on my host OR any of the VMs that I typically have running. Is this relevant?

comment:41 by Michael Thayer, 7 years ago

I am certainly drawing a - possibly incorrect - conclusion here, but given how complicated the grabbing code in the X server is, I think it is a realistic one. In theory I would be qualified to debug the X server code, but it would simply require time I should not be spending on that. My assumption is that releasing an X button grab can sometimes fail, so that the client which requested the grab holds it until it is terminated. This could happen to Compiz or to VirtualBox, which both execute button grabs, but the two cases would be completely independent: it could happen to Compiz without VirtualBox running and vice versa. I have seen reports of similar bugs in the X server<1><2>, one fixed, one not.

  1. https://bugs.freedesktop.org/show_bug.cgi?id=26213
  2. https://bugs.freedesktop.org/show_bug.cgi?id=59100
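
The assumption above can be illustrated with a toy model. This is a deliberately simplified sketch, not X server or VirtualBox code: the class `ToyGrabServer`, its methods, and the `lost` flag are all hypothetical names invented for the illustration. It only shows why a release that never takes effect would leave clicks blocked for every other client until the grabbing client's connection goes away.

```python
class ToyGrabServer:
    """Toy model of a single pointer-button grab slot (illustration only)."""

    def __init__(self):
        self.owner = None  # client currently holding the grab, if any

    def grab(self, client):
        """Grant the grab if it is free or already held by this client."""
        if self.owner is None:
            self.owner = client
        return self.owner == client

    def ungrab(self, client, lost=False):
        """Release the grab; lost=True models the hypothesised failed release."""
        if not lost and self.owner == client:
            self.owner = None

    def disconnect(self, client):
        """A client's grabs are dropped when its connection closes."""
        if self.owner == client:
            self.owner = None

    def click_reaches(self, client):
        """Clicks reach a client only if nobody else holds the grab."""
        return self.owner is None or self.owner == client


if __name__ == "__main__":
    server = ToyGrabServer()
    server.grab("vm")
    server.ungrab("vm", lost=True)           # the release never takes effect
    print(server.click_reaches("host app"))  # False: clicks are stuck
    server.disconnect("vm")                  # the VM process is killed
    print(server.click_reaches("host app"))  # True: clicks work again
```

In this model the only recovery path is `disconnect`, which matches the observation earlier in the ticket that killing the VM process frees the mouse.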

comment:42 by Harry M, 7 years ago

Whether this is a bug in X.org or not may or may not be relevant. This problem started only fairly recently; as I said in my initial report, this behavior was never a problem prior to about 5.1.12 or 5.0.30. That is, I never had this under ANY 4.x version or prior.

I realize this could be a recently introduced bug in X.org, so it might seem coincidental. But then why is it that if I roll back to 5.1.10 (or 5.0.28) or earlier, I no longer experience this problem?

I've even tried upgrading to a 64bit version of the host (CentOS 6) and the problem persists. So while I am not stating that your conclusion is wrong, really, it's just that it does not explain all the other evidence (as I've restated here in comment 42).

Or maybe it does, and I'm just not reasoning clearly on this.

comment:43 by Michael Thayer, 7 years ago

Sorry for the slow answer, I was on holiday. My guess is that some change in VirtualBox might have triggered it. I asked one of the X.Org developers for ideas and this is part of the answer. You might in particular try the grab log suggestion.

yeah, sounds like a stuck grab, but those are notoriously hard to debug. You can write a very simple test program that tries to grab the mouse, if that fails that would confirm the grab at least. Here's one from several years ago, you should be able to modify it if needed:

https://people.freedesktop.org/~whot/grabtest.c

The next step would be to trigger the grab debugging in X and look at the logs: setxkbmap -option "grab:debug" and when it's stuck, hit ctrl+alt+F11 and look at the xorg.log, that lists all current grabs, including passive ones. Should be possible on anything reasonably new (3 years or so...)

That dump should get you a step closer to verifying the grab and who has it. it won't tell you *why* the grab is stuck though, but debugging that is fickle, the cause can be virtually anything, either the client not sending an ungrab request or the server ungrabbing the wrong device. And if the client doesn't send it, it could be caused by a broken enter/leave event sequence, etc... not easy to figure out without a reliable reproducer, sorry.
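
As a rough illustration of the "try to grab the mouse" idea in the quoted answer, here is a minimal sketch that calls Xlib's `XGrabPointer` through Python's `ctypes`. It is not the linked grabtest.c program, and the helper name `probe_pointer_grab` is made up for this example. It assumes libX11 is installed and that it is run inside the X session of the affected host; if either is missing it returns `None` instead of failing.

```python
import ctypes
import ctypes.util

# Return codes of XGrabPointer, as defined in <X11/X.h>.
GRAB_STATUS = {
    0: "GrabSuccess",
    1: "AlreadyGrabbed",
    2: "GrabInvalidTime",
    3: "GrabNotViewable",
    4: "GrabFrozen",
}

BUTTON_PRESS_MASK = 1 << 2  # ButtonPressMask
GRAB_MODE_ASYNC = 1         # GrabModeAsync
CURRENT_TIME = 0            # CurrentTime


def probe_pointer_grab():
    """Try to grab the pointer on the root window.

    Returns the status name, or None if libX11 or a display is unavailable.
    """
    path = ctypes.util.find_library("X11")
    if path is None:
        return None
    try:
        x11 = ctypes.CDLL(path)
    except OSError:
        return None

    # Declare signatures so pointers and XIDs are not truncated on 64-bit.
    x11.XOpenDisplay.restype = ctypes.c_void_p
    x11.XOpenDisplay.argtypes = [ctypes.c_char_p]
    x11.XDefaultRootWindow.restype = ctypes.c_ulong
    x11.XDefaultRootWindow.argtypes = [ctypes.c_void_p]
    x11.XGrabPointer.restype = ctypes.c_int
    x11.XGrabPointer.argtypes = [
        ctypes.c_void_p, ctypes.c_ulong, ctypes.c_int, ctypes.c_uint,
        ctypes.c_int, ctypes.c_int, ctypes.c_ulong, ctypes.c_ulong,
        ctypes.c_ulong,
    ]
    x11.XUngrabPointer.argtypes = [ctypes.c_void_p, ctypes.c_ulong]
    x11.XCloseDisplay.argtypes = [ctypes.c_void_p]

    display = x11.XOpenDisplay(None)  # None means "use $DISPLAY"
    if not display:
        return None
    try:
        root = x11.XDefaultRootWindow(display)
        status = x11.XGrabPointer(
            display, root, 0, BUTTON_PRESS_MASK,
            GRAB_MODE_ASYNC, GRAB_MODE_ASYNC, 0, 0, CURRENT_TIME,
        )
        if status == 0:
            # Our grab succeeded, so release it again straight away.
            x11.XUngrabPointer(display, CURRENT_TIME)
        return GRAB_STATUS.get(status, "unknown(%d)" % status)
    finally:
        x11.XCloseDisplay(display)


if __name__ == "__main__":
    print(probe_pointer_grab())
```

Run from a host terminal while the mouse is stuck, a result of `AlreadyGrabbed` would suggest that some other client holds an active pointer grab. It does not say which client holds it or why, so the grab debugging dump described above is still the next step.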

comment:44 by Michael Thayer, 7 years ago

And since I am still very surprised that 5.0.30 is affected, but not 5.0.28, logs with that would be interesting too.

comment:45 by halfer, 7 years ago

Possibly replicated by #16788 (contains another set of logs). Thanks to HarryM for the tracing work on it so far.

Michael, re the debugging, did you mean just looking at /var/log/Xorg.0.log in the guest, after a drag freeze? I've looked at mine now, and it is extremely sparse. However, I have not had this behaviour for a day or two, so I will have to wait for it to reoccur.

Out of interest, when you say you think this could be an X.org issue, do you mean in the guest or the host? I would imagine it cannot be the guest, since "sudo shutdown -r now" does not result in VirtualBox letting go of the captured mouse (see the other ticket).

comment:46 by Michael Thayer, 7 years ago

I did indeed mean on the host. The debugging instructions I meant were the ones quoted in comment:43 - including building and using the test programme if you are comfortable doing that and following the instructions to generate the extra logging in Xorg.0.log. It might be worth opening a bug against the X server on freedesktop.org after getting the information. You can say that it happened using VirtualBox, and that I suspect it might be an X server bug because I observed the same thing using Ubuntu's Unity without VirtualBox even running. Let me know if you do open a bug there, because three-way communication may be helpful for resolving this.

comment:47 by Harry M, 7 years ago

I'm confused by halfer's comment because I find that shutting down the offending VM actually DOES release the captured mouse. In fact, it is often the ONLY way to do so. Your experience must be different than mine for some reason.

I'd like to add a comment that I have been running VirtualBox with the mouse capture disabled. I find that it has no real impact on my work, except for a couple of VMs I run in a window from the host desktop; the others run full screen.

in reply to:  47 comment:48 by halfer, 7 years ago

Thanks both.

Replying to HarryM:

I'm confused by halfer's comment because I find that shutting down the offending VM actually DOES release the captured mouse. In fact, it is often the ONLY way to do so. Your experience must be different than mine for some reason.

Let me clarify. Shutting down with "sudo shutdown now" does also release the captured mouse for me. That allows me to double-click on the VM in the Manager GUI to manually start it again.

However, I thought it useful to mention that the addition of the "-r" flag (i.e. shutdown and immediately restart) results in the VM rebooting but the mouse does not get released. That behaviour is not sporadic. I thought this observation might be useful in the debugging process.

Replying to michael:

if you are comfortable doing that and following the instructions to generate the extra logging in Xorg.0.log. It might be worth opening a bug against the X server on freedesktop.org after getting the information.

OK, thanks. I should be able to do that, yes. I am reluctant to try developer builds of VirtualBox since they do not have SHA256 hashes that I can verify, but a C program should be OK.

comment:49 by Harry M, 7 years ago

To halfer: Sorry, I overlooked the "-r" portion of your report. I now concur: I would expect that not to work, since it seems that only completely stopping the running session releases the mouse.

The same behavior applies to messed-up sound -- if the sound problem is actually caused by a problem on the host, then complete shutdown is the only way I have found to kind of reset the sound in the guest. That is, all resources are finally released when the VM stops completely. VBox resets and reboots don't actually "stop" the VM, per se. This is analogous to hardware (sometimes, e.g., only a hard stop will allow a messed up BIOS to reset itself, if it can be reset at all).

comment:50 by Harry M, 4 years ago

I have not observed any resurgence of this problem. I am now running on a Devuan Ascii host with the Vbox 6.1 host (and guest) software. I am still running a variety of guests, but they are different now, 3 years later. I have several Devuan Ascii (and now, Beowulf) VMs, as well as several Adelie Linux VMs (I understand I run these at my own risk, although I hope that VirtualBox will take up support for musl platforms soon).

Thank you for your incredible patience throughout the work on this bug. I would say it is probably safe to close at this point. You are always welcome to request logs if you have any lingering suspicions about this issue.

comment:51 by Dsen, 3 years ago

Resolution: obsolete
Status: new → closed