VirtualBox

Opened 6 years ago

Closed 4 years ago

#17608 closed defect (invalid)

Resuming a VM misses a heartbeat and gets reset occasionally

Reported by: Jim Klimov Owned by:
Component: other Version: VirtualBox 5.1.30
Keywords: Cc:
Guest type: Solaris Host type: Windows

Description

I have a Windows 7 host running several Linux and Solaris VMs with Guest Additions under VirtualBox 5.1.30. When I roam between work and home, I run a script to "savestate" the virtual machines and put the laptop to sleep; when it gets powered back on the VMs are resumed and get new IP addresses etc.

All is well except that about once in a week I end up with a freshly booted work VM, where I previously had lots of stuff running and ready to continue work (IDEs, dev-testing terminal tabs, etc). So such reset is quite inconvenient and can lose me some recent work (e.g. shell histories in terminals) as well as it costs me some downtime to re-start everything I needed.

I recently inspected the VirtualBox logs to see if the saved-state file was corrupted or something, and found it is not. In fact, the VM started to resume and then was killed by heartbeat tracker with these messages in the middle of the log:

00:00:22.025091 Changing the VM state from 'LOADING' to 'SUSPENDED'
00:00:22.025181 Changing the VM state from 'SUSPENDED' to 'RESUMING'
00:00:22.026442 Changing the VM state from 'RESUMING' to 'RUNNING'
00:00:22.026458 Console: Machine state changed to 'Running'
00:00:22.449411 GUI: UIMachineViewNormal::adjustGuestScreenSize: Adjust guest-screen size if necessary.
00:00:22.449982 GUI: 2D video acceleration is disabled
00:00:22.449997 GUI: HID LEDs sync is enabled
00:00:22.450553 GUI: UIMachineLogicNormal::sltCheckForRequestedVisualStateType: Requested-state=0, Machine-state=5
00:00:22.539429 AHCI#0: Port 0 reset
00:00:22.697286 VUSB: Attached 'HidMouse' to port 1
00:00:26.473715 VMMDev: vmmDevHeartbeatFlatlinedTimer: Guest seems to be unresponsive. Last heartbeat received 4 seconds ago
00:02:16.362691 OHCI: USB Reset
00:02:16.362971 OHCI: Software reset
00:02:16.363298 EHCI: USB Suspended
00:02:16.381140 EHCI: Hardware reset
00:02:16.381341 ACPI: Reset initiated by ACPI
00:02:16.381460 Changing the VM state from 'RUNNING' to 'RESETTING'

I am rather certain the machine did not take so long before resetting - I see it booting about as soon as it is un-paused after the load.

My primary "gut guess" is it may be that the host clock is synced in the new network after waking up, and/or the VM clock is resynced to the host after unpause, causing a jump in readings and a false-positive missed heartbeat. Or since there are several VMs resuming, something is indeed lagging under the burst of load.

Either way, the solution would be to not trust the heartbeat for some time (per monotonous host kernel clock) after VM resume, and/or not until the VM clock resync has happened.

And also some way to rule out misfiring when host clock changes abruptly might help, though I haven't seen issues when the system is just very busy with some compilation (so mouse is lagging and at least the GUI clocks in host and VMs are not updated for several minutes in a row).

Change History (1)

comment:1 by aeichner, 4 years ago

Resolution: invalid
Status: newclosed

No activity for 2 years and no complete VBox.log, reopen if still an issue and attach a completel VBox.log.

Note: See TracTickets for help on using tickets.

© 2023 Oracle
ContactPrivacy policyTerms of Use