VirtualBox

Opened 2 years ago

Closed 22 months ago

#20763 closed defect (fixed)

watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]

Reported by: AA
Owned by:
Component: other
Version: VirtualBox 6.1.30
Keywords:
Cc:
Guest type: other
Host type: other

Description

Ever since upgrading to Ubuntu 21.04, my VirtualBox Version 6.1.30 r148432 VMs are experiencing soft lockups:

Jan 5 19:26:24 ubuntu kernel: [483574.483756] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
Jan 6 11:36:36 ubuntu kernel: [541789.358312] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [swapper/1:0]
Jan 6 13:37:15 ubuntu kernel: [549027.924906] watchdog: BUG: soft lockup - CPU#1 stuck for 21s! [swapper/1:0]
Jan 6 13:37:15 ubuntu kernel: [549027.925052] do_softirq+0xce/0x281
Jan 6 13:37:15 ubuntu kernel: [549027.925059] do_softirq_own_stack+0x3d/0x50
Jan 7 05:12:07 ubuntu kernel: [605122.533658] watchdog: BUG: soft lockup - CPU#1 stuck for 21s! [swapper/1:0]

Not sure what other information to report. The host runs OS X 11.2.1 on a 4 GHz Quad-Core Intel Core i7, and the guest reports:

$ uname -a
Linux ubuntu 5.11.0-41-generic #45-Ubuntu SMP Fri Nov 5 11:37:01 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 21.04
Release: 21.04
Codename: hirsute

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.11.0-41-generic root=/dev/mapper/cdi--vg-root ro splash quiet
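
When collecting evidence for a report like this, it can help to count the lockup messages per CPU instead of reading the log by eye. The following is a minimal sketch in Python, assuming the kernel log uses exactly the watchdog message format quoted above; the function names `find_soft_lockups` and `lockups_per_cpu` are made up for illustration.

```python
import re
from collections import Counter

# Matches the watchdog message format quoted in this ticket, e.g.
# "watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]"
LOCKUP_RE = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<task>[^\]]+)\]"
)


def find_soft_lockups(log_text):
    """Return a (cpu, seconds, task) tuple for every soft-lockup message."""
    return [
        (int(m.group("cpu")), int(m.group("secs")), m.group("task"))
        for m in LOCKUP_RE.finditer(log_text)
    ]


def lockups_per_cpu(log_text):
    """Count soft-lockup messages per CPU number."""
    return Counter(cpu for cpu, _secs, _task in find_soft_lockups(log_text))


# Three lines taken from the log excerpt in the description.
sample = (
    "Jan 5 19:26:24 ubuntu kernel: [483574.483756] watchdog: BUG: soft lockup"
    " - CPU#1 stuck for 22s! [swapper/1:0]\n"
    "Jan 6 13:37:15 ubuntu kernel: [549027.925052] do_softirq+0xce/0x281\n"
    "Jan 7 05:12:07 ubuntu kernel: [605122.533658] watchdog: BUG: soft lockup"
    " - CPU#1 stuck for 21s! [swapper/1:0]\n"
)

events = find_soft_lockups(sample)
counts = lockups_per_cpu(sample)
```

Because the pattern is searched anywhere in the text, it also works on logs where several messages ended up on one line.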

Attachments (4)

VBox.log (96.3 KB ) - added by anhoefler 2 years ago.
VBox log containing multiple lockup occurrences
VBoxHardening.log (341.9 KB ) - added by anhoefler 2 years ago.
VBoxHardening log related to the VBox.log that contains multiple lockup occurrences
VBox_incl_shutdown_after_lock.log (143.8 KB ) - added by anhoefler 2 years ago.
VBox log containing shutdown logs via VBox UI after guest locked up again
2022-06-11_132405.png (27.9 KB ) - added by Edgar WI 2 years ago.
Windows features on or off


Change History (15)

comment:1 by AA, 2 years ago

This seems to happen most frequently when the host OS is under load, possibly including disk access to external USB drives. Would a renice of the VirtualBox process help? Not sure how the hypervisor scheduling priorities relate to user space scheduling on the host.

comment:2 by r19h72, 2 years ago

I've got this issue on one machine only. I tried the latest VirtualBox, the latest Ubuntu, the latest Alpine Linux, two CPUs, and 6 CPUs. Under full load the "CPU#0 stuck for 1800s" message appears for CPUs 0-5, with stuck times between 37s and 6705s. The CPU is a Ryzen 5 1600.

comment:3 by anhoefler, 2 years ago

I can confirm this happens very frequently, but in my case it's likely the guest system's load that is causing the soft lockups. On the host there is little activity apart from VirtualBox.

The host system is Windows 10 20H2 (Build 19042.1526) with 64GB RAM (32GB of which are still free when the lockups occur). VirtualBox is version 6.1.32.

Guest:
OS: CentOS release 6.10 (Final)
VirtualBox Guest tools: 6.1.32
uname -a: Linux dev_box 2.6.32-754.35.1.el6.x86_64 #1 SMP Sat Nov 7 12:42:14 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

I will attach logs from a recent run during which the lockups happened very often.

by anhoefler, 2 years ago

Attachment: VBox.log added

VBox log containing multiple lockup occurrences

by anhoefler, 2 years ago

Attachment: VBoxHardening.log added

VBoxHardening log related to the VBox.log that contains multiple lockup occurrences

comment:4 by anhoefler, 2 years ago

Maybe some additional observations:

  • this behavior has been occurring for us since, I think, VB 6.1.28 (but I can't tell exactly whether it also happened with prior versions)
  • I often managed to get it unstuck by SSHing into the machine with another session (pinging the machine is not enough; it seems the guest really needs to do some more processing to get the rest unstuck)
  • it seems to happen only on Windows hosts (Hyper-V?); colleagues on Linux hosts never experienced this in recent weeks (unfortunately we no longer have Linux hosts around to test and compare)
  • I tried around with provoking the lockup with the "stress" tool but was unable to find a reliable combination that

I now also have logs of a locked machine that I turned off afterwards using the VBox UI. They contain some more info; maybe it's helpful.

by anhoefler, 2 years ago

VBox log containing shutdown logs via VBox UI after guest locked up again

in reply to:  4 comment:5 by Spiffo, 2 years ago

Replying to anhoefler:

I had luck getting the virtual machines to behave by disabling Windows Sandbox, Microsoft Defender Application Guard and Virtual Machine Platform in Windows Features (the "Turn Windows features on or off" dialog). The main culprit is Defender Application Guard: it runs applications sandboxed, is enabled by default on Windows 11 Pro, and was the last thing I disabled.

Host OS: Windows 11 Version 21H2 Build 22000.613
VirtualBox 6.1.32 r149290

Windows Sandbox is a modified Hyper-V VM shipped with Windows 10/11 Pro, so make sure you have it disabled when using VirtualBox.

Hope this information helps

comment:6 by Kotori, 2 years ago

Host: Windows 10 21H2 19044.1645
VirtualBox 6.1.32 r149290
Guest: Debian 11

Had this issue, removed Windows Sandbox from the Host and rebooted. Fixed.

comment:7 by anhoefler, 2 years ago

Unfortunately, removing Windows Sandbox did not solve the issue on our side. One colleague did not even have that feature installed; on my machine I could remove it, but that didn't change the behavior.

I tried in the meantime two further things:

  • set 'VBoxManage setextradata "<VM Name>" "VBoxInternal/NEM/UseRing0Runloop" 0'

This made things worse. The VM started somewhat, but either locked up completely pretty soon or became so slow that I couldn't distinguish it from a locked state.

  • increased the number of virtual CPUs to 6 (the physical number on my machine)

This did not solve the soft lockup but improved the situation somewhat, in that the locked CPUs get woken up more regularly by activity on other CPUs (I guess). It lowers the probability that all CPUs lock up at the same time and bring the whole VM to a halt.

So unfortunately no change for now for us.
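
For anyone repeating the extradata experiment above, here is a minimal sketch in Python of how the commands could be assembled, including one to undo the change afterwards. It assumes `VBoxManage` is on the PATH; the helper name `extradata_cmd` and the VM name "My VM" are placeholders, not part of the report. Calling `setextradata` without a value removes the key again.

```python
import subprocess

# Key taken from the comment above.
KEY = "VBoxInternal/NEM/UseRing0Runloop"


def extradata_cmd(vm_name, key, value=None):
    """Build the argv list for VBoxManage setextradata.

    With a value the key is set; with value=None the value argument is
    omitted, which removes the key from the VM's extradata.
    """
    cmd = ["VBoxManage", "setextradata", vm_name, key]
    if value is not None:
        cmd.append(str(value))
    return cmd


def run(cmd):
    """Execute a prepared command; raises CalledProcessError on failure."""
    return subprocess.run(cmd, check=True)


vm = "My VM"  # placeholder VM name, replace with your own
set_cmd = extradata_cmd(vm, KEY, 0)   # the setting tried in the comment
reset_cmd = extradata_cmd(vm, KEY)    # removes the key again
# run(set_cmd)    # uncomment to actually apply the setting
# run(reset_cmd)  # uncomment to revert it
```

Passing the arguments as a list avoids shell quoting problems with VM names that contain spaces.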

Last edited 2 years ago by anhoefler

by Edgar WI, 2 years ago

Attachment: 2022-06-11_132405.png added

Windows features on or off

comment:8 by Edgar WI, 2 years ago

Until yesterday evening VirtualBox was running perfectly under Windows 11. This morning some Windows updates and patches were applied (company policy), and after that I could not start any of my virtual images (Windows Server, RedHat, Debian, etc).

There is no issue with the virtual images themselves, as they run perfectly under Slackware 14.2.

After applying all the above suggestions, one by one and rebooting the machine each time, I was back in business.

The last action was to turn off "Microsoft Defender Application Guard", but I do not know whether only that last action was needed.

comment:9 by fth0, 2 years ago

The key is to disable Hyper-V completely on the Windows host, which can be tricky and often involves more than just disabling Windows features. See "HMR3Init: Attempting fall back to NEM (Hyper-V is active)" (https://forums.virtualbox.org/viewtopic.php?f=25&t=99390) for details.

An alternative would be trying the VirtualBox test build 6.1.35r151573 (or newer).
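
To check whether a given VM start actually fell back to NEM, the VBox.log can be searched for the phrase from the forum thread title. A minimal Python sketch follows; it assumes the log contains the text "Attempting fall back to NEM" as that title suggests. The function name `nem_fallback_lines` is made up for illustration, and the sample text reuses the thread title rather than real log output.

```python
# Phrase taken from the forum thread title cited above; the exact wording
# in a real VBox.log is an assumption.
NEM_MARKER = "Attempting fall back to NEM"


def nem_fallback_lines(log_text):
    """Return every line of the log text that mentions the NEM fallback."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if NEM_MARKER in line
    ]


# Placeholder input built from the thread title, not copied from a real log.
sample_log = (
    "some earlier line\n"
    "HMR3Init: Attempting fall back to NEM (Hyper-V is active)\n"
    "some later line\n"
)

hits = nem_fallback_lines(sample_log)
hyperv_probably_active = bool(hits)
```

In practice the text would come from reading the VM's VBox.log file; an empty result would suggest that no fallback was logged for that start.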

comment:10 by anhoefler, 22 months ago

With 6.1.36 and the fix of #20787, this one now also seems resolved for us. We haven't had a single occurrence since the update.

comment:11 by galitsyn, 22 months ago

Resolution: fixed
Status: new → closed

Thank you for the feedback. Closing.
