VirtualBox

Ticket #20090 (new defect)

Opened 8 months ago

Last modified 4 days ago

Black screen on EFI VM

Reported by: CornFlake
Owned by:
Component: EFI
Version: VirtualBox 6.1.16
Keywords:
Cc:
Guest type: other
Host type: Linux

Description

New laptop with an Intel i5-1035G1 CPU. Attempting to boot an EFI VM results in a black screen and 100% CPU utilization, regardless of the guest OS. The EFI shell doesn't come up either. Discussed in this thread: https://forums.virtualbox.org/viewtopic.php?f=7&t=100940

Attached is a copy of a blank VM with EFI enabled and its logs. I'm also attaching a copy of dmesg, top, and top+H.

Attachments

Test.zip (66.4 KB) - added by CornFlake 8 months ago.
Blank VM with EFI enabled
dmesg.txt (67.9 KB) - added by CornFlake 8 months ago.
top.txt (10.0 KB) - added by CornFlake 8 months ago.
top+H.txt (10.0 KB) - added by CornFlake 8 months ago.
win10.tar.gz (30.2 KB) - added by mikeg87 5 months ago.
win10.tar.2.gz (30.2 KB) - added by mikeg87 5 months ago.
cpuinfo (13.9 KB) - added by mikeg87 5 months ago.
fix_virtualbox.patch (473 bytes) - added by Razorback 3 weeks ago.
Kernel patch for VB

Change History

Changed 8 months ago by CornFlake

Blank VM with EFI enabled

Changed 8 months ago by CornFlake

Changed 8 months ago by CornFlake

Changed 8 months ago by CornFlake

comment:1 Changed 8 months ago by fth0

As can be seen from my posts in the forum thread, VirtualBox doesn't know several CPUID Structured Extended Feature Flags. Additionally, the vCPU is running endlessly in the midst of the (U)EFI BIOS loading (possibly somewhere near the watchdog and timer initialization, but I'm not sure about that).

Last edited 8 months ago by fth0

Changed 5 months ago by mikeg87

Changed 5 months ago by mikeg87

Changed 5 months ago by mikeg87

comment:2 Changed 5 months ago by mikeg87

I ran into very similar 'black screen' behavior with an 11th-gen i7. I've attached a test VM that reproduces this issue, along with a dump of /proc/cpuinfo in case that is helpful. This VM works fine on another machine with an older-gen i5.

comment:3 Changed 4 months ago by grahamperrin

Cc: grahamperrin

comment:4 Changed 4 months ago by FreddyW

Same here: black screen on a laptop with an 11th-gen i7 using EFI, while everything works fine on 7th- and 8th-gen i5 machines. The host systems are Fedora 33 and 34 (both exhibit the same behaviour); the guests tested are Fedora 32, 33 and 34, and Windows.

Last edited 4 months ago by FreddyW

comment:5 Changed 3 months ago by Hawk128

Hi all, I can confirm the issue. I used a Dell 5510 laptop with an i7-10850H running Ubuntu 20.04, and VirtualBox worked well. I moved the whole system to a new Dell 5520 with an i7-1185G7, and now I have the issue.

comment:6 Changed 3 months ago by dgrisby

I also have this problem on a Dell XPS 13 with an 11th gen i7-1165G7, running Fedora 33. I also see the thread named "EMT" spinning at 100% CPU. In case it is helpful, here is a gdb backtrace of the thread:

#0  0x00007fe66d55f5db in ioctl () from /lib64/libc.so.6
#1  0x00007fe66d35e4ed in suplibOsIOCtlFast (pThis=<optimized out>, uFunction=<optimized out>, 
    idCpu=<optimized out>)
    at /usr/src/debug/VirtualBox-6.1.18-1.fc33.x86_64/src/VBox/HostDrivers/Support/linux/SUPLib-linux.cpp:212
#2  0x00007fe64880e8d9 in VMMR3HmRunGC (pVM=0x7fe64806e000, pVCpu=0x7fe648059000)
    at /usr/src/debug/VirtualBox-6.1.18-1.fc33.x86_64/src/VBox/VMM/VMMR3/VMM.cpp:1100
#3  0x00007fe6487a5eb4 in emR3HmExecute (pVM=0x7fe64806e000, pVCpu=0x7fe648059000, pfFFDone=0x7fe6250c9d57)
    at /usr/src/debug/VirtualBox-6.1.18-1.fc33.x86_64/src/VBox/VMM/VMMR3/EMHM.cpp:418
#4  0x00007fe6487a3c3e in EMR3ExecuteVM (pVM=<optimized out>, pVCpu=<optimized out>)
    at /usr/src/debug/VirtualBox-6.1.18-1.fc33.x86_64/src/VBox/VMM/VMMR3/EM.cpp:2657
#5  0x00007fe648806990 in vmR3EmulationThreadWithId (hThreadSelf=<optimized out>, pUVCpu=0x7fe6250d16a0, idCpu=0)
    at /usr/src/debug/VirtualBox-6.1.18-1.fc33.x86_64/src/VBox/VMM/VMMR3/VMEmt.cpp:237
#6  0x00007fe66d29b7d4 in rtThreadMain (pThread=0x7fe6140062f0, NativeThread=<optimized out>, 
    pszThreadName=<optimized out>)
    at /usr/src/debug/VirtualBox-6.1.18-1.fc33.x86_64/src/VBox/Runtime/common/misc/thread.cpp:727
#7  0x00007fe66d3534ee in rtThreadNativeMain (pvArgs=0x7fe6140062f0)
    at /usr/src/debug/VirtualBox-6.1.18-1.fc33.x86_64/src/VBox/Runtime/r3/posix/thread-posix.cpp:384
#8  0x00007fe66d6423f9 in start_thread () from /lib64/libpthread.so.0
#9  0x00007fe66d568b53 in clone () from /lib64/libc.so.6

I'm happy to collect any other information that might be helpful.

Last edited 3 months ago by dgrisby

comment:7 Changed 3 months ago by dgrisby

A further possibly interesting piece of information: I booted an EFI-using VM on a host machine with an i7-9700K, then saved its state. I copied all the files across to the machine with the i7-1165G7 and the VM happily resumed and appeared to be working fine. That suggests that the problem is just with the EFI boot, and that if the VM got past that stage, it would work for everything else.

comment:8 Changed 3 months ago by fth0

Please check if this issue is solved by VirtualBox 6.1.20.

comment:9 Changed 3 months ago by FreddyW

VirtualBox 6.1.20 does not solve this issue for me. BIOS fine, EFI not.

comment:10 Changed 3 months ago by Hawk128

virtualbox-6.1_6.1.20-143896_Ubuntu_eoan_amd64.deb does not solve this issue for me either.

Last edited 3 months ago by Hawk128

comment:11 Changed 3 months ago by FreddyW

VirtualBox 6.1.22 does not solve this issue either. I guess this one is really hard to crack for some reason.

comment:12 Changed 2 months ago by FreddyW

Any news on this one? The bug report has been open for 5 months now. I am forced to use an old laptop for VirtualBox in the meantime. I am seriously thinking about ditching VB and switching to VMware Player, GNOME Boxes or virt-manager.

comment:13 Changed 2 months ago by Hawk128

I switched to virt-manager - works even better for me...

comment:14 Changed 7 weeks ago by fth0

I'll make an educated guess:

Please reboot your Linux host, providing the kernel parameter split_lock_detect=off. Does the issue still occur?
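
For anyone trying this suggestion, here is a minimal Python sketch for confirming that the parameter actually reached the running kernel after the reboot. It only reads /proc/cmdline, the same file shown in the system outputs of comment:17. The helper name `kernel_param` is hypothetical, and the sketch assumes a simple whitespace-separated command line without quoted values.

```python
from pathlib import Path


def kernel_param(cmdline: str, name: str):
    """Return the value of `name` on a kernel command line, or None if absent.

    A bare flag without '=' (for example 'quiet') is reported as an empty string.
    If the parameter is repeated, the last occurrence is returned; that choice
    is an assumption of this sketch, not documented kernel behaviour.
    """
    value = None
    for token in cmdline.split():
        key, _, val = token.partition("=")
        if key == name:
            value = val
    return value


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    path = Path("/proc/cmdline")
    if path.exists():
        print("split_lock_detect =", kernel_param(path.read_text(), "split_lock_detect"))
```

If this reports `None` after the reboot, the boot loader configuration was probably not regenerated, so the test of the workaround would not be meaningful yet.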

comment:15 Changed 7 weeks ago by luizfernando

I was having the same issue, and the change in kernel parameter proposed by fth0 solved it.

P.S.: Before trying this, I tried the developer snapshot 6.1.97-142300. It solved the EFI issue, but created a lot of new issues on my machine.

My system is:
Computer: Acer Aspire A515-55
OS: Fedora 34 Workstation
Processor: Intel® Core™ i5-1035G1 CPU @ 1.00GHz × 8
Graphics: Mesa Intel® UHD Graphics (ICL GT1)
Kernel: 5.12.8
VirtualBox Version: 6.1.22

comment:16 Changed 7 weeks ago by dgrisby

An initial test suggests that split_lock_detect=off also solves the issue for me on a Dell XPS-13 with 11th gen i7-1165G7 running Fedora 34. Thanks for the suggestion fth0!

comment:17 Changed 7 weeks ago by seblu

split_lock_detect=off didn't fix the issue on my Dell XPS 13 9310 (i7-1185G7).

Some system outputs

# uname -a
Linux foo 5.13.0-rc4-seblu #1 SMP PREEMPT Mon May 31 03:25:45 CEST 2021 x86_64 GNU/Linux
# cat /proc/cmdline
i915.enable_psr=0 split_lock_detect=off quiet
# dmesg|tail
[  629.781305] vboxdrv: 000000003d4de0d3 VBoxEhciR0.r0
[  629.782757] VMMR0InitVM: eflags=246 fKernelFeatures=0x0 (SUPKERNELFEATURES_SMAP=0)
[  629.801848] device vboxnet0 entered promiscuous mode
[  633.598424] device vboxnet0 left promiscuous mode
[  633.709048] 
               !!Assertion Failed!!
               Expression: pCritSect->s.Core.NativeThreadOwner == hNativeSelf
               Location  : /build/virtualbox/src/VirtualBox-6.1.22/src/VBox/VMM/VMMAll/PDMAllCritSect.cpp(575) int PDMCritSectLeave(PPDMCRITSECT)
[  633.709060] ffffbbd4c5d71000 <R3_STRING>: ffffffffffffffff != 00007ff04c146640; cLockers=-1 cNestings=1
[  633.729645] vboxnetflt: 0 out of 6 packets were not sent (directed to host)

Vbox logs

00:00:04.510348 VUSB: Detached 'HidMouse' from port 1 on RootHub#1
00:00:04.510410 VUSB: Detached 'HidMouse' from port 2 on RootHub#1
00:00:04.510795 
00:00:04.510795 !!R0-Assertion Failed!!
00:00:04.510796 Expression: pCritSect->s.Core.NativeThreadOwner == hNativeSelf
00:00:04.510796 Location  : /build/virtualbox/src/VirtualBox-6.1.22/src/VBox/VMM/VMMAll/PDMAllCritSect.cpp(575) int PDMCritSectLeave(PPDMCRITSECT)
00:00:04.510800 ffffbbd4c78fd000 <R3_STRING>: ffffffffffffffff != 00007f304c130640; cLockers=-1 cNestings=1
00:00:04.566137 GIM: KVM: Resetting MSRs
00:00:04.567177 Changing the VM state from 'DESTROYING' to 'TERMINATED'
00:00:04.571336 Console: Machine state changed to 'PoweredOff'
00:00:05.093529 GUI: Passing request to close Runtime UI from machine-logic to UI session.
00:00:05.093926 GUI: Aborting startup due to invalid machine state detected: 1

comment:18 Changed 5 weeks ago by CornFlake

I can also confirm that "split_lock_detect=off" solved the issue from my original post. Thanks fth0!

comment:19 follow-up: ↓ 20 Changed 4 weeks ago by Tatsh

I have a much older CPU (i7-5930K) so split_lock_detect=off does not apply to mine (and does not fix this issue). I am getting the same assertion error in kernel log for EFI-based VMs.

I am on Linux 5.13.0 (Gentoo) with VirtualBox 6.1.22. Everything is up-to-date.

 $ uname -a
Linux limelight 5.13.0-gentoo-limelight #2 SMP Wed Jun 30 01:24:37 EDT 2021 x86_64 Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz GenuineIntel GNU/Linux
tatsh@limelight ~
 $ journalctl -n 15 -k
-- Journal begins at Wed 2021-01-20 07:08:58 EST, ends at Wed 2021-06-30 01:54:56 EDT. --
Jun 30 01:37:56 limelight kernel:
                                  !!Assertion Failed!!
                                  Expression: pCritSect->s.Core.NativeThreadOwner == hNativeSelf
                                  Location  : /var/tmp/portage/app-emulation/virtualbox-6.1.22/work/VirtualBox-6.1.22/src/VBox/VMM/VMMAll/PDMAllCritSect.cpp(575) int PDMCritSectLe>
Jun 30 01:37:56 limelight kernel: ffffb28e4cd6d000 <R3_STRING>: ffffffffffffffff != 00007f25141dc640; cLockers=-1 cNestings=1
Jun 30 01:38:05 limelight kernel: vboxdrv: 00000000c31ac7c3 VMMR0.r0
Jun 30 01:38:05 limelight kernel: vboxdrv: 00000000e8839b3f VBoxDDR0.r0
Jun 30 01:38:05 limelight kernel: VMMR0InitVM: eflags=246 fKernelFeatures=0x0 (SUPKERNELFEATURES_SMAP=0)
Jun 30 01:38:06 limelight kernel:
                                  !!Assertion Failed!!
                                  Expression: pCritSect->s.Core.NativeThreadOwner == hNativeSelf
                                  Location  : /var/tmp/portage/app-emulation/virtualbox-6.1.22/work/VirtualBox-6.1.22/src/VBox/VMM/VMMAll/PDMAllCritSect.cpp(575) int PDMCritSectLe>
Jun 30 01:38:06 limelight kernel: ffffb28e45cd0000 <R3_STRING>: ffffffffffffffff != 00007fe7ac199640; cLockers=-1 cNestings=1
Jun 30 01:38:56 limelight kernel: vboxdrv: 00000000b726513d VMMR0.r0
Jun 30 01:38:56 limelight kernel: vboxdrv: 0000000038008452 VBoxDDR0.r0
Jun 30 01:38:56 limelight kernel: vboxdrv: 00000000e77c1821 VBoxEhciR0.r0
Jun 30 01:38:56 limelight kernel: VMMR0InitVM: eflags=246 fKernelFeatures=0x0 (SUPKERNELFEATURES_SMAP=0)
Jun 30 01:39:19 limelight kernel: vboxdrv: 0000000046693f48 VMMR0.r0
Jun 30 01:39:19 limelight kernel: vboxdrv: 00000000d91ea0cd VBoxDDR0.r0
Jun 30 01:39:19 limelight kernel: vboxdrv: 00000000dc26d0ad VBoxEhciR0.r0
Jun 30 01:39:19 limelight kernel: VMMR0InitVM: eflags=246 fKernelFeatures=0x0 (SUPKERNELFEATURES_SMAP=0)

comment:20 in reply to: ↑ 19 ; follow-up: ↓ 24 Changed 4 weeks ago by fth0

Replying to Tatsh:

I have a much older CPU (i7-5930K) so split_lock_detect=off does not apply to mine (and does not fix this issue). I am getting the same assertion error in kernel log for EFI-based VMs.

I agree that you and seblu get the same assertion failure, but I don't see it in the dmesg.log of CornFlake. So it is possible that you both are having a different issue.
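
To tell the two symptoms apart in one's own logs, a minimal Python sketch like the following could list the assertion expressions found in a kernel log or VBox.log excerpt. The function name `assertion_expressions` and the three-line look-ahead are assumptions based only on the log excerpts posted in this ticket, not on any documented log format.

```python
import re

# Matches the 'Expression: ...' line that follows an assertion banner.
EXPRESSION_RE = re.compile(r"Expression:\s*(.+)")


def assertion_expressions(log_text: str):
    """Return the expression of every 'Assertion Failed!!' block in a log."""
    found = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        # Covers both '!!Assertion Failed!!' and '!!R0-Assertion Failed!!'.
        if "Assertion Failed!!" not in line:
            continue
        # In the excerpts above, 'Expression:' is on the next line;
        # look a little further to tolerate interleaved lines.
        for follow in lines[i + 1 : i + 4]:
            match = EXPRESSION_RE.search(follow)
            if match:
                found.append(match.group(1).strip())
                break
    return found
```

An empty result for a log from a hanging EFI boot would point towards the original black-screen problem, while a hit on `NativeThreadOwner` would point towards the assertion failure reported by seblu and Tatsh.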

Changed 3 weeks ago by Razorback

Kernel patch for VB

comment:21 follow-up: ↓ 22 Changed 3 weeks ago by Razorback

This patch didn't help :(

comment:22 in reply to: ↑ 21 Changed 3 weeks ago by fth0

Replying to Razorback:

This patch didn't help :(

Are you posting to the correct issue (I don't see any relation)?

Last edited 3 weeks ago by fth0

comment:23 Changed 3 weeks ago by Razorback

I've just had the same

!!Assertion Failed!!
                                  Expression: pCritSect->s.Core.NativeThreadOwner == hNativeSelf
                                  Location  : /var/tmp/portage/app-emulation/virtualbox-6.1.22/work/VirtualBox-6.1.22/src/VBox/VMM/VMMAll/PDMAllCritSect.cpp(575) int PDMCritSectLe>
Jun 30 01:38:06 limelight kernel: ffffb28e45cd0000 <R3_STRING>: ffffffffffffffff != 00007fe7ac199640; cLockers=-1 cNestings=1

and thought that was it.

Last edited 3 weeks ago by Razorback

comment:24 in reply to: ↑ 20 ; follow-ups: ↓ 25 ↓ 28 Changed 3 weeks ago by Tatsh

Replying to fth0:

Replying to Tatsh:

I have a much older CPU (i7-5930K) so split_lock_detect=off does not apply to mine (and does not fix this issue). I am getting the same assertion error in kernel log for EFI-based VMs.

I agree that you and seblu get the same assertion failure, but I don't see it in the dmesg.log of CornFlake. So it is possible that you both are having a different issue.

What information can I provide to get further assistance?

comment:25 in reply to: ↑ 24 Changed 3 weeks ago by fth0

Replying to Tatsh:

Replying to fth0:

Replying to Tatsh:

I have a much older CPU (i7-5930K) so split_lock_detect=off does not apply to mine (and does not fix this issue). I am getting the same assertion error in kernel log for EFI-based VMs.

I agree that you and seblu get the same assertion failure, but I don't see it in the dmesg.log of CornFlake. So it is possible that you both are having a different issue.

What information can I provide to get further assistance?

I'd suggest that you create your own bug ticket with a title and a description more specific to your problem (e.g. mentioning the assertion regarding NativeThreadOwner), and hope that a VirtualBox developer takes an interest.

comment:26 Changed 3 weeks ago by Tatsh

fth0: Kernel 5.13 was marked as bad by Gentoo due to this issue: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ff4b2b4014cbffb3d32b22629252f4dc8616b0fe

Does it seem related?

I downgraded back to 5.12 so now VirtualBox works fine. I can test 5.13 when the next stable release comes out.

comment:27 Changed 3 weeks ago by Tatsh

Tested and that's not the issue.

comment:28 in reply to: ↑ 24 Changed 2 weeks ago by fth0

Replying to Tatsh:

What information can I provide to get further assistance?

Can you provide a VBox.log file from a failing VM run and the corresponding part of the kernel log?

Last edited 2 weeks ago by fth0

comment:29 Changed 2 weeks ago by fth0

Please try one of the VirtualBox test builds 6.1.23r145550 or later, which should solve the issue of the OP and others, but not the issues with Linux kernel 5.13, and report back. TIA.

comment:30 follow-up: ↓ 32 Changed 11 days ago by bird

The problem with 5.13 is the addition and enabling of CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT. Adding randomize_kstack_offset=0 to the Linux kernel command line should disable this and avoid the VERR_INVALID_STATE Guru Meditation.
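
To check whether a given host kernel was built with this option enabled, a minimal Python sketch along these lines might help. It assumes the distribution installs the build configuration as /boot/config-<release>, which is common but not universal, and the helper name `config_value` is hypothetical.

```python
import platform
from pathlib import Path

OPTION = "CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT"


def config_value(config_text: str, option: str):
    """Return the value of `option` in kernel .config text.

    Returns 'n' for '# OPTION is not set', the text after '=' for
    'OPTION=...', and None if the option is not mentioned at all.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


if __name__ == "__main__":
    # Assumed location of the running kernel's build config; may not exist.
    config_path = Path("/boot") / f"config-{platform.release()}"
    if config_path.exists():
        print(OPTION, "=", config_value(config_path.read_text(), OPTION))
    else:
        print(f"{config_path} not found; check your distribution's config location")
```

A result of `y` on a 5.13 host means the randomization is on by default, which is the situation where the randomize_kstack_offset=0 workaround applies.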

comment:31 Changed 11 days ago by fth0

Please try the VirtualBox test builds 6.1.23r145697 (or newer) that are supposed to fix VirtualBox issues on hosts with Linux kernel 5.13, and report back. TIA.

comment:32 in reply to: ↑ 30 ; follow-up: ↓ 33 Changed 11 days ago by Tatsh

Replying to bird:

The problem with 5.13 is the addition and enabling of CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT. Adding randomize_kstack_offset=0 to the linux kernel command line should disable this and avoid the VERR_INVALID_STATE guru.

This fixed my issue. I rebuilt my kernel with CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=n and now VirtualBox EFI machines start fine.

Replying to fth0:

Please try the VirtualBox test builds 6.1.23r145697 (or newer) that are supposed to fix VirtualBox issues on hosts with Linux kernel 5.13, and report back. TIA.

Happy to test, but are these new builds supposed to fix this for users with CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y on 5.13?

comment:33 in reply to: ↑ 32 Changed 11 days ago by fth0

Replying to Tatsh:

Replying to fth0:

Please try the VirtualBox test builds 6.1.23r145697 (or newer) that are supposed to fix VirtualBox issues on hosts with Linux kernel 5.13, and report back. TIA.

Happy to test but are these new builds supposed to fix for users using CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y on 5.13?

The new test builds are supposed to make adding randomize_kstack_offset=0 to the Linux kernel command line unnecessary.

comment:34 Changed 11 days ago by klaus

To state it clearly: yes, 6.1.23r145697 (or newer) works with Linux 5.13 with the default config (CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y), without needing any kernel command line tweaking.

comment:35 Changed 4 days ago by Tatsh

I updated to 6.1.24 and re-enabled CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT and everything works like normal.

comment:36 Changed 4 days ago by grahamperrin

<https://www.virtualbox.org/wiki/Changelog-6.1#v24> (2021-07-20) references this bug:

EFI: Stability improvements (bug #20090)
