VirtualBox

Ticket #19171 (new defect)

Opened 7 months ago

Last modified 5 months ago

On Ubuntu 18 host, Win10-64bit guest goes to GURU with VERR_VMX_UNABLE_TO_START_VM

Reported by: juhan
Owned by:
Component: other
Version: VirtualBox 6.1.0
Keywords: guru,VERR_VMX_UNABLE_TO_START_VM
Cc: j.uha@…
Guest type: Windows
Host type: Linux

Description

Dear community,

First of all, I'd like to thank all of you for supporting this awesome project! I'm very grateful to be able to use such great software in our environment.

Now to my problem. I'm having trouble with a Win10 64-Bit guest on Ubuntu 18.04.3 LTS. The machine has a rather new Intel Core i7-9700, with 8 cores, and 32GB RAM. VT is available and turned on. The guest gets 8 logical cores and 16GB RAM.

The VM starts normally and works as expected. Everything worked fine for weeks until I ran the update of a DATEV software collection (a huge accounting software package from Germany) installed on the guest. The update runs as expected, w/o error, until one particular point where the VM changes to Guru Meditation. This behavior is reproducible.

A look at the logs shows that 'VERR_VMX_UNABLE_TO_START_VM' might be the issue here. Looking deeper shows that VT-x is supported, active and running as expected.

Two particular log lines remind me of this issue, which I found while searching for a solution:

00:01:11.851397 HM: VERR_VMX_UNABLE_TO_START_VM: VM-entry allowed-1  0x3ffff
00:01:11.851399 HM: VERR_VMX_UNABLE_TO_START_VM: VM-entry allowed-0  0x11ff

Unfortunately, as far as I understand the discussion of the issue above, it was never solved?!

I honestly can't interpret these lines, but aren't they contradictory? VMX could be allowed or not, but both?! Could it be that the DATEV update somehow tries to start a VM inside the guest? I activated Hyper-V on the guest OS, btw.
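
On the "contradictory" question: a minimal sketch of how the two masks are usually read, assuming the log labels follow the Intel SDM convention for the VMX capability MSRs (allowed-0: bits set there must be 1 in the control word; allowed-1: bits clear there must be 0). Under that assumption the two values are complementary constraints rather than a contradiction. The helper names are hypothetical and are not taken from VirtualBox.

```python
# Values copied from the two HM log lines above.
ALLOWED0 = 0x11ff    # assumed meaning: bits set here MUST be 1 in the control word
ALLOWED1 = 0x3ffff   # assumed meaning: bits clear here MUST be 0 in the control word

# Bits that are neither forced to 1 nor forced to 0, i.e. freely selectable.
FLEXIBLE = ALLOWED1 & ~ALLOWED0


def set_bits(mask):
    """Return the positions of the bits set in mask, lowest first."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def is_valid_control(value, allowed0=ALLOWED0, allowed1=ALLOWED1):
    """Check a hypothetical VM-entry control word against both masks."""
    must_be_one_ok = (value & allowed0) == allowed0
    must_be_zero_ok = (value & ~allowed1) == 0
    return must_be_one_ok and must_be_zero_ok


if __name__ == "__main__":
    print("must be 1 :", set_bits(ALLOWED0))
    print("may be 1  :", set_bits(ALLOWED1))
    print("flexible  :", set_bits(FLEXIBLE))
```

Read this way, the two lines only describe which VM-entry control bits the CPU accepts; they do not by themselves say why the VM entry failed.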

I already tried to activate/deactivate different settings around VT-x support including "VT-X/AMD-V", "Nested VT-X/AMD-V", "PAE/NX", "Nested Paging" etc., without any changes.

I would be very grateful for any help, thank you very much in advance!

Best regards

PS: If this is the wrong place for this issue, I apologize politely and will post my problem in the forum.

Attachments

VBox.zip Download (148.9 KB) - added by juhan 7 months ago.
VBox.log (zipped because of size restrictions)
DATEV_Installer_VBox.log.gz Download (100.2 KB) - added by softjury 5 months ago.
DATEV_Installer_VBox.png Download (387.6 KB) - added by softjury 5 months ago.

Change History

Changed 7 months ago by juhan

VBox.log (zipped because of size restrictions)

comment:1 Changed 6 months ago by juhan

I rolled back to version 6.0.14 and the system seems to run stably. The Guru Meditation state has not occurred since then.

comment:2 Changed 6 months ago by aeichner

Do you have anything else running on the host which could use the hardware virtualization support of your CPU, any KVM/qemu guests running for instance?
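
One way to answer this on a Linux host is to look for loaded KVM kernel modules. The following is a small sketch, assuming the usual `/proc/modules` layout (module name, size, reference count, ...) and the standard module names `kvm`, `kvm_intel` and `kvm_amd`; the function name is hypothetical.

```python
# Kernel modules that indicate KVM is loaded on the host.
KVM_MODULES = {"kvm", "kvm_intel", "kvm_amd"}


def loaded_kvm_modules(proc_modules_text):
    """Map each loaded KVM module name to its reference count.

    proc_modules_text is the content of /proc/modules, where every line
    starts with: <name> <size> <refcount> ...
    """
    found = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] in KVM_MODULES:
            found[fields[0]] = int(fields[2])
    return found
```

On the host this could be called as `loaded_kvm_modules(open("/proc/modules").read())`. An empty result means no KVM module is loaded. Whether a nonzero count on `kvm_intel` really means a guest is running is an assumption that should be verified separately.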

comment:3 Changed 5 months ago by fth0

I have analyzed multiple VBox.log files from 2 forum threads (of 4 current ones that I'm aware of) regarding this problem (viewtopic.php?f=2&t=96077, viewtopic.php?f=2&t=96556). My findings (from low level to high level):

VCPU0: Guru Meditation -4005 (VERR_VMX_UNABLE_TO_START_VM)
CPUM0: Disas -> VERR_DIS_INVALID_OPCODE; f3 64 f1 c7 45 fc fe ff ff ff c7 45 fc 01 00 00
VMX_EXIT_XCPT_OR_NMI - 0 - Exception or non-maskable interrupt (NMI).
  • Independent of the number of vCPUs used, one of them is in 32-bit mode, VirtualBox has intercepted an ICEBP instruction (repz fs icebp to be precise), and is failing at the next VM-Enter. FWIW, there was a possibly related problem in VirtualBox 4.3 (see #12410), addressed by ramshankar.
  • The crash happens when running a DATEV proprietary installer for the Microsoft SQL Server Express 2014 (ENU) embedded in DATEV software. As a test, I downloaded a standalone installer from Microsoft, which installed without a problem on my VMs (mind that I didn't observe/reproduce the original problem myself). While the standalone Microsoft installers are separate for 64-bit and 32-bit, the DATEV installer is only one binary, so I could imagine some 64/32-bit checks taking place.
  • The crash reportedly happens on VirtualBox 6.1.x, but not on VirtualBox 6.0.14. Guests are Windows 10, hosts are Linux, CPUs are rather new.
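
The reading of `f3 64 f1` as "repz fs icebp" can be sketched as follows. This is only an illustration, assuming 32-bit code and handling nothing but the legacy one-byte prefixes; it is not a real disassembler, and the function names are hypothetical.

```python
# Legacy x86 instruction prefixes (one byte each).
LEGACY_PREFIXES = {
    0xF0: "lock", 0xF2: "repnz", 0xF3: "repz",
    0x26: "es", 0x2E: "cs", 0x36: "ss", 0x3E: "ds", 0x64: "fs", 0x65: "gs",
    0x66: "opsize", 0x67: "addrsize",
}
ICEBP_OPCODE = 0xF1  # ICEBP, also known as INT1


def split_prefixes(code):
    """Return (list of prefix names, index of the first non-prefix byte)."""
    names = []
    i = 0
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        names.append(LEGACY_PREFIXES[code[i]])
        i += 1
    return names, i


def describe(code):
    """Describe the instruction if the opcode behind the prefixes is ICEBP."""
    names, i = split_prefixes(code)
    if i < len(code) and code[i] == ICEBP_OPCODE:
        return " ".join(names + ["icebp"])
    return "not icebp"


if __name__ == "__main__":
    # Byte dump copied from the CPUM0 log line above.
    dump = bytes.fromhex("f3 64 f1 c7 45 fc fe ff ff ff c7 45 fc 01 00 00")
    print(describe(dump))
```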

Edit: This also happens on Windows hosts with ICEBP alone (like in #12410) and another application (CorelDRAW Graphics Suite X5) in the Windows guest: viewtopic.php?f=2&t=96519

Last edited 5 months ago by fth0

comment:4 follow-up: ↓ 6 Changed 5 months ago by ramshankar

This looks like a regression.

Is there an easy way to reproduce this? I'm not sure where to obtain this "DATEV" software.

comment:5 Changed 5 months ago by softjury

+1 for being affected, running DATEV Software inside Win10 on VBox 6.1.2 on Ubuntu 16.04 LTS.

comment:6 in reply to: ↑ 4 ; follow-up: ↓ 7 Changed 5 months ago by fth0

Replying to ramshankar:

Is there an easy way to reproduce this? I'm not sure where to obtain this "DATEV" software.

User softjury replied to my corresponding question in the 1st forum thread, see  https://forums.virtualbox.org/viewtopic.php?f=2&t=96077&start=15#p468936. I can't tell if it fits your 'easy' requirement, though, and I don't have time to try it myself tonight.

Do you think it would be an alternative reproduction method to take any Windows executable and replace a CPU instruction with ICEBP? Or do you think the ICEBP is planted by VirtualBox?

BTW, in the VBox.log files I've analyzed, the CPU instructions after the ICEBP never make sense, so I think we are either looking at (anti) debugging techniques or at no code at all.

Last edited 5 months ago by fth0

comment:7 in reply to: ↑ 6 ; follow-up: ↓ 10 Changed 5 months ago by ramshankar

Replying to fth0:

Replying to ramshankar:

Is there an easy way to reproduce this? I'm not sure where to obtain this "DATEV" software.

User softjury replied to my corresponding question in the 1st forum thread, see  https://forums.virtualbox.org/viewtopic.php?f=2&t=96077&start=15#p468936. I can't tell if it fits your 'easy' requirement, though, and I don't have time to try it myself tonight.

Do you think it would be an alternative reproduction method to take any Windows executable and replace a CPU instruction with ICEBP? Or do you think the ICEBP is planted by VirtualBox?


VirtualBox does not inject any ICEBP instruction under normal operation.


BTW, in the VBox.log files I've analyzed, the CPU instructions after the ICEBP never make sense, so I think we are either looking at (anti) debugging techniques or at no code at all.


I already tried reproducing the problem with just ICEBP as well as the sequence 'f3 64 f1' (which doesn't make sense to me either). Regardless, neither of these causes any guru meditation on my Windows 10 VM (on an Intel Skylake CPU). So there must be something more to it.

Update: I have now also tried CorelDraw x5 (installing it and launching both CorelDraw x5 and Corel Photo-Paint); both seem to work fine here with a Windows 10 host and guest (Skylake CPU).

Last edited 5 months ago by ramshankar

comment:8 follow-up: ↓ 12 Changed 5 months ago by softjury

Host CPU affected is Ivy Bridge

vendor_id	: GenuineIntel
cpu family	: 6
model		: 58
model name	: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping	: 9
microcode	: 0x21

Same issue on Kaby Lake

vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
stepping	: 9
microcode	: 0xca

ISO for reproduction on Windows 10

 https://dvd-download-akamai.iso-download-datev.de/srv01/l8nF3f5LXqOvgvhP/DATEV_Mittelstand_Faktura_mit_Rechnungswesen_compact_88980_0820_01_20191220_195720.ISO (5.3 GB)

Last edited 5 months ago by softjury

comment:9 Changed 5 months ago by softjury

I can reproduce the crash

  • Install VirtualBox 6.1.2 on Ubuntu Xenial 16.04
  • Take a snapshot
  • Insert the DATEV ISO into the virtual CD drive
  • Run the installer, click
    • "Installation starten"
    • "Weiter >"
    • "Fertig stellen"
    • "Weiter >"
  • When the installer installs SQL Server für DATEV (Express)
    • Installer will reboot Windows once
    • After reboot installer resumes
    • VirtualBox crashes
Last edited 5 months ago by softjury

Changed 5 months ago by softjury

Changed 5 months ago by softjury

comment:10 in reply to: ↑ 7 Changed 5 months ago by fth0

Replying to ramshankar:

I already tried reproducing the problem with just ICEBP as well as the sequence 'f3 64 f1' (which doesn't make sense to me either).

Regarding this one on its own, I'd have an educated guess: if you have a string instruction that contains two prefixes like "REPZ FS CMPSB", a debugger setting a breakpoint might replace the real instruction (CMPSB) with the ICEBP instruction.
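
To make the guess concrete, here is a sketch of what such a replacement would look like at byte level. It assumes the byte form of CMPSB (opcode `a6`) and only illustrates the hypothesis; whether any tool actually patches code this way is not confirmed anywhere in this ticket, and the function name is hypothetical.

```python
CMPSB_OPCODE = 0xA6  # CMPSB, byte-wise string compare
ICEBP_OPCODE = 0xF1  # ICEBP / INT1

# Legacy one-byte prefixes that may precede the opcode.
PREFIXES = frozenset(
    {0xF0, 0xF2, 0xF3, 0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65, 0x66, 0x67}
)


def plant_icebp(code):
    """Overwrite the opcode byte that follows any legacy prefixes with ICEBP."""
    i = 0
    while i < len(code) and code[i] in PREFIXES:
        i += 1
    if i == len(code):
        raise ValueError("no opcode byte found after the prefixes")
    patched = bytearray(code)
    patched[i] = ICEBP_OPCODE
    return bytes(patched)


if __name__ == "__main__":
    original = bytes([0xF3, 0x64, CMPSB_OPCODE])  # repz fs cmpsb
    print(plant_icebp(original).hex(" "))
```

Patching only the opcode byte leaves the prefixes in place, which would explain why they show up in front of ICEBP in the logs.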

comment:11 Changed 5 months ago by fth0

There is a new aspect in the VBox.log file from user softjury. vCPU2 had several VMX_EXIT_XCPT_OR_NMI pairs without a crash, note the RIP value:

00:15:27.619330 CPU[2]: VM-exit history:
00:15:27.619331    Exit No.:     TSC timestamp / delta    RIP (Flat/*)      Exit    Name
00:15:27.619333     7284408: 0x00004ad06ce87b78/+0        00000000676f140a  0x01000 VMX_EXIT_XCPT_OR_NMI - 0 - Exception or non-maskable interrupt (NMI).
00:15:27.619340     7284407: 0x00004ad06ce688e4/-127636   ffffffffffffffff  0x01000 VMX_EXIT_XCPT_OR_NMI - 0 - Exception or non-maskable interrupt (NMI).
...
00:15:27.619720     7284281: 0x00004ad06a3dd3a2/-76058    ffffffffffffffff  0x01000 VMX_EXIT_XCPT_OR_NMI - 0 - Exception or non-maskable interrupt (NMI).
00:15:27.619723     7284280: 0x00004ad06a3ba156/-143948   ffffffffffffffff  0x01000 VMX_EXIT_XCPT_OR_NMI - 0 - Exception or non-maskable interrupt (NMI).
...
00:15:27.620127     7284190: 0x00004ad0687bf7aa/-715036   ffffffffffffffff  0x01000 VMX_EXIT_XCPT_OR_NMI - 0 - Exception or non-maskable interrupt (NMI).
00:15:27.620130     7284189: 0x00004ad06879cece/-141532   ffffffffffffffff  0x01000 VMX_EXIT_XCPT_OR_NMI - 0 - Exception or non-maskable interrupt (NMI).
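
For anyone who wants to scan a VBox.log for such entries, the lines above can be parsed roughly like this. It is a sketch that assumes the column layout shown in this excerpt (timestamp, exit number, TSC/delta, RIP, exit code, name); the names `VmExit` and `parse_exit_line` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class VmExit(NamedTuple):
    exit_no: int
    tsc: int
    delta: int
    rip: int
    exit_code: int
    name: str


# <timestamp> <exit no>: 0x<tsc>/<+-delta> <rip, 16 hex digits> 0x<code> <name> ...
LINE_RE = re.compile(
    r"^\S+\s+(\d+):\s+0x([0-9a-fA-F]+)/([+-]\d+)\s+"
    r"([0-9a-fA-F]{16})\s+0x([0-9a-fA-F]+)\s+(\S+)"
)


def parse_exit_line(line: str) -> Optional[VmExit]:
    """Parse one VM-exit history line; return None for headers and '...'."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    exit_no, tsc, delta, rip, code, name = match.groups()
    return VmExit(int(exit_no), int(tsc, 16), int(delta),
                  int(rip, 16), int(code, 16), name)


def exits_with_known_rip(lines):
    """Return only the entries whose RIP is not the all-ones placeholder."""
    parsed = (parse_exit_line(line) for line in lines)
    return [e for e in parsed if e is not None and e.rip != 0xFFFFFFFFFFFFFFFF]
```

Applied to the excerpt above, `exits_with_known_rip` keeps only the newest entry (exit number 7284408), since all older entries carry the RIP value ffffffffffffffff.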

comment:12 in reply to: ↑ 8 Changed 5 months ago by ramshankar

ISO for reproduction on Windows 10

 https://dvd-download-akamai.iso-download-datev.de/srv01/l8nF3f5LXqOvgvhP/DATEV_Mittelstand_Faktura_mit_Rechnungswesen_compact_88980_0820_01_20191220_195720.ISO (5.3 GB)

Sorry but I cannot download this ISO. I get the error:

" You don't have permission to access "http://dvd-download-akamai.iso-download-datev.de/srv01/l8nF3f5LXqOvgvhP/DATEV_Mittelstand_Faktura_mit_Rechnungswesen_compact_88980_0820_01_20191220_195720.ISO" on this server."

I don't know if this DATEV software is free but please do NOT post links to commercial software that requires a license to legally download or use.

comment:13 Changed 5 months ago by softjury

The DATEV Software is perfectly legal to download and use for 30 days. After 30 days you need a code to further use the software.

It looks like they geo-blocked downloads of the ISO from outside Germany to cut traffic costs, considering that the accounting software offered is only used by customers subject to German tax law.

Please use this download URL instead. This is a reverse proxy server located in Germany which makes it possible for you to download the DVD ISO.

Disclaimer for the lawyers reading this: NO, that reverse proxy server doesn't host that ISO file. The reverse proxy server is merely downloading chunks from the origin server and passing that data as is to the requesting client WITHOUT saving it. At no time does the reverse proxy server have more than a tiny fraction of that ISO file in its volatile memory.

Last edited 5 months ago by softjury

comment:14 Changed 5 months ago by juhan

Let me add one thing, in case it matters. When trying to find a solution back then, I reduced the number of cores from 8 to 4. Afterwards, DATEV also started to crash randomly when just using the software, not only when trying to apply the update.

With 8 cores, at least using the software didn't cause any trouble.

comment:15 Changed 5 months ago by ramshankar

We've finally managed to reproduce this issue. The DATEV program as well as CorelDraw X5 are affected.

It seems increasing the VCPU count really helps because it seems to drastically increase the chance of the VMM exiting to ring-3. And the bug (regression) exists only in the exit-to-ring-3 case when an ICEBP #DB VM-exit occurs.

The fix will be available in the next maintenance release of VirtualBox.

Thank you all for your assistance.

Last edited 5 months ago by ramshankar

comment:16 Changed 5 months ago by michaln

For the record, the f3 64 f1 sequence is a valid instruction (ICEBP with redundant/ignored prefixes). Many disassemblers consider it invalid but CPUs execute it. It is one of the many very badly documented aspects of the x86 instruction set.

Why the DATEV software would be using it is a good question, and it does look like some kind of anti-debugging attempt.

This bug had nothing to do with the funny prefixes though, only with ICEBP. As explained above, it didn't happen always, or not even most of the time.

comment:17 Changed 5 months ago by ramshankar

Test builds with the fix are now temporarily available:

Windows host (32-bit/64-bit):
https://www.virtualbox.org/download/testcase/VirtualBox-6.1.3-135953-Win.exe

Linux host (64-bit):
https://www.virtualbox.org/download/testcase/VirtualBox-6.1.3-135953-Linux_amd64.run

The above links expire automatically in ~14 days. Feel free to test and provide feedback.

comment:18 Changed 5 months ago by softjury

I can confirm the test build provided by @ramshankar doesn't crash with DATEV software installation. However, I didn't try running the production system on the test build, just checked the DATEV installer.

Last edited 5 months ago by softjury

comment:19 Changed 5 months ago by juhan

Glad to hear that you could fix this issue, thank you very much!

Since I already did the DATEV update with 6.0.14 on our production system, I cannot easily check if the issue has gone. However, I downloaded the provided temporary image and will try it as soon as our accountant is relaxed enough to not freak out if the system crashes again ;)

comment:20 Changed 5 months ago by der_reisende

At least on my system I can confirm that the fix worked for an existing installation of Datev Reisekosten Vorerfassung V 2.5. The runtime crashed with 6.1.2, works with the test build.
