VirtualBox

Opened 9 years ago

Last modified 8 years ago

#14075 new defect

Windows VM crashes Debian host, NMI for unknown reason points to "vboxdrv"

Reported by: Denis Kozlov
Owned by:
Component: other
Version: VirtualBox 4.3.26
Keywords: Crash, Dump, NMI, Windows, Linux, Debian, Interrupts, vboxdrv
Cc:
Guest type: Windows
Host type: Linux

Description

The first crash occurred about a month ago, when the host was running Debian 6 and the then-latest VirtualBox. Since then, the host was wiped and Debian 7 was installed with the latest VirtualBox, 4.3.26. A crash occurred again recently, so I started investigating.

The crashes seem random. At the time of the second crash, the Windows VM was the only VM on the host and was just applying Windows updates. Reverting to a previous copy of the VM and reapplying the same Windows updates did not reproduce the crash. Nothing appears in either the Linux or the VirtualBox logs at the time of a crash.

The suspect lines extracted from dmesg:

[50414.144741] warning: `VBoxHeadless' uses 32-bit capabilities (legacy support in use)
[50415.090400] EXT4-fs (md0): Unaligned AIO/DIO on inode 4194333 by AioMgr0-N; performance will be poor.
...
[63477.985050] Uhhuh. NMI received for unknown reason 31 on CPU 4.
[63477.985073] Do you have a strange power saving mode enabled?
[63477.985094] Dazed and confused, but trying to continue
[68728.724996] Uhhuh. NMI received for unknown reason 21 on CPU 7.
[68728.725021] Do you have a strange power saving mode enabled?
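The unknown-NMI lines in the dmesg excerpt above can be pulled out of a longer log mechanically. The following is a minimal sketch, not part of the original report: the function names `find_unknown_nmis` and `count_by_cpu` are hypothetical, and the reason code is kept as a string because its interpretation is not established in this ticket.

```python
import re
from collections import Counter

# Matches kernel lines such as:
# [63477.985050] Uhhuh. NMI received for unknown reason 31 on CPU 4.
NMI_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+Uhhuh\. NMI received for unknown reason "
    r"(?P<reason>[0-9a-fA-F]+) on CPU (?P<cpu>\d+)\."
)


def find_unknown_nmis(dmesg_text):
    """Return (timestamp, reason, cpu) tuples, one per unknown-NMI line."""
    events = []
    for line in dmesg_text.splitlines():
        m = NMI_RE.search(line)
        if m:
            # The reason code is left as the raw string printed by the kernel.
            events.append((float(m["ts"]), m["reason"], int(m["cpu"])))
    return events


def count_by_cpu(events):
    """Count unknown-NMI events per CPU number."""
    return Counter(cpu for _, _, cpu in events)
```

Feeding it the output of `dmesg` after a long run would show whether the NMIs cluster on particular CPUs or reason codes.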

To trace the "NMI for unknown reason", I enabled crash dumps using "kdump", set "kernel.unknown_nmi_panic=1" and "kernel.panic_on_unrecovered_nmi=1" in "/etc/sysctl.conf", and let the VM run under HeavyLoad from JAM Software (CPU, memory, file writes and disk access). The crash is now reproducible, but it can take anywhere from 1 hour to 1 day of running the VM for it to occur.
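Before starting a long reproduction run, it can help to confirm that both sysctl settings are actually in effect on the running kernel. This is a small sketch under the assumption that the settings are exposed under "/proc/sys/kernel" with the same names; the helper `read_nmi_sysctls` is hypothetical and takes the directory as a parameter so it can be tried against a copy.

```python
from pathlib import Path

# The two settings named in the report (kernel.<name> in /etc/sysctl.conf).
NMI_SYSCTLS = ("unknown_nmi_panic", "panic_on_unrecovered_nmi")


def read_nmi_sysctls(proc_sys_kernel="/proc/sys/kernel"):
    """Return {name: int value, or None if the file is missing or unreadable}."""
    result = {}
    for name in NMI_SYSCTLS:
        path = Path(proc_sys_kernel) / name
        try:
            result[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            result[name] = None
    return result
```

A value of 1 for both entries would indicate that an unknown NMI panics the host and so triggers the kdump capture.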

The call trace from the crash dump points to "vboxdrv":

[76034.059602] Uhhuh. NMI received for unknown reason 31 on CPU 3.
[76034.059686] Do you have a strange power saving mode enabled?
[76034.059766] Kernel panic - not syncing: NMI: Not continuing
[76034.059846] Pid: 19036, comm: EMT-4 Tainted: G           O 3.2.0-4-amd64 #1 Debian 3.2.65-1+deb7u2
[76034.059937] Call Trace:
[76034.060006]  <NMI>  [<ffffffff8134a53c>] ? panic+0x95/0x1a2
[76034.060157]  [<ffffffff81352056>] ? do_nmi+0x151/0x258
[76034.060235]  [<ffffffff813517a0>] ? nmi+0x20/0x30
[76034.060312]  <<EOE>>  [<ffffffffa03f97c6>] ? rtR0MemAllocEx+0x17e/0x1de [vboxdrv]
[76034.060470]  [<ffffffffa03f05a3>] ? supdrvIOCtlFast+0x75/0x79 [vboxdrv]
[76034.060555]  [<ffffffffa03ed2a9>] ? VBoxDrvLinuxIOCtl_4_3_26+0x43/0x1eb [vboxdrv]
[76034.060645]  [<ffffffff811087dd>] ? do_vfs_ioctl+0x459/0x49a
[76034.060728]  [<ffffffff81039aa2>] ? finish_task_switch+0x4e/0xb9
[76034.060809]  [<ffffffff8134fb09>] ? __schedule+0x5f9/0x610
[76034.060892]  [<ffffffff81108869>] ? sys_ioctl+0x4b/0x72
[76034.060970]  [<ffffffff81355f92>] ? system_call_fastpath+0x16/0x1b
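To see which frames of a trace like the one above belong to a loadable module, the bracketed module suffix can be extracted per frame. The sketch below is illustrative only: `parse_frames` and `first_module_frame` are hypothetical names, and the regular expression assumes the frame layout shown in this ticket (address, optional "?", `symbol+offset/size`, optional `[module]`).

```python
import re

# One stack frame, e.g.:
#   [<ffffffffa03f97c6>] ? rtR0MemAllocEx+0x17e/0x1de [vboxdrv]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+\??\s*"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(trace_text):
    """Return (symbol, module-or-None) for each frame, in trace order."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((m["sym"], m["module"]))
    return frames


def first_module_frame(frames):
    """Return the first frame that carries a module name, or None."""
    for sym, module in frames:
        if module is not None:
            return sym, module
    return None
```

On the trace in this ticket, the first frame with a module suffix is the one interrupted by the NMI. Frames prefixed with "?" are not guaranteed to be exact, so the result is a pointer for further investigation rather than proof of the faulting function.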

I tested the RAM with MemTest86 for days and found no problems. High CPU usage from interrupts is observed inside the VM, as described in ticket #10611, which might be relevant.

Summary of host specs:

  • BIOS: VT-x and VT-d enabled, HT disabled
  • Motherboard: Intel Server Board S5520HCT
  • CPU: 2 x Intel Xeon E5620, 2.4GHz, 8 cores in total
  • RAM: 12 x 4GB (DDR3 1333MHz ECC Unbuffered)
  • HD: 2 x 600GB WD VelociRaptor 10Krpm

VM configuration:

  • OS: Windows 7 SP1 (with latest updates)
  • vCPU: 4-7 (the more vCPUs, the sooner a crash seems to occur)
  • vRAM: 8GB

Attached are various logs, crash analyses and hardware info.

Attachments (6)

  • VBox.log (65.4 KB), added by Denis Kozlov 9 years ago
  • lshw.txt (46.2 KB), added by Denis Kozlov 9 years ago: Host hardware info
  • lspci.txt (8.8 KB), added by Denis Kozlov 9 years ago: PCI listing
  • crash.txt (1.8 KB), added by Denis Kozlov 9 years ago: Crash overview
  • crash bt.txt (1.8 KB), added by Denis Kozlov 9 years ago: Crash backtrace
  • crash log.txt (4.3 KB), added by Denis Kozlov 9 years ago: Crash log


Change History (8)

by Denis Kozlov, 9 years ago

Attachment: VBox.log added

by Denis Kozlov, 9 years ago

Attachment: lshw.txt added

Host hardware info

by Denis Kozlov, 9 years ago

Attachment: lspci.txt added

PCI listing

by Denis Kozlov, 9 years ago

Attachment: crash.txt added

Crash overview

by Denis Kozlov, 9 years ago

Attachment: crash bt.txt added

Crash backtrace

by Denis Kozlov, 9 years ago

Attachment: crash log.txt added

Crash log

comment:1 by Denis Kozlov, 9 years ago

Possibly relevant tickets:

  • #10611 - High CPU usage from interrupts (20% on 8 idle processors)
  • #13762 - "Not syncing: An NMI occurred" kernel panic

comment:2 by nj, 8 years ago

Another possibly relevant ticket is #14034.
