VirtualBox

Opened 7 years ago

Closed 7 years ago

#16960 closed defect (fixed)

VirtualBox 5.1.26 crashes when using VLAN in linux guest over Internal Network => Fixed in 5.1.28

Reported by: RomanovR
Owned by:
Component: network
Version: VirtualBox 5.1.26
Keywords: vlan, internal network
Cc:
Guest type: Linux
Host type: Linux

Description (last modified by Valery Ushakov)

VirtualBox 5.1.26 crashes when using VLAN in a Linux guest over Internal Network. 5-10 minutes after launching the guest VM, I get an error in the host console:

NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [EMT-0:19382]

The same bug was present in version 5.1.24, and it occurs only when using VLAN in the guest VM via Internal Network on the host. A portion of the logs is attached (log from the host).

Attachments (2)

vbox.log (7.3 KB) - added by RomanovR 7 years ago.
Logs from the Linux host for the problem of using VLAN with Internal Network
error.png (25.0 KB) - added by rmaksimov 7 years ago.


Change History (11)

by RomanovR, 7 years ago

Attachment: vbox.log added

Logs from the Linux host for the problem of using VLAN with Internal Network

comment:1 by Valery Ushakov, 7 years ago

Description: modified (diff)
Summary: VirtualBox 5.1.26 crushing when using VLAN in linux guest over Internal Network → VirtualBox 5.1.26 crashes when using VLAN in linux guest over Internal Network

comment:2 by Valery Ushakov, 7 years ago

Could you please describe the scenario in more detail? I left two VMs pinging each other over an internal network with VLAN configured, and they happily survived for a couple of hours. Do I need more load, or some specific load, to trigger this?

Also, I can't quite parse the last paragraph. What is the difference between 5.1.24 and 5.1.26 that you are trying to describe there?

comment:3 by toMeloos, 7 years ago

Would like to confirm this issue.

We have been using VirtualBox 5.1.8 on Fedora 25 without any problem for months. We then tried VB 5.1.26 on F25, F26 and the latest CentOS 7 (with both the stock 3.10 and the mainline 4.12 kernel) on an HP Z800 and a Dell PowerEdge r610, and we get the CPU soft lockup issue on both machines. After downgrading back to 5.1.8 the problem disappears, so we are now happily running that again on CentOS 7 with the stock 3.10 kernel. We have not tested the VB versions between 5.1.8 and 5.1.26.

Our VMs have 4 vCPUs and 16 GB RAM and run Ubuntu 16.04. They run in a cluster that uses VLANs over VirtualBox Internal Networks for traffic between nodes. We have two separate VirtualBox Internal Networks that both host a few VLANs.

A while after our second VM comes up, the NMI watchdog starts sending error messages to stdout. I'm pretty sure this is around the time puppet on the second machine has configured networking and services and they start generating network traffic. Very soon after, the second VM crashes/gets killed and we're stuck with a defunct VirtualBox process and the watchdog still generating warnings. Only solution is to reboot the host. What's also really weird is that the NMI watchdog warnings show up even after we disabled the NMI watchdog on both the host and the guests.

Last edited 7 years ago by Valery Ushakov

comment:4 by Valery Ushakov, 7 years ago

Description: modified (diff)

comment:5 by Valery Ushakov, 7 years ago

Please provide the VM's *.vbox file and the log file.

by rmaksimov, 7 years ago

Attachment: error.png added

comment:6 by rmaksimov, 7 years ago

I can confirm this problem.

There seems to be a bug (???) involving the Intel PRO/1000 MT Desktop (82540EM) NIC with GSO/TSO enabled (the default for this NIC) and a VLAN configured. It doesn't matter which "Attached to" type is used (Internal Network or Bridged Adapter).

Here is a simple setup to reproduce this behavior.
VM-1:
Intel PRO/1000 MT Desktop (82540EM)
Ubuntu Server + Wget

ip link add link eth0 name eth0.100 type vlan id 100
ip addr add 10.10.10.10/24 dev eth0.100
ip link set eth0.100 up

VM-2:
Intel PRO/1000 MT Desktop (82540EM)
Ubuntu Server + Apache (default page, ~10KiB)

ip link add link eth0 name eth0.100 type vlan id 100
ip addr add 10.10.10.20/24 dev eth0.100
ip link set eth0.100 up

Now, if you run wget against 10.10.10.20 from 10.10.10.10, VM-2 crashes and the host freezes completely a few moments later. Sometimes a window with an error appears (see the error.png attachment).
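The VLAN setup used on both VMs in this comment can be sketched as a small shell helper. This is only an illustration: `vlan_setup_cmds` is a hypothetical name, and the helper just prints the iproute2 commands (in their long form) instead of running them, so it is safe to try without root.

```shell
#!/bin/sh
# Print the iproute2 commands that create and bring up a VLAN sub-interface.
# Nothing is executed here; pipe the output to `sh` as root to apply it.
# vlan_setup_cmds is a hypothetical helper name, not part of any tool.
vlan_setup_cmds() {
    parent=$1   # parent NIC inside the guest, e.g. eth0
    vid=$2      # VLAN ID, e.g. 100
    cidr=$3     # address with prefix length, e.g. 10.10.10.10/24
    dev="${parent}.${vid}"
    printf 'ip link add link %s name %s type vlan id %s\n' "$parent" "$dev" "$vid"
    printf 'ip addr add %s dev %s\n' "$cidr" "$dev"
    printf 'ip link set %s up\n' "$dev"
}
```

For VM-1 this would be called as `vlan_setup_cmds eth0 100 10.10.10.10/24`, and for VM-2 with 10.10.10.20/24; appending `| sh` as root applies the printed commands.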

The important factors are:

  1. Intel PRO/1000 MT Desktop (82540EM) as the NIC on VM-2
  2. GSO/TSO enabled (the default for this NIC)
  3. Payload size (transferred file size)

Solution:

  1. Change the network card on VM-2 (the other Intel cards work perfectly, e.g. Intel PRO/1000 T Server 82543GC, where GSO/TSO is disabled by default).
  2. Alternatively, disable GSO/TSO for eth0.100 on VM-2 with ethtool (ethtool -K eth0.100 gso off).
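To check that the ethtool workaround actually took effect, one option is to inspect the feature list printed by `ethtool -k`. The sketch below is a hypothetical helper (`offloads_still_on` is not part of ethtool) that reads that output on stdin and prints the segmentation offloads that are still on. It assumes the usual `feature-name: on|off` line format.

```shell
#!/bin/sh
# Read `ethtool -k <dev>` output on stdin and print the names of the
# segmentation offloads (TSO, GSO) that are still switched on.
# offloads_still_on is a hypothetical helper name, not an ethtool feature.
offloads_still_on() {
    awk -F': ' '
        $1 == "tcp-segmentation-offload" || $1 == "generic-segmentation-offload" {
            split($2, state, " ")   # drop a trailing "[fixed]" marker if present
            if (state[1] == "on") print $1
        }'
}
```

A possible use on VM-2, as root, would be `ethtool -K eth0.100 gso off tso off` followed by `ethtool -k eth0.100 | offloads_still_on`. Turning off `tso` as well as `gso` is an assumption here, since the comment only mentions `gso off`. Empty output means both offloads are disabled.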

The problem is present on VirtualBox 5.1.26 r117224, Windows 7 x64.

Version 1, edited 7 years ago by rmaksimov

comment:7 by Aleksey Ilyushin, 7 years ago

This was indeed a regression in 5.1.26 related to segmentation offloading. The fix will be included in the next maintenance release.

comment:8 by Aleksey Ilyushin, 7 years ago

Summary: VirtualBox 5.1.26 crashes when using VLAN in linux guest over Internal Network → VirtualBox 5.1.26 crashes when using VLAN in linux guest over Internal Network => Fixed in SVN

comment:9 by Michael Thayer, 7 years ago

Resolution: fixed
Status: new → closed
Summary: VirtualBox 5.1.26 crashes when using VLAN in linux guest over Internal Network => Fixed in SVN → VirtualBox 5.1.26 crashes when using VLAN in linux guest over Internal Network => Fixed in 5.1.28
