VirtualBox

Ticket #19313 (closed defect: obsolete)

Opened 13 months ago

Last modified 5 weeks ago

random network dropouts

Reported by: SkipMMT
Owned by:
Component: network
Version: VirtualBox 6.1.2
Keywords: dropouts
Cc:
Guest type: Linux
Host type: Linux

Description (last modified by fbatschu)

Since upgrading the Linux host to VirtualBox-6.1.2, the Linux guest has been experiencing random network dropouts, 1 to 10 times a day, each lasting a few seconds. At these times, the Linux guest reports:

Feb 13 20:04:59 kernel: e1000 0000:00:03.0 eth0: Reset adapter
Feb 13 20:05:01 kernel: e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX

After downgrading to VirtualBox-6.0.16, the dropouts and the messages disappear.
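
To quantify how often the dropouts occur, the reset messages quoted above can be counted per day. What follows is a minimal Python sketch and not part of the original report: it assumes the guest log uses the syslog-style line format shown above, and the function name `count_resets_per_day` is purely illustrative.

```python
import re
from collections import Counter

# Matches the guest kernel message quoted in the description, e.g.
# "Feb 13 20:04:59 kernel: e1000 0000:00:03.0 eth0: Reset adapter"
# Assumption: syslog-style "Mon DD HH:MM:SS" prefix, as in the ticket.
RESET_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+\d{2}:\d{2}:\d{2}\s+"
    r".*\be1000\b.*:\s*Reset adapter\s*$"
)


def count_resets_per_day(lines):
    """Return a Counter mapping 'Mon D' to the number of e1000 resets seen."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.match(line)
        if match:
            counts[f"{match['month']} {int(match['day'])}"] += 1
    return counts


# The two lines from the description: only the first one is a reset.
sample = [
    "Feb 13 20:04:59 kernel: e1000 0000:00:03.0 eth0: Reset adapter",
    "Feb 13 20:05:01 kernel: e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX",
]
resets = count_resets_per_day(sample)
```

In practice the function could be fed an open log file instead of a list; which file holds kernel messages depends on the guest distribution, so the path is left to the reader.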

Attachments

VBox.log (151.8 KB) - added by SkipMMT 13 months ago.

Change History

Changed 13 months ago by SkipMMT

comment:1 Changed 8 months ago by SkipMMT

Has there been any progress on fixing this regression in version 6.1? It is still present in 6.1.10. This is a showstopper for me. I have applications that fail due to this regression. I can't stay at version 6.0 because 6.0 can't be installed on Fedora 32, and I must upgrade to Fedora 32.

comment:2 Changed 7 months ago by fbatschu

  • Description modified

comment:3 Changed 7 months ago by fbatschu

  • Status changed from new to awaitsfeedback

One possible explanation for the lack of interest could be that there isn't much information in this ticket to work with, i.e. a lack of basic information about the guest and host OS, the guest configuration, and the offending VM's log file (specifically the networking configuration), and no information whatsoever on how to reproduce this in house to start an evaluation of the problem. Basically, your ticket is entirely content-free aside from that one observation you made from a message in a guest's log file. So you have found the link to the public bugtracker for submitting bugs, yet you have apparently not read the preamble on that same page: https://www.virtualbox.org/wiki/Bugtracker

comment:4 Changed 7 months ago by SkipMMT

I did read the preamble, and provided the requested information in the ticket submission form. I was expecting an acknowledgment that someone had seen the ticket and needed more information and would let me know specifically what that information might be.

I know of no way to trigger this failure so that you can reproduce it. Like I said, it is random. This happens on four different hosts and two different guests, connected to two different Cisco switches on different networks. The four hosts have similar hardware, and they all use the Intel 82574L network interface hardware. These hosts and guests have run Fedora 30, 31, and 32 with all the kernels released for them, which has made no difference to the problem. Obviously, the guests are configured to use the e1000 network interface. At the time of the failure, the host message log has no relevant information, and the guest message log has only those two lines I already posted.

I realize that intermittent problems are the worst to troubleshoot. I haven't found anything that correlates with the failures, except that it happens on every VirtualBox 6.1.x, and never on any VirtualBox 6.0.x. It does seem to be more frequent on the guest with more network traffic. I know that there is not much information to go on, but that's the way it is. The only approach that I can think of to attack the problem is to inspect the VirtualBox network code changes between 6.0 and 6.1 and see what might produce these random outages. Please let me know if there is anything else I can do to help.

comment:5 Changed 7 months ago by RoNiN

Hello,

I can confirm the behaviour described by SkipMMT. It began after upgrading to the 6.1 series. My guest OS is an up-to-date CentOS 7 with a single bridged adapter to the host OS.

My host OS kernel is the following:

5.4.0-sabayon #1 SMP Sun Jul 12 21:09:29 UTC 2020 x86_64 Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz GenuineIntel GNU/Linux

I was using VirtIO when the problems arose, so I tried switching the Ethernet adapter to Intel's. The e1000 driver is intelligent enough to reset itself when it detects a jam. However, the reset still takes time.

The jams become more frequent as the host uptime grows. If I reboot the host OS, things run normally for a week or so, then the jams start again and keep getting more frequent. I watched the rate go from once every 3 days to twice a day, then rebooted the host OS. After 8 days, it happened again. Frustrating.. :/

I can bypass the jam by doing ifconfig down/up from both the host OS and the guest OS. Then things run normally for a while. The host OS's secondary Ethernet interface is a Realtek Ethernet card dedicated to this VM, driven by the r8169 module. It does not even have an IP address on the host OS. A problem that is beyond our reach is probably plaguing the VirtualBox kernel modules.
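
The down/up workaround described above could be scripted roughly as follows. This is a sketch under stated assumptions, not RoNiN's actual procedure: it uses `ip link set`, the iproute2 counterpart of `ifconfig <iface> down` and `ifconfig <iface> up`; the interface name `eth1` is hypothetical; and it defaults to a dry run because changing link state requires root.

```python
import subprocess
import time


def bounce_commands(iface):
    """Build the down/up command pair for one interface."""
    return [
        ["ip", "link", "set", "dev", iface, "down"],
        ["ip", "link", "set", "dev", iface, "up"],
    ]


def bounce_interface(iface, pause_s=1.0, dry_run=True):
    """Take iface down, wait briefly, then bring it back up.

    With dry_run=True (the default) nothing is executed; the commands
    that would run are only returned so they can be inspected first.
    """
    commands = bounce_commands(iface)
    if not dry_run:
        down, up = commands
        subprocess.run(down, check=True)  # raises CalledProcessError on failure
        time.sleep(pause_s)               # give the link a moment to settle
        subprocess.run(up, check=True)
    return commands


# "eth1" is a hypothetical name; substitute the bridged interface in use.
planned = bounce_interface("eth1")
```

Running it with `dry_run=False` on both the host and the guest would mirror the manual steps, but it only masks the symptom and does not address whatever causes the jam.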

This system was set up in haste as a stopgap until I can build a new corporate VM infrastructure. It is under constant network traffic, and this Ethernet jamming problem is a bummer for me..

I am not a programmer; however, if you can guide me on how to collect more descriptive data, I'll try my best.

Thank you.

Last edited 7 months ago by RoNiN

comment:6 Changed 5 months ago by RoNiN

Hello Again,

As of today, meaning 30 days of stability, I think "Version 6.1.6 r137129 (Qt5.6.1)" solves this problem. As a side note, I did not do a kernel upgrade or a firmware pack upgrade in my system. I solely upgraded Virtualbox itself and its kernel modules.

Hope this helps someone out there..

All the best..

comment:7 Changed 4 months ago by janitor

  • Status changed from awaitsfeedback to closed
  • Resolution set to obsolete

Thanks for the update.

comment:8 Changed 5 weeks ago by Sanitariu

The problem still exists. I am using version 6.1.14. I tested all the suggestions (gro off, tso off, no VirtualBox Guest Additions, etc.) and the adapter still resets.

[66169.615963] ------------[ cut here ]------------
[66169.615978] WARNING: CPU: 0 PID: 0 at /build/linux-o3gOgM/linux-4.9.189/net/sched/sch_generic.c:316 dev_watchdog+0x233/0x240
[66169.615980] NETDEV WATCHDOG: enp0s3 (e1000): transmit queue 0 timed out
[66169.615981] Modules linked in: binfmt_misc vboxvideo(O) ipt_REJECT nf_reject_ipv4 xt_multiport xt_tcpudp ip6table_filter ip6_tables iptable_filter sb_edac edac_core iTCO_wdt intel_powerclamp kvm_intel iTCO_vendor_support kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel evdev sg intel_rapl_perf lpc_ich serio_raw vmwgfx mfd_core pcspkr rng_core vboxguest(O) ttm drm_kms_helper drm video button ac ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache sd_mod ata_generic crc32c_intel ata_piix ahci libahci aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd psmouse e1000 libata scsi_mod i2c_piix4
[66169.616064] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O    4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u2
[66169.616066] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[66169.616069]  0000000000000000 ffffffffa4136404 ffff9550e3c03e20 0000000000000000
[66169.616075]  ffffffffa3e7b83b 0000000000000000 ffff9550e3c03e78 ffff9550ccd1a000
[66169.616080]  0000000000000000 ffff9550cd242a80 0000000000000001 ffffffffa3e7b8bf
[66169.616085] Call Trace:
[66169.616088]  <IRQ>
[66169.616097]  [<ffffffffa4136404>] ? dump_stack+0x5c/0x78
[66169.616102]  [<ffffffffa3e7b83b>] ? __warn+0xcb/0xf0
[66169.616106]  [<ffffffffa3e7b8bf>] ? warn_slowpath_fmt+0x5f/0x80
[66169.616112]  [<ffffffffa433eb93>] ? dev_watchdog+0x233/0x240
[66169.616117]  [<ffffffffa433e960>] ? dev_deactivate_queue.constprop.26+0x60/0x60
[66169.616122]  [<ffffffffa3eea292>] ? call_timer_fn+0x32/0x120
[66169.616126]  [<ffffffffa3eea607>] ? run_timer_softirq+0x1d7/0x430
[66169.616132]  [<ffffffffa413f564>] ? timerqueue_add+0x54/0xa0
[66169.616136]  [<ffffffffa3eec2f8>] ? enqueue_hrtimer+0x38/0x80
[66169.616141]  [<ffffffffa44220ad>] ? __do_softirq+0x10d/0x2b0
[66169.616147]  [<ffffffffa3e81e52>] ? irq_exit+0xc2/0xd0
[66169.616150]  [<ffffffffa4421b2c>] ? smp_apic_timer_interrupt+0x4c/0x60
[66169.616156]  [<ffffffffa442025e>] ? apic_timer_interrupt+0x9e/0xb0
[66169.616158]  <EOI>
[66169.616162]  [<ffffffffa441da92>] ? mwait_idle+0x72/0x160
[66169.616171]  [<ffffffffa3ebf33a>] ? cpu_startup_entry+0x1ca/0x240
[66169.616180]  [<ffffffffa4b3ef5e>] ? start_kernel+0x447/0x467
[66169.616185]  [<ffffffffa4b3e120>] ? early_idt_handler_array+0x120/0x120
[66169.616188]  [<ffffffffa4b3e408>] ? x86_64_start_kernel+0x14c/0x170
[66169.616191] ---[ end trace b2398e43d8835b28 ]---
[66169.616224] e1000 0000:00:03.0 enp0s3: Reset adapter
[66171.728617] e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[68665.596236] e1000 0000:00:03.0 enp0s3: Reset adapter
[68667.676228] e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Last edited 4 weeks ago by janitor