VirtualBox

Ticket #3783 (closed defect: fixed)

Opened 5 years ago

Last modified 2 years ago

OSX Bridged Networking unreliable due to incorrect MTU setting! => Fixed in SVN

Reported by: slebbon
Owned by:
Priority: major
Component: network/hostif
Version: VirtualBox 2.2.0
Keywords: MTU Bridged
Cc:
Guest type: Windows
Host type: Mac OS X

Description (last modified by aleksey)

I set up a clean install of XP SP2 on VirtualBox on my new Mac Pro. I set up the network adapter (Intel PRO/1000 T Server) in Bridged mode. After getting into XP, I was getting an IP address from my router and a few websites would load, but others would not. For example, google.com was OK but mozilla.com and microsoft.com would not load. Some experimenting pointed to an MTU issue.

I determined this WAS due to the MTU setting. My testing showed that, when using Bridged mode and running "ping -f" against my home network router, setting the ping packet size (-l flag) larger than 1467 caused the packets to time out, and a size over 1472 produced the expected "Packet needs to be fragmented but DF set." When using an adapter in NAT mode, this problem does not occur, and pings up to 1472 bytes in size are allowed as expected. Setting the MTU lower in the Windows XP registry and rebooting provides a workaround and allows all websites and pages to load OK again.
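The numbers above follow from simple header arithmetic. A minimal Python sketch, assuming an IPv4 header without options (20 bytes) and the 8-byte ICMP echo header; the function names are illustrative only:

```python
IPV4_HEADER = 20  # bytes; assumes no IP options
ICMP_HEADER = 8   # bytes; ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ping payload (the -l value) that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ip_packet_size(payload: int) -> int:
    """Total IP packet size produced by a ping with the given payload."""
    return payload + IPV4_HEADER + ICMP_HEADER


print(max_ping_payload(1500))  # payload limit for a 1500-byte MTU
print(ip_packet_size(1467))    # packet size of the largest ping that still worked
```

With a 1500-byte MTU the payload limit is 1472, which matches the NAT-mode behavior, while the largest payload that worked in Bridged mode (1467) corresponds to a 1495-byte packet.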

It's clear something about using Bridged mode is "padding" the packets with an additional 4 or 5 bytes of data...

For reference Windows XP's registry key for MTU size:

System Key: [HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\[Adapter ID]]
Value Name: MTU
Data Type: REG_DWORD (DWORD Value)
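As a sketch of the workaround, the following Python snippet builds the text of a .reg file that sets the MTU value under the key above. The adapter ID is a placeholder that has to be replaced with the real interface GUID from the guest, and the function name is hypothetical; the snippet only generates text and does not touch the registry.

```python
def mtu_reg_file(adapter_id: str, mtu: int) -> str:
    """Return .reg file text that sets the MTU DWORD for one interface."""
    key = "\\".join([
        "HKEY_LOCAL_MACHINE", "System", "CurrentControlSet", "Services",
        "Tcpip", "Parameters", "Interfaces", adapter_id,
    ])
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key}]\n"
        f'"MTU"=dword:{mtu:08x}\n'  # REG_DWORD values are written as 8 hex digits
    )


# "{ADAPTER-ID}" is a placeholder, not a real interface GUID.
print(mtu_reg_file("{ADAPTER-ID}", 1468))
```

The generated text can be saved to a file and imported in the guest, followed by a reboot as described above.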

Attachments

Windows XP Test-2011-02-24-21-06-04.log (64.0 KB) - added by vmann 3 years ago.
VBox.log of WinXP guest, Mac 10.6.6 host, VirtualBox 4.0.4 replicating defect #3783

Change History

comment:1 Changed 5 years ago by DrewMerkle

FWIW, the same thing happens with Ubuntu guest VM using VirtualBox 2.2.2 on a Mac OS X Server host. So far I have tested several flavors of Ubuntu for guest, as well as Windows XP Pro as guest. In my case, a ping of 1469 or larger will not work, but smaller does (consistent between Ubuntu and Windows XP guests). Not sure why it's a different number from the initial report... NAT mode works as expected. Changing the MTU seems to be a successful workaround (thank you thank you thank you!).
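Finding the exact cutoff by hand takes many pings. A small Python sketch of a binary search over payload sizes is shown below; `probe` is a hypothetical callable that should send one don't-fragment ping of the given size and return whether a reply arrived, and the search assumes that every size below the cutoff works and every size above it fails.

```python
from typing import Callable


def largest_passing_payload(probe: Callable[[int], bool],
                            low: int = 0, high: int = 1472) -> int:
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes probe(low) is True and that results are monotonic.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Stand-in probe that mimics the behavior reported in this comment
# (1469 or larger fails); a real probe would run ping against the router.
print(largest_passing_payload(lambda size: size < 1469))
```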

I'm using VirtualBox 2.2.2 r46594 on a Mac OS X Server 10.5.6 host.

comment:2 Changed 5 years ago by msully725

I can confirm this same issue, and that changing MTU resolved the issue.
VirtualBox 2.2.2 r46594
Host: Ubuntu 8.10 bridged on eth0
Guest: Win XP SP3 using Intel PRO/1000 MT Desktop

After adding MTU of 1469 to above registry entry I was able to navigate to any site. Before that I could navigate to some sites, but others would never load (I could ping them, but nothing over port 80. telnet to port 80 would never connect). Problem only occurred when bridged on eth0, wlan0 worked with no issues.

comment:3 Changed 5 years ago by msully725

Forgot to mention the host is Ubuntu 8.10 64 bit, guest is 32 bit.

comment:4 Changed 5 years ago by DrewMerkle

For what it's worth, I've now observed similar behaviors with various hosts/guests, as well as with other virtualization software. In particular, I've recently managed to troubleshoot issues using Parallels where the host is Mac OS X 10.5.x and the guest is Windows XP Pro by changing the MTU in the Windows registry. Without knowing more, I can only speculate, but I wonder if something about the process a guest OS uses to discover a maximum transmission unit (MTU) may be problematic in general, and is resolved by setting it manually. I discovered RFC 2923, which is a memo discussing "TCP Problems with Path MTU Discovery" (http://tools.ietf.org/html/rfc2923); sections of that read exactly like my experiences with this issue.

Hope this is helpful.

comment:5 Changed 5 years ago by IscreaMan

I can confirm this issue with Mac OS X 1.5.7 host, VirtualBox 2.2.4 r47978 and Windows XP and Ubuntu 9.04 guests. This bug just prevents many sites from loading, as well as other web-based services like Windows Update and apt-get (argghhh!!! ;)

But I am happy that I found out why this happens. So many lost hours... Google will rule the world! Even with this MTU bug, it still works :)

comment:6 Changed 5 years ago by slebbon

I wanted to update as well. In my further experience with Windows 7 RC, when using bridged mode, not only does the internet not work, but attempting to use the internet completely freezes Windows in the VM, requiring the virtual machine to be hard powered off and restarted. When this happens the VM uses 100% CPU. With NAT mode it runs fine, except I can't assign the VM its own IP address on my network.

comment:7 Changed 4 years ago by stephenju

I just updated to VirtualBox 3.1.6 r59351 and this problem still exists. Any MTU value above 1468 hangs Windows 7 upon login. After a few minutes it then indicates no network connection.

This is on OS X 10.5.8.

comment:8 Changed 4 years ago by brianm

I'm also having this same problem with a Windows 7 guest on an OS X 10.6.3 host using Virtualbox version 3.1.6

comment:9 Changed 4 years ago by buffyg

I've also seen problems consistently with Windows 7 guests on OS X using bridged networking, which are fixed by the MTU change. I'm also running 10.6.3 (server), but this is also consistent with problems I saw under 10.5.9.

comment:10 Changed 4 years ago by criley

I may have a similar problem.

Yesterday I installed 3.2.8 on my OS X 10.6.4 server and installed XP Professional as my guest. I thought all was well until this morning I noticed that I could not log onto certain sites like Google.com among others from my client machines in the office. The OS X server is providing the DNS for the office machines. I restarted the server and all was fine only because I did not run VB. The guest is set up to run in bridge mode. So I am wondering how to correct this. I do not use the Guest OS for surfing the internet, just to run a few programs that need to talk with the host.

Any help would be greatly appreciated.

comment:11 Changed 3 years ago by buffyg

I've also seen this problem with OpenSolaris and OpenIndiana, where reducing the MTU fixes the problem. There's no record of anyone from VirtualBox reviewing the ticket. Any chance that someone could at least review the issue for reproducibility?

comment:12 Changed 3 years ago by frank

  • Component changed from network to network/hostif

comment:13 Changed 3 years ago by aleksey

Can anybody confirm the problem is still present in 4.0.4 and provide VBox.log?

Changed 3 years ago by vmann

VBox.log of WinXP guest, Mac 10.6.6 host, VirtualBox 4.0.4 replicating defect #3783

comment:14 Changed 3 years ago by frank

Log was attached.

comment:15 Changed 3 years ago by aleksey

vmann,

Using the same host OS (32-bit, 10.6.6), the same VirtualBox version (4.0.4) and Windows XP SP2 as a guest with Intel PRO/1000 T Server adapter I see no symptoms at all: 'ping -f -l 1472 <router_ip>' works properly. Could you check MTU size on the host adapter you bridge to (en0)? If it is 1500, could you enable internal packet capture with:

VBoxManage modifyvm <your-vm> --nictrace<adapter-number> on --nictracefile<adapter-number> file.pcap

then start VM and do a couple of pings with 1472-byte payload? You can mail the resulting file or its tcpdump printout to me at aleksey dot ilyushin at oracle dot com.
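For scripting the capture step, the command above can be assembled as an argument list. A minimal Python sketch; the helper name and the VM name in the example are made up, and the list is only built here, not executed:

```python
from typing import List


def nictrace_command(vm: str, adapter: int, pcap_path: str) -> List[str]:
    """Build the VBoxManage invocation that enables packet capture on one adapter."""
    return [
        "VBoxManage", "modifyvm", vm,
        f"--nictrace{adapter}", "on",
        f"--nictracefile{adapter}", pcap_path,
    ]


# "winxp-test" is a hypothetical VM name; adapter numbers start at 1.
print(" ".join(nictrace_command("winxp-test", 1, "file.pcap")))
```

The resulting list can be passed to `subprocess.run(...)` while the VM is powered off, since `modifyvm` changes the VM configuration before it starts.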

comment:16 Changed 3 years ago by buffyg

I did further testing, and it looks like the MTU can't be increased above 1496. The packets appear in the trace, but the OS doesn't get them. I'll e-mail the relevant output to you shortly.

comment:17 Changed 3 years ago by buffyg

Debug was sent about two months ago, and I've not heard anything back. I'll check in with #vbox-dev shortly.

comment:18 Changed 3 years ago by aleksey

Sorry for such delay in communication, I got carried away with other stuff. In the packet capture file you've sent me I see both requests and replies. How did you obtain this file? Using internal capture as described in my comment above or by running tcpdump on the host (guest)?

comment:19 Changed 2 years ago by bhamail

I just got bitten by this bug when I moved WinXP and Win7 VMs from VBox on linux to VBox on MacOSX. Here's the link to the forum thread detailing the issue:  https://forums.virtualbox.org/viewtopic.php?f=8&t=46111&p=208586#p208586

Is there any way to get this bug looked into, and hopefully fixed? The Win Registry workaround is nasty and I imagine a number of people give up before finding that workaround. Ideally, Bridged networking should just work...

Anything I can do to help?

comment:20 Changed 2 years ago by ppinter1

I can't believe I blew most of the day on this. Only by the sheerest fluke did I think to try NAT, which led me here.

If I had the skill, I'd fix it myself, but I don't so add me to the queue of those eager to see this ancient bug squashed.

comment:21 Changed 2 years ago by karlw

Same problem here on VB 4.1.8 r75467 with Mac OS X 10.7.3 as host, using Win XP, Win 7, or Win 7 64-bit guests (no problems with NAT).

With bridged networking, using the same guest VDI: the Windows guest on a 2011 MacBook Pro works fine, but the Windows guest on a 2008 Mac Pro does not work.

It works only when I:

  • pick the PCnet-PCI II (Am79C970A) Ethernet card; no other Ethernet card works. With this card I can ping my router etc., but Chrome, IE, etc. do not work properly

  • set the MTU to <1500 (1485 works) with regedit as described; then everything works fine

comment:22 Changed 2 years ago by JessePeterson

Problem confirmed here as well. Mac OS X Server 10.6.8 (v1.1) host running VirtualBox 4.1.8 with a Windows XP SP3 guest using bridged networking using a PCnet III adapter. Adjusted MTU to 1468 in registry and networking started working properly.

comment:23 Changed 2 years ago by JessePeterson

Further testing indicates that the issue occurs regardless of the MTU size. Rather, it is a problem with sending and receiving packets that are larger than the MTU minus 4 bytes. For example, under a default MTU of 1500, if I send a packet that is 1497 bytes (an ICMP ping packet with a 1469-byte payload) to a VirtualBox host, then the packet is reported as an oversized frame. However, if one uses a payload size of 1468 (a 1496-byte packet, which is 4 bytes less than the MTU), then it passes the interface without any issue. The behavior is exactly the same when using an MTU lower than 1500. It might be characterized like this:

IF PKTSZ > (MTU - 4) THEN drop packet oversize (PKTSZ + 18) ELSE pass packet

I'm not sure about the additional 18 bytes; that figure comes from the reporting of NetBSD's wm driver (the Intel 1000 adapter). But it is odd that any oversized packet is reported as 18 bytes larger than the packet.
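The rule above can be written as a small runnable Python sketch. It only restates the observed behavior (drop anything larger than MTU - 4, with the size reported 18 bytes larger); the function name and return strings are illustrative and do not reflect any VirtualBox or NetBSD code.

```python
def bridged_rx_verdict(packet_size: int, mtu: int = 1500) -> str:
    """Model the observed behavior for an incoming IP packet of packet_size bytes."""
    if packet_size > mtu - 4:
        # The guest driver reported the frame as 18 bytes larger than the packet.
        return f"dropped: oversize ({packet_size + 18} bytes reported)"
    return "passed"


print(bridged_rx_verdict(1496))  # ping payload 1468
print(bridged_rx_verdict(1497))  # ping payload 1469
```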

comment:24 Changed 2 years ago by JessePeterson

More testing on different hardware and host OSes has complicated this issue.

HW            | SW                     | Ethernet                                               | Result
MacBookPro1,1 | Mac OS X 10.6.8        | Marvell Yukon Gigabit Adapter 88E85053 ?               | No issue
MacBookPro5,4 | Mac OS X 10.7.3        | Nvidia MCP79 Ethernet                                  | No issue
MacBookPro5,4 | Mac OS X 10.7.3        | Apple USB Ethernet 10/100                              | No issue
MacMini1,1    | Mac OS X Server 10.6.8 | Marvell Yukon Gigabit Adapter 88E85053                 | No issue
Xserve2,1     | Mac OS X Server 10.6.8 | Intel 80003ES2LAN Gigabit Ethernet Controller (Copper) | Problematic!
Xserve2,1     | Mac OS X Server 10.7.3 | Intel 80003ES2LAN Gigabit Ethernet Controller (Copper) | Problematic!

Lights-Out Management has been disabled on this Xserve's interfaces, and there are no VLAN configurations (ideas to track down the 4-byte off issue). I also tried to track down whether the Ethernet controller chip had any issue. But so far it's just the Xserve that is displaying issues for me.

And one last note is that all of this is (mostly) tested under a NetBSD 5.1.2 guest OS. There is limited testing under Windows XP Pro SP3.

comment:25 Changed 2 years ago by aleksey

  • Description modified (diff)

I've just tried debian-based vyatta guest with PCnet III adapter bridged to Intel 80003ES2LAN Gigabit Ethernet Controller (Copper) on OS X Server 10.6.8 host. I can successfully ping the host with ICMP payload sizes from 1468 to 1472, fragmentation disabled.

comment:26 Changed 2 years ago by aleksey

I was able to reproduce the problem with hardware mentioned above, pinging external host. Thanks a lot, Jesse!

comment:27 Changed 2 years ago by JessePeterson

Fantastic. What was the difference between your initial test and the test that reproduced the problem? Just the hardware? How can I help?

comment:28 Changed 2 years ago by aleksey

It was just the hardware. It turns out that Intel 80003ES2LAN does not strip the FCS before passing the packet up the stack, so the packets coming from the wire were rejected by our PCnet as oversized. You have already helped immensely by pinpointing the hardware. You can also help with fix verification. The fix will be included in the next maintenance release.
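To illustrate why an unstripped FCS produces the "MTU minus 4" cutoff seen earlier in this ticket, here is a rough Python model. It assumes a 14-byte Ethernet header, a 4-byte FCS, and an emulated adapter that accepts frames of at most header + MTU bytes; that acceptance limit is an assumption made for the sketch, not something taken from the VirtualBox source.

```python
ETH_HEADER = 14  # destination MAC + source MAC + EtherType
FCS = 4          # frame check sequence (CRC32) trailer


def frame_length(ip_packet_size: int, fcs_stripped: bool) -> int:
    """Length of the frame handed up the stack by the host NIC."""
    return ETH_HEADER + ip_packet_size + (0 if fcs_stripped else FCS)


def accepted(ip_packet_size: int, fcs_stripped: bool, mtu: int = 1500) -> bool:
    """Assumed check: the emulated adapter rejects frames longer than header + MTU."""
    return frame_length(ip_packet_size, fcs_stripped) <= ETH_HEADER + mtu


print(accepted(1500, fcs_stripped=True))    # normal NIC: full-size packet fits
print(accepted(1496, fcs_stripped=False))   # FCS left on: MTU - 4 still fits
print(accepted(1497, fcs_stripped=False))   # FCS left on: one byte more is oversized
print(frame_length(1497, fcs_stripped=False) - 1497)  # header + FCS overhead
```

Under these assumptions the overhead is 14 + 4 = 18 bytes, which would be consistent with the "PKTSZ + 18" size reported in comment:23, although the ticket does not confirm that interpretation.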

comment:29 Changed 2 years ago by JessePeterson

Firstly thanks so much for looking at and resolving this issue! Can you link to the source commit? E.g. If I wanted to implement this change now how might I do that? It looks like maintenance releases are released every month or so but 4.1.12 just came out - is there a timeline for the next release? Thanks again!

comment:30 Changed 2 years ago by aleksey

  • Summary changed from OSX Bridged Networking unreliable due to incorrect MTU setting! to OSX Bridged Networking unreliable due to incorrect MTU setting! => Fixed in SVN

comment:31 Changed 2 years ago by aleksey

You can download test build from  http://www.virtualbox.org/download/testcase/VirtualBox-4.1.13-77269-OSX.dmg . If you want to build OSE version yourself here is the link to the revision containing the fix: https://www.virtualbox.org/changeset/40764/vbox .

comment:32 Changed 2 years ago by JessePeterson

Installed 4.1.13-77269 from the above link, and indeed a WinXP Pro SP3 guest using a PCnet-FAST III adapter seems fixed. Following the original report's test, webpages now load without issue. The same is true of a NetBSD guest using the PCnet adapter, too. However:

Using the Intel PRO/1000 adapter continues to have the same issues using a NetBSD guest and the above hardware. Perhaps the Intel drivers need to be adjusted as well?

comment:33 Changed 2 years ago by aleksey

Intel PRO/1000 issue should be fixed in  this test build. Could you give it a try?

Last edited 2 years ago by frank

comment:34 Changed 2 years ago by JessePeterson

Issue appears to be fixed! I think this is r40799 . Thank you very much!

comment:35 Changed 2 years ago by frank

  • Status changed from new to closed
  • Resolution set to fixed

Fix included in VBox 4.1.14.
