VirtualBox

Opened 8 years ago

Last modified 6 years ago

#16084 reopened defect

ssl connection incorrectly reset when using NAT

Reported by: exg
Owned by:
Component: network/NAT
Version: VirtualBox 5.0.28
Keywords:
Cc:
Guest type: Linux
Host type: Mac OS X

Description (last modified by Frank Mehnert)

After upgrading VirtualBox from version 5.0.26 to 5.0.28 on OS X 10.11.6, I noticed that SSL connections created in Python with urllib2.urlopen are incorrectly reset on a Debian 8 guest with a single network interface in NAT mode. I attached a minimal Python script that almost always fails with the following traceback:

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
Failed
Traceback (most recent call last):
  File "./test.py", line 12, in <module>
    shutil.copyfileobj(xact, pipe)
  File "/usr/lib/python2.7/shutil.py", line 49, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/httplib.py", line 602, in read
    s = self.fp.read(amt)
  File "/usr/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/ssl.py", line 714, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 608, in read
    v = self._sslobj.read(len or 1024)
socket.error: [Errno 104] Connection reset by peer

I also attached the tcpdump output from both the host and the guest. The issue seems to occur only when the network traffic goes through the host's Thunderbolt Ethernet adapter. If I switch to Wi-Fi, the issue does not occur. Moreover, I am unable to reproduce the problem with curl or wget.
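For readers without access to the attached test.py, the failure mode can be approximated with a short Python 3 sketch. This is not the attached script: the traceback shows the original copying the HTTPS response into a pipe, whereas this version swaps in the standard tarfile module to read the response as a gzip-compressed tar stream. The function names list_tar_gz_stream and fetch_and_list are hypothetical, and the URL is left to the caller.

```python
import tarfile
import urllib.request


def list_tar_gz_stream(stream):
    """Read a gzip-compressed tar stream and collect member names.

    Returns (names, error). `error` is the exception that interrupted
    reading, or None if the stream was consumed without an exception.
    """
    names = []
    try:
        # "r|gz" reads the archive sequentially, without seeking,
        # which is what a network response requires.
        with tarfile.open(fileobj=stream, mode="r|gz") as archive:
            for member in archive:
                names.append(member.name)
    except (OSError, EOFError, tarfile.TarError) as exc:
        return names, exc
    return names, None


def fetch_and_list(url):
    """Download `url` and list the archive while it is being streamed."""
    with urllib.request.urlopen(url) as response:
        return list_tar_gz_stream(response)
```

Calling fetch_and_list(url) inside the guest with the HTTPS URL of a .tar.gz file returns the names read so far together with any exception. A reset like the one in the traceback above would be expected to surface as an OSError subclass, but that is an assumption about how the failure propagates in Python 3, not something the ticket confirms.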

Attachments (5)

test.py (341 bytes) - added by exg 8 years ago.
testcase
test-guest.log (69.3 KB) - added by exg 8 years ago.
tcpdump on guest
test-host.log (108.8 KB) - added by exg 8 years ago.
tcpdump on host
VBox.log (122.6 KB) - added by exg 8 years ago.
VBox.2.log (119.1 KB) - added by M.Poil 8 years ago.
VBOX.log - SSL KO on website that use keepalive Off


Change History (49)

by exg, 8 years ago

Attachment: test.py added

testcase

comment:1 by M.Poil, 8 years ago

Hi,

Same bug on 5.1.8 (Have no problem on 5.1.6), on a Windows Host (Linux as Guest)

Best regards


comment:2 by Frank Mehnert, 8 years ago

Description: modified (diff)

comment:3 by Valery Ushakov, 8 years ago

Please, can you provide actual packet captures, not just the text output from tcpdump?

by exg, 8 years ago

Attachment: test-guest.log added

tcpdump on guest

by exg, 8 years ago

Attachment: test-host.log added

tcpdump on host

comment:4 by exg, 8 years ago

test-{guest,host}.log now contain the packet captures.

comment:5 by Valery Ushakov, 8 years ago

Please, can you try this with a recent test build. Before running the test, please, enable extra logging with

VBoxManage debugvm "..." log --release "+drv_nat.l2"

You should see log lines similar to

NAT: sockerr 0, shuterr 107 - socket 21 (tcp) ...

in your VBox.log. Please, attach that log file.

I've added that instrumentation to both 5.0 and 5.1 test builds, so you can use whichever is convenient for you.

Thanks.
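When the logging has to be enabled for several VMs, the command above can be wrapped in a small Python helper. This is only a sketch: the function names are hypothetical, the VM name is whatever replaces the "..." placeholder, and VBoxManage is assumed to be on the PATH. The arguments are exactly the ones given in this comment.

```python
import subprocess


def nat_logging_command(vm_name):
    """Build the argument list for the debugvm command quoted above."""
    return ["VBoxManage", "debugvm", vm_name, "log", "--release", "+drv_nat.l2"]


def enable_nat_logging(vm_name):
    """Run the command for one VM; raises CalledProcessError on failure."""
    subprocess.run(nat_logging_command(vm_name), check=True)
```

Passing the arguments as a list avoids shell quoting problems with VM names that contain spaces.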

by exg, 8 years ago

Attachment: VBox.log added

comment:6 by exg, 8 years ago

Here is the log file. It contains two sockerr lines, corresponding to two test runs. I used test build 5.0.29-111443.

comment:7 by A_User_Called_M, 8 years ago

Same issue on: Windows host, FreeBSD guest.

comment:8 by Socratis, 8 years ago

comment:9 by CeDeROM, 8 years ago

The same here :-) Also reported https://www.virtualbox.org/ticket/16126

comment:10 by quarkdoll, 8 years ago

This happens for me as well, macOS Sierra 10.12.1 host (trying both 5.0.28 and 5.1.8), Windows 7 guest.

This was not happening for me with OS X 10.10.5 and 5.0.28.

Also, this happens for me only when a Juno Pulse VPN client is connected on the host system, and only for port 80 traffic from the guest system to destinations within the VPN network. It does not happen for port 80 traffic from the guest system to external web sites (with the VPN up or down).

Though my experience has nothing to do with SSL, I'm writing this here because #16103 was closed as a duplicate of this.

in reply to:  description comment:11 by antsbull, 8 years ago

Hi, we use VBox across our company and have found the exact same behaviour with the latest 5.1 and 5.0 releases on Mac OS X 10.11 hosts with Windows (XP, 7 and 10) guests: all Java SSL connections get reset and end up failing. We have had everyone roll VBox back to 5.0.26 to resolve the issue.

comment:12 by Valery Ushakov, 8 years ago

Please, can you test a recent 5.1 test build (111724 or later)?

comment:13 by Socratis, 8 years ago

So far the tests seem promising. I checked a couple of websites mentioned in the discussion on the forums and they work OK. I'll post in the discussion thread so you'll have more points.

comment:14 by joelpittet, 8 years ago

Same bug on 5.1.8 (no problem on 5.1.6), on an OSX Sierra host (CentOS as guest).

comment:15 by Socratis, 8 years ago

@joelpittet

Unless you missed it, there is a test build that seems to fix the problem. Since you reported after the test build came out, can you try with the test build, please?

comment:16 by Socratis, 8 years ago

Another reported problem has been fixed with the test build.

Source: Discussion of the bug in the forums.

comment:17 by exg, 8 years ago

vushakov, could you please provide a 5.0 test build? I would prefer to not update to 5.1 at this time.

in reply to:  17 ; comment:18 by Valery Ushakov, 8 years ago

Replying to exg:

vushakov, could you please provide a 5.0 test build? I would prefer to not update to 5.1 at this time.

Please, try 5.0 test builds r111787 or later.

comment:19 by magnetik, 8 years ago

I'm having the same issue, also discussed here: https://forums.virtualbox.org/viewtopic.php?f=3&t=80396&p=377661

Running test build r111724 on a Windows 10 host with an Ubuntu 16.04 guest.

The VM is using the following network cards: NAT, bridged and private.

I've enabled the NAT debug log, and I see these errors:

02:39:27.689427 NAT: Guest address guess 10.0.2.15 re-confirmed by arp request
02:39:42.015342 NAT: sockerr 10058, shuterr 0 - socket 740 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33922 name=192.168.14.180:50347
02:39:43.367597 NAT: sockerr 10058, shuterr 0 - socket 4032 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33926 name=192.168.14.180:50348
02:39:43.455799 NAT: sockerr 10058, shuterr 0 - socket 3772 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33928 name=192.168.14.180:50349
02:39:44.406831 NAT: sockerr 10058, shuterr 0 - socket 2024 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33932 name=192.168.14.180:50350
02:39:44.495414 NAT: sockerr 10058, shuterr 0 - socket 3476 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33934 name=192.168.14.180:50351
02:41:40.679464 NAT: Guest address guess 10.0.2.15 re-confirmed by arp request
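Lines like the ones above are easier to compare across runs once they are split into fields. The following Python sketch does that with a regular expression inferred only from the sample lines in this ticket, so the pattern and the function names are assumptions, not a description of the actual log format. The two entries in KNOWN_CODES are standard values (10058 is WSAESHUTDOWN in Winsock, 107 is ENOTCONN on Linux); what a code means depends on the host OS.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in this ticket (an assumption).
SOCKERR_RE = re.compile(
    r"(?:(?P<time>\d+:\d+:\d+\.\d+) )?NAT: sockerr (?P<sockerr>\d+), "
    r"shuterr (?P<shuterr>\d+) - socket (?P<socket>\d+) \((?P<proto>\w+)\)"
    r"(?:.*?state=(?P<state>\S+))?"
    r"(?:.*?f_\(addr:port\)=(?P<foreign>\S+))?"
    r"(?:.*?l_\(addr:port\)=(?P<local>\S+))?"
)

# Host-OS-specific error codes seen in this ticket.
KNOWN_CODES = {10058: "WSAESHUTDOWN (Winsock)", 107: "ENOTCONN (Linux)"}


def parse_sockerr_line(line):
    """Return a dict of fields for a 'NAT: sockerr' line, else None."""
    match = SOCKERR_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    for key in ("sockerr", "shuterr", "socket"):
        record[key] = int(record[key])
    return record


def summarize(lines):
    """Count how often each (sockerr, shuterr) pair occurs."""
    counts = Counter()
    for line in lines:
        record = parse_sockerr_line(line)
        if record is not None:
            counts[(record["sockerr"], record["shuterr"])] += 1
    return counts
```

Feeding the lines of VBox.log to summarize gives a quick count per error pair, which can be attached alongside the full log.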

VM ifconfig

> % ifconfig
enp0s3    Link encap:Ethernet  HWaddr 02:0d:4d:0a:20:b8
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::d:4dff:fe0a:20b8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:108671 errors:0 dropped:0 overruns:0 frame:0
          TX packets:42261 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:101743905 (101.7 MB)  TX bytes:4264181 (4.2 MB)

enp0s8    Link encap:Ethernet  HWaddr 08:00:27:5c:2e:f1
          inet addr:192.168.14.123  Bcast:192.168.14.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe5c:2ef1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8882 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5460 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1118705 (1.1 MB)  TX bytes:7070265 (7.0 MB)

enp0s9    Link encap:Ethernet  HWaddr 08:00:27:8e:3c:a1
          inet addr:192.168.33.10  Bcast:192.168.33.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe8e:3ca1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:10054 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6828 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7156627 (7.1 MB)  TX bytes:1608819 (1.6 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:532 errors:0 dropped:0 overruns:0 frame:0
          TX packets:532 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:53402 (53.4 KB)  TX bytes:53402 (53.4 KB)


To me it's not related to SSL, as I also have the issue on non-SSL websites.

in reply to:  18 comment:20 by exg, 8 years ago

Replying to vushakov:

Replying to exg:

vushakov, could you please provide a 5.0 test build? I would prefer to not update to 5.1 at this time.

Please, try 5.0 test builds r111787 or later.

Seems to work fine, thank you!

comment:21 by aim, 8 years ago

I have the same issue for these builds:
https://www.virtualbox.org/download/testcase/VirtualBox-5.1.9-111724-Win.exe
https://www.virtualbox.org/download/testcase/Oracle_VM_VirtualBox_Extension_Pack-5.1.9-111724.vbox-extpack

Host is Windows XP connected via ethernet adapter.
Guest is arch linux 64bit
virtualbox build 111724

cormorant[~/tmp]$ /usr/bin/python2 test.py 
appres-1.0.0/
appres-1.0.0/config.guess
appres-1.0.0/missing
appres-1.0.0/NEWS
appres-1.0.0/config.h.in
appres-1.0.0/README
appres-1.0.0/COPYING
appres-1.0.0/Makefile.am
appres-1.0.0/AUTHORS
appres-1.0.0/compile
appres-1.0.0/depcomp
appres-1.0.0/config.sub
appres-1.0.0/ChangeLog
appres-1.0.0/install-sh
appres-1.0.0/appres.c
appres-1.0.0/INSTALL
appres-1.0.0/configure

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
Failed
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    shutil.copyfileobj(xact, pipe)
  File "/usr/lib/python2.7/shutil.py", line 49, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python2.7/socket.py", line 384, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/httplib.py", line 612, in read
    s = self.fp.read(amt)
  File "/usr/lib/python2.7/socket.py", line 384, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/ssl.py", line 756, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 643, in read
    v = self._sslobj.read(len)
socket.error: [Errno 104] Connection reset by peer
cormorant[~/tmp]$ uname -a
Linux cormorant.crtdev.local 4.8.6-1-ARCH #1 SMP PREEMPT Mon Oct 31 18:51:30 CET 2016 x86_64 GNU/Linux

The same guest works correctly for this configuration
Host is MAC OS connected via wifi adapter.
Guest is arch linux 64bit
virtualbox build 111374

aim-server[~/tmp]$ /usr/bin/python2 test.py
appres-1.0.0/
appres-1.0.0/config.guess
appres-1.0.0/missing
appres-1.0.0/NEWS
appres-1.0.0/config.h.in
appres-1.0.0/README
appres-1.0.0/COPYING
appres-1.0.0/Makefile.am
appres-1.0.0/AUTHORS
appres-1.0.0/compile
appres-1.0.0/depcomp
appres-1.0.0/config.sub
appres-1.0.0/ChangeLog
appres-1.0.0/install-sh
appres-1.0.0/appres.c
appres-1.0.0/INSTALL
appres-1.0.0/configure
appres-1.0.0/aclocal.m4
appres-1.0.0/appres.man
appres-1.0.0/mkinstalldirs
appres-1.0.0/Makefile.in
appres-1.0.0/configure.ac
aim-server[~/tmp]$ uname -a
Linux aim-server.crtdev.local 4.8.4-1-ARCH #1 SMP PREEMPT Sat Oct 22 18:26:57 CEST 2016 x86_64 GNU/Linux
aim-server[~/tmp]$ 

comment:22 by Valery Ushakov, 8 years ago

For the problem on Windows, please, can you try test builds:

  • 5.1 - r111846
  • 5.0 - r111848

in reply to:  22 comment:23 by aim, 8 years ago

It works now. Thanks!

Replying to vushakov:

For the problem on Windows, please, can you try test builds:

  • 5.1 - r111846
  • 5.0 - r111848

in reply to:  22 comment:24 by magnetik, 8 years ago

Replying to vushakov:

For the problem on Windows, please, can you try test builds:

  • 5.1 - r111846
  • 5.0 - r111848

Works for me too!

comment:25 by DrChaos, 8 years ago

I am also experiencing this problem. 5.1.6 works, 5.1.8 and 5.1.9-111374 fail with intermittent prematurely closed connections, especially during times when many connections are opened quickly, most prominently using Maven in a Java project when it wants to download a number of small files from a Nexus artifact server.

5.1.8 with bridged networking instead of NAT works when connected at the office (wired ethernet), but fails (as expected) when at home with a VPN to corporate servers.

Host: Windows 7. Guest: Scientific Linux (==Centos like) 6.8.
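The pattern described here, many short connections opened in quick succession, can be exercised without Maven. The Python sketch below is an assumption about how one might test it, not something used in this ticket: a throwaway server sends a fixed payload and closes, and a client counts connections that fail or return fewer bytes than expected. All names and the payload size are hypothetical. To exercise the NAT path, the server part would have to run on a machine outside the guest while the client runs inside it.

```python
import socket
import socketserver
import threading

PAYLOAD = b"x" * 65536  # hypothetical fixed-size response body


class OneShotHandler(socketserver.BaseRequestHandler):
    """Send PAYLOAD once; the server then closes the connection."""

    def handle(self):
        self.request.sendall(PAYLOAD)


def start_server(host="127.0.0.1", port=0):
    """Start the server in a background thread (port 0 = ephemeral port)."""
    server = socketserver.ThreadingTCPServer((host, port), OneShotHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def count_failed_connections(host, port, attempts, expected_len):
    """Open `attempts` connections back to back and count bad ones.

    A connection counts as failed if it raises OSError (reset, timeout)
    or if it ends before `expected_len` bytes have been received.
    """
    failures = 0
    for _ in range(attempts):
        received = 0
        try:
            with socket.create_connection((host, port), timeout=10) as conn:
                while True:
                    chunk = conn.recv(65536)
                    if not chunk:
                        break
                    received += len(chunk)
        except OSError:
            failures += 1
            continue
        if received != expected_len:
            failures += 1
    return failures
```

A non-zero result from count_failed_connections when run from the guest, with zero failures when the same client runs on the host, would point at the NAT layer rather than at the server.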

comment:26 by jbarnett, 8 years ago

I was experiencing a problem when provisioning a VirtualBox VM using Vagrant (which actually just calls a Chocolatey install) with VirtualBox 5.1.8. I found this thread on the Chocolatey project where multiple people were reporting the same issue: https://github.com/chocolatey/choco/issues/1029. I tested with 5.1.9 r111846 and it resolved the issue.

I hope to see this fix included with the next VirtualBox release.

comment:27 by Richlv, 8 years ago

can confirm that 5.0.28 (macos host, linux guest) had network issues; "5.0.x revision 111848" works ok

comment:28 by SeanC, 8 years ago

When can we expect a release containing this fix?

comment:29 by M.Poil, 8 years ago

This is a partial fix, when using a webserver without keepalive we still have the problem.

in reply to:  29 comment:30 by Valery Ushakov, 8 years ago

Replying to M.Poil:

This is a partial fix, when using a webserver without keepalive we still have the problem.

Please, can you provide host- and guest-side captures of a failing connection along with VBox.log (see comment:5).

by M.Poil, 8 years ago

Attachment: VBox.2.log added

VBOX.log - SSL KO on website that use keepalive Off

comment:31 by M.Poil, 8 years ago

Note : On my VBOX.log, https target is 89.185.36.50

Best regards,

comment:32 by Valery Ushakov, 8 years ago

Replying to M.Poil:

This is a partial fix, when using a webserver without keepalive we still have the problem.

Your log file says you are using r111724, not r111846 or later as comment:22 asks. Please, can you test with the correct test build version?

comment:33 by Zirneklitis, 8 years ago

VirtualBox 5.1.8 r111374 works as expected using Fedora 23 as host and Windows XP as a guest system.

The same virtual image in VirtualBox 5.1.8 r111374 has LAN problems using Windows 10 (v. 1607) as host and Windows XP as a guest system.

comment:34 by Zirneklitis, 8 years ago

In my case (host: Windows 10 v. 1607, guest: Windows XP), upgrading to version 5.1.9 r111896 (Qt5.6.2) solved the LAN issue.

comment:35 by M.Poil, 8 years ago

Oh sorry, it's working fine with r111846

Best regards

comment:36 by Valery Ushakov, 8 years ago

Please, if you can, give a try to test builds: 5.1 r111957+, or 5.0 r111959+ and report any regressions.

in reply to:  36 comment:37 by mcsplain29, 8 years ago

Replying to vushakov:

Please, if you can, give a try to test builds: 5.1 r111957+, or 5.0 r111959+ and report any regressions.

r111957 fixed the issue for me. Thanks

comment:38 by Frank Mehnert, 8 years ago

Resolution: fixed
Status: new → closed

Fix is part of VBox 5.1.10.

comment:39 by socrates, 8 years ago

Resolution: fixed
Status: closed → reopened

The same bug occurred to me in Virtual Box Version 5.1.12 r112440, on Windows platform


in reply to:  39 comment:40 by Valery Ushakov, 8 years ago

Replying to socrates:

The same bug occurred to me in Virtual Box Version 5.1.12 r112440, on Windows platform

Any more details? Like from extra logging from comment:5?

comment:41 by Valery Ushakov, 7 years ago

Resolution: fixed
Status: reopened → closed

No feedback.

comment:42 by bartmcleod, 6 years ago

Resolution: fixed
Status: closed → reopened

comment:43 by bartmcleod, 6 years ago

I reopened the issue because NAT networking with version 5.2.6 on Mac OS X is unstable: the network gets reset every so many minutes or connections (not sure which). I experienced this while running a long-running Ansible playbook against a CentOS 7 guest. The same playbook runs fine with bridged networking, but we don't want that, because we do not want the VM to be on the company network.

Unfortunately, downgrading to 5.0.26 does not seem to help.

With version 5.0.26, however, the SSH connection stays up even when Ansible seems to hang. Also, I do not see the reset messages I used to see in the terminal of the CentOS guest.


comment:44 by Socratis, 6 years ago

@bartmcleod
Did you happen to read about the additional logging required? From comment:5?


enable extra logging with

VBoxManage debugvm "..." log --release "+drv_nat.l2"

You should see log lines similar to

NAT: sockerr 0, shuterr 107 - socket 21 (tcp) ...

in your VBox.log. Please, attach that log file.
