VirtualBox

Ticket #16084 (reopened defect)

Opened 11 months ago

Last modified 8 months ago

ssl connection incorrectly reset when using NAT

Reported by: exg
Owned by:
Priority: major
Component: network/NAT
Version: VirtualBox 5.0.28
Keywords:
Cc:
Guest type: Linux
Host type: Mac OS X

Description (last modified by frank)

After upgrading VirtualBox from version 5.0.26 to 5.0.28 on OS X 10.11.6, I noticed that SSL connections created in Python with urllib2.urlopen are incorrectly reset on a Debian 8 guest with a single network interface in NAT mode. I attached a minimal Python script that almost always fails with the following output and traceback:

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
Failed
Traceback (most recent call last):
  File "./test.py", line 12, in <module>
    shutil.copyfileobj(xact, pipe)
  File "/usr/lib/python2.7/shutil.py", line 49, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/httplib.py", line 602, in read
    s = self.fp.read(amt)
  File "/usr/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/ssl.py", line 714, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 608, in read
    v = self._sslobj.read(len or 1024)
socket.error: [Errno 104] Connection reset by peer

I also attached the tcpdump output from both the host and the guest. The issue seems to occur only when the network traffic goes through the host's Thunderbolt Ethernet adapter. If I switch to Wi-Fi, the issue does not occur. Moreover, I am unable to reproduce the problem with curl or wget.
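The attached test.py is not reproduced in this ticket text. As a rough illustration of the kind of test described (stream an HTTPS download straight into a tar reader and list the members), here is a minimal sketch in Python 3. It is not the reporter's script: the function names are hypothetical, and it uses the standard tarfile module in stream mode instead of piping to an external tar process.

```python
import tarfile
import urllib.request


def list_tar_stream(fileobj):
    """Return member names from a gzip-compressed tar stream, read sequentially."""
    names = []
    # "r|gz" treats fileobj as a non-seekable stream, much like a pipe into tar.
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        for member in archive:
            names.append(member.name)
    return names


def list_remote_tarball(url, timeout=30):
    """Download url and list the archive members while the data is streaming.

    If the connection is reset mid-transfer, the failure surfaces here as a
    socket-level exception or as a truncated-archive error from tarfile.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return list_tar_stream(response)
```

Calling list_remote_tarball() repeatedly on an HTTPS URL of a .tar.gz file from inside the guest, and comparing the results with the same call over a different host adapter, would mirror the comparison made in the description.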

Attachments

test.py (341 bytes) - added by exg 11 months ago.
  testcase
test-guest.log (69.3 KB) - added by exg 11 months ago.
  tcpdump on guest
test-host.log (108.8 KB) - added by exg 11 months ago.
  tcpdump on host
VBox.log (122.6 KB) - added by exg 11 months ago.
VBox.2.log (119.1 KB) - added by M.Poil 10 months ago.
  VBOX.log - SSL KO on website that use keepalive Off

Change History

Changed 11 months ago by exg

testcase

comment:1 Changed 11 months ago by M.Poil

Hi,

Same bug on 5.1.8 (no problem on 5.1.6), on a Windows host with a Linux guest.

Best regards

Last edited 11 months ago by M.Poil

comment:2 Changed 11 months ago by frank

  • Description modified

comment:3 Changed 11 months ago by vushakov

Please, can you provide actual packet captures, not just the text output from tcpdump?

Changed 11 months ago by exg

tcpdump on guest

Changed 11 months ago by exg

tcpdump on host

comment:4 Changed 11 months ago by exg

test-{guest,host}.log now contain the packet captures.

comment:5 Changed 11 months ago by vushakov

Please, can you try this with a recent test build? Before running the test, please, enable extra logging with

VBoxManage debugvm "..." log --release "+drv_nat.l2"

You should see log lines similar to

NAT: sockerr 0, shuterr 107 - socket 21 (tcp) ...

in your VBox.log. Please, attach that log file.
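To pick those lines out of a long VBox.log, a small filter can help. The following is a minimal sketch in Python; the regular expression is based only on the sample line quoted above, and the function name is hypothetical.

```python
import re

# Pattern derived from the sample line in this comment; real log lines
# may carry additional fields after the protocol.
SOCKERR_RE = re.compile(
    r"NAT: sockerr (\d+), shuterr (\d+) - socket (\d+) \((\w+)\)"
)


def find_sockerr_lines(lines):
    """Return one dict per log line that matches the NAT sockerr pattern."""
    results = []
    for line in lines:
        match = SOCKERR_RE.search(line)
        if match:
            results.append({
                "sockerr": int(match.group(1)),
                "shuterr": int(match.group(2)),
                "socket": int(match.group(3)),
                "proto": match.group(4),
            })
    return results
```

For example, passing the lines of an open log file (for line in open("VBox.log")) to find_sockerr_lines() yields the error codes per socket, which makes it easy to count how many resets a test run produced.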

I've added that instrumentation to both 5.0 and 5.1 test builds, so you can use whichever is convenient for you.

Thanks.

Changed 11 months ago by exg

comment:6 Changed 11 months ago by exg

Here is the log file. It contains two sockerr lines, corresponding to two test runs. I used test build 5.0.29-111443.

comment:7 Changed 11 months ago by A_User_Called_M

Same issue with a Windows host and a FreeBSD guest.

comment:8 Changed 11 months ago by socratis

comment:9 Changed 11 months ago by CeDeROM

The same here :-) Also reported https://www.virtualbox.org/ticket/16126

comment:10 Changed 11 months ago by quarkdoll

This happens for me as well, macOS Sierra 10.12.1 host (trying both 5.0.28 and 5.1.8), Windows 7 guest.

This was not happening for me with OS X 10.10.5 and 5.0.28.

Also, this happens for me only with a Juno Pulse VPN client connected on the host system and port 80 traffic from the guest system to destinations within the VPN network. Port 80 traffic from the guest system to external web sites is not affected (with the VPN up or down).

Though my experience has nothing to do with SSL, I'm writing this here because #16103 was closed as a duplicate of this.

comment:11 in reply to: ↑ description Changed 11 months ago by antsbull

Hi, we use VBox across our company and have found exactly the same behaviour with the latest 5.1 and 5.0 releases on Mac OS X 10.11 hosts with Windows (XP, 7 and 10) guests: all Java SSL connections get reset and end up failing. We have had everyone roll VBox back to 5.0.26 to resolve the issue.

comment:12 Changed 11 months ago by vushakov

Please, can you test a recent 5.1 test build (111724 or later)?

comment:13 Changed 11 months ago by socratis

So far the tests seem promising. I checked a couple of websites mentioned in the discussion on the forums and they work OK. I'll post in the discussion thread so you'll have more data points.

comment:14 Changed 11 months ago by joelpittet

Same bug on 5.1.8 (no problem on 5.1.6), on a macOS Sierra host with a CentOS guest.

comment:15 Changed 11 months ago by socratis

@joelpittet

Unless you missed it, there is a test build that seems to fix the problem. Since you reported after the test build came out, can you try with the test build, please?

comment:16 Changed 11 months ago by socratis

Another reported problem has been fixed with the test build.

Source: Discussion of the bug in the forums.

comment:17 follow-up: ↓ 18 Changed 11 months ago by exg

vushakov, could you please provide a 5.0 test build? I would prefer to not update to 5.1 at this time.

comment:18 in reply to: ↑ 17 ; follow-up: ↓ 20 Changed 11 months ago by vushakov

Replying to exg:

vushakov, could you please provide a 5.0 test build? I would prefer to not update to 5.1 at this time.

Please, try 5.0 test builds r111787 or later.

comment:19 Changed 11 months ago by magnetik

I'm having the same issue, also discussed here: https://forums.virtualbox.org/viewtopic.php?f=3&t=80396&p=377661

Running test build r111724 on a Windows 10 host with an Ubuntu 16.04 guest.

The VM is using the following network cards: NAT, bridged and private.

I've enabled the NAT debug log, and I see these errors:

02:39:27.689427 NAT: Guest address guess 10.0.2.15 re-confirmed by arp request
02:39:42.015342 NAT: sockerr 10058, shuterr 0 - socket 740 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33922 name=192.168.14.180:50347
02:39:43.367597 NAT: sockerr 10058, shuterr 0 - socket 4032 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33926 name=192.168.14.180:50348
02:39:43.455799 NAT: sockerr 10058, shuterr 0 - socket 3772 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33928 name=192.168.14.180:50349
02:39:44.406831 NAT: sockerr 10058, shuterr 0 - socket 2024 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33932 name=192.168.14.180:50350
02:39:44.495414 NAT: sockerr 10058, shuterr 0 - socket 3476 (tcp) exp. in 0 state=SS_ISFCONNECTED|SS_FCANTRCVMORE fUnderPolling f_(addr:port)=87.98.253.214:443 l_(addr:port)=10.0.2.15:33934 name=192.168.14.180:50351
02:41:40.679464 NAT: Guest address guess 10.0.2.15 re-confirmed by arp request
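When many such lines accumulate, it can help to group them by remote endpoint and socket state. The sketch below, in Python, does that for lines shaped like the ones above. The regular expression is inferred from these samples only, and the function name is hypothetical. As a side note, on a Windows host the value 10058 matches the Winsock code WSAESHUTDOWN, assuming the log reports the socket error unmodified.

```python
import re
from collections import Counter

# Field layout inferred from the sample lines in this comment.
DETAIL_RE = re.compile(
    r"NAT: sockerr (?P<sockerr>\d+), shuterr (?P<shuterr>\d+) .*?"
    r"state=(?P<state>\S+) .*?"
    r"f_\(addr:port\)=(?P<foreign>\S+) "
    r"l_\(addr:port\)=(?P<local>\S+)"
)


def summarize_sockerr(lines):
    """Count matching log lines per foreign endpoint and per state flag."""
    by_foreign = Counter()
    by_flag = Counter()
    for line in lines:
        match = DETAIL_RE.search(line)
        if not match:
            continue  # skip unrelated lines such as the ARP re-confirmation
        by_foreign[match.group("foreign")] += 1
        # The state field is a "|"-separated list of flags.
        for flag in match.group("state").split("|"):
            by_flag[flag] += 1
    return by_foreign, by_flag
```

Run over the excerpt above, this would attribute every error to the same foreign endpoint on port 443, which is one quick way to see whether resets cluster on a single server.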

VM ifconfig

> % ifconfig
enp0s3    Link encap:Ethernet  HWaddr 02:0d:4d:0a:20:b8
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::d:4dff:fe0a:20b8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:108671 errors:0 dropped:0 overruns:0 frame:0
          TX packets:42261 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:101743905 (101.7 MB)  TX bytes:4264181 (4.2 MB)

enp0s8    Link encap:Ethernet  HWaddr 08:00:27:5c:2e:f1
          inet addr:192.168.14.123  Bcast:192.168.14.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe5c:2ef1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8882 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5460 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1118705 (1.1 MB)  TX bytes:7070265 (7.0 MB)

enp0s9    Link encap:Ethernet  HWaddr 08:00:27:8e:3c:a1
          inet addr:192.168.33.10  Bcast:192.168.33.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe8e:3ca1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:10054 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6828 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7156627 (7.1 MB)  TX bytes:1608819 (1.6 MB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:532 errors:0 dropped:0 overruns:0 frame:0
          TX packets:532 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:53402 (53.4 KB)  TX bytes:53402 (53.4 KB)


To me it's not related to SSL, as I also have the issue on non-SSL websites.

comment:20 in reply to: ↑ 18 Changed 11 months ago by exg

Replying to vushakov:

Replying to exg:

vushakov, could you please provide a 5.0 test build? I would prefer to not update to 5.1 at this time.

Please, try 5.0 test builds r111787 or later.

Seems to work fine, thank you!

comment:21 Changed 11 months ago by aim

I have the same issue for these builds:
https://www.virtualbox.org/download/testcase/VirtualBox-5.1.9-111724-Win.exe
https://www.virtualbox.org/download/testcase/Oracle_VM_VirtualBox_Extension_Pack-5.1.9-111724.vbox-extpack

Host is Windows XP connected via ethernet adapter.
Guest is arch linux 64bit
virtualbox build 111724

cormorant[~/tmp]$ /usr/bin/python2 test.py 
appres-1.0.0/
appres-1.0.0/config.guess
appres-1.0.0/missing
appres-1.0.0/NEWS
appres-1.0.0/config.h.in
appres-1.0.0/README
appres-1.0.0/COPYING
appres-1.0.0/Makefile.am
appres-1.0.0/AUTHORS
appres-1.0.0/compile
appres-1.0.0/depcomp
appres-1.0.0/config.sub
appres-1.0.0/ChangeLog
appres-1.0.0/install-sh
appres-1.0.0/appres.c
appres-1.0.0/INSTALL
appres-1.0.0/configure

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
Failed
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    shutil.copyfileobj(xact, pipe)
  File "/usr/lib/python2.7/shutil.py", line 49, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python2.7/socket.py", line 384, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/httplib.py", line 612, in read
    s = self.fp.read(amt)
  File "/usr/lib/python2.7/socket.py", line 384, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/ssl.py", line 756, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 643, in read
    v = self._sslobj.read(len)
socket.error: [Errno 104] Connection reset by peer
cormorant[~/tmp]$ uname -a
Linux cormorant.crtdev.local 4.8.6-1-ARCH #1 SMP PREEMPT Mon Oct 31 18:51:30 CET 2016 x86_64 GNU/Linux

The same guest works correctly with this configuration:
Host is Mac OS connected via Wi-Fi adapter.
Guest is arch linux 64bit
virtualbox build 111374

aim-server[~/tmp]$ /usr/bin/python2 test.py
appres-1.0.0/
appres-1.0.0/config.guess
appres-1.0.0/missing
appres-1.0.0/NEWS
appres-1.0.0/config.h.in
appres-1.0.0/README
appres-1.0.0/COPYING
appres-1.0.0/Makefile.am
appres-1.0.0/AUTHORS
appres-1.0.0/compile
appres-1.0.0/depcomp
appres-1.0.0/config.sub
appres-1.0.0/ChangeLog
appres-1.0.0/install-sh
appres-1.0.0/appres.c
appres-1.0.0/INSTALL
appres-1.0.0/configure
appres-1.0.0/aclocal.m4
appres-1.0.0/appres.man
appres-1.0.0/mkinstalldirs
appres-1.0.0/Makefile.in
appres-1.0.0/configure.ac
aim-server[~/tmp]$ uname -a
Linux aim-server.crtdev.local 4.8.4-1-ARCH #1 SMP PREEMPT Sat Oct 22 18:26:57 CEST 2016 x86_64 GNU/Linux
aim-server[~/tmp]$ 

comment:22 follow-ups: ↓ 23 ↓ 24 Changed 11 months ago by vushakov

For the problem on Windows, please, can you try test builds:

  • 5.1 - r111846
  • 5.0 - r111848

comment:23 in reply to: ↑ 22 Changed 11 months ago by aim

It works now. Thanks!

Replying to vushakov:

For the problem on Windows, please, can you try test builds:

  • 5.1 - r111846
  • 5.0 - r111848

comment:24 in reply to: ↑ 22 Changed 11 months ago by magnetik

Replying to vushakov:

For the problem on Windows, please, can you try test builds:

  • 5.1 - r111846
  • 5.0 - r111848

Works for me too!

comment:25 Changed 11 months ago by DrChaos

I am also experiencing this problem. 5.1.6 works, 5.1.8 and 5.1.9-111374 fail with intermittent prematurely closed connections, especially during times when many connections are opened quickly, most prominently using Maven in a Java project when it wants to download a number of small files from a Nexus artifact server.

5.1.8 with bridged networking instead of NAT works when connected at the office (wired ethernet), but fails (as expected) when at home with a VPN to corporate servers.

Host: Windows 7. Guest: Scientific Linux 6.8 (CentOS-like).

comment:26 Changed 11 months ago by jbarnett

I was experiencing a problem when provisioning a VirtualBox VM using Vagrant (which actually just calls a Chocolatey install) with VirtualBox 5.1.8. I found this thread on the Chocolatey project where multiple people were reporting the same issue:  https://github.com/chocolatey/choco/issues/1029. I tested with 5.1.9 r111846 and it resolved the issue.

I hope to see this fix included with the next VirtualBox release.

comment:27 Changed 11 months ago by Richlv

I can confirm that 5.0.28 (macOS host, Linux guest) had network issues; "5.0.x revision 111848" works OK.

comment:28 Changed 10 months ago by SeanC

When can we expect a release containing this fix?

comment:29 follow-up: ↓ 30 Changed 10 months ago by M.Poil

This is a partial fix, when using a webserver without keepalive we still have the problem.

comment:30 in reply to: ↑ 29 Changed 10 months ago by vushakov

Replying to M.Poil:

This is a partial fix, when using a webserver without keepalive we still have the problem.

Please, can you provide host- and guest-side captures of a failing connection along with VBox.log (see comment:5)?

Changed 10 months ago by M.Poil

VBOX.log - SSL KO on website that use keepalive Off

comment:31 Changed 10 months ago by M.Poil

Note: in my VBOX.log, the HTTPS target is 89.185.36.50

Best regards,

comment:32 Changed 10 months ago by vushakov

Replying to M.Poil:

This is a partial fix, when using a webserver without keepalive we still have the problem.

Your log file says you are using r111724, not r111846 or later as requested in comment:22. Please, can you test with the correct test build version?

comment:33 Changed 10 months ago by Zirneklitis

VirtualBox 5.1.8 r111374 works as expected using Fedora 23 as host and Windows XP as a guest system.

The same virtual image in VirtualBox 5.1.8 r111374 has LAN problems using Windows 10 (v. 1607) as host and Windows XP as a guest system.

comment:34 Changed 10 months ago by Zirneklitis

In my case (host: Windows 10 v. 1607, guest: Windows XP), upgrading to version 5.1.9 r111896 (Qt5.6.2) solved the LAN issue.

comment:35 Changed 10 months ago by M.Poil

Oh sorry, it's working fine with r111846

Best regards

comment:36 follow-up: ↓ 37 Changed 10 months ago by vushakov

Please, if you can, give a try to test builds: 5.1 r111957+, or 5.0 r111959+ and report any regressions.

comment:37 in reply to: ↑ 36 Changed 10 months ago by mcsplain29

Replying to vushakov:

Please, if you can, give a try to test builds: 5.1 r111957+, or 5.0 r111959+ and report any regressions.

r111957 fixed the issue for me. Thanks

comment:38 Changed 10 months ago by frank

  • Status changed from new to closed
  • Resolution set to fixed

Fix is part of VBox 5.1.10.

comment:39 follow-up: ↓ 40 Changed 9 months ago by socrates

  • Status changed from closed to reopened
  • Resolution fixed deleted

The same bug occurred to me in Virtual Box Version 5.1.12 r112440, on Windows platform

Last edited 9 months ago by socrates

comment:40 in reply to: ↑ 39 Changed 8 months ago by vushakov

Replying to socrates:

The same bug occurred to me in Virtual Box Version 5.1.12 r112440, on Windows platform

Any more details? Like from extra logging from comment:5?
