VirtualBox

Ticket #15412 (closed defect: fixed)

Opened 19 months ago

Last modified 17 months ago

Virtualbox 5.0.20 Breaks SSH after VM was Saved and Re-Started if NIC is NAT => Fixed in SVN

Reported by: quater
Owned by:
Priority: major
Component: network/NAT
Version: VirtualBox 5.0.20
Keywords: SSH savestate
Cc:
Guest type: Linux
Host type: Linux

Description

Affected: Virtualbox 5.0.20

Problem: It's not possible to SSH to the guest VM once the VM was saved and re-started from saved state, if the NIC is NAT.

This problem was initially observed while using Vagrant and was reported in  https://github.com/mitchellh/vagrant/issues/7306 .

How to reproduce:

Create a vanilla Ubuntu 16.04 Server VM.

  1. Download Ubuntu 16.04 Server from  http://www.ubuntu.com/download/server
  2. Install Ubuntu 16.04 Server as a new VM
    • Configure the VM to have one NIC attached to NAT
    • Only install the SSH server
  3. Start the VM
  4. Successfully establish an SSH connection

$ ssh -p 2222 ubuntu@localhost

  5. Save the state of the VM using Virtualbox

$ VBoxManage controlvm Ubuntu-Server-16.04 savestate

  6. Start the VM from the saved state using Virtualbox

$ VBoxManage startvm Ubuntu-Server-16.04

  7. Observe the failure: SSH to the VM fails with the error "ssh: connect to host localhost port 2222: Connection refused"

$ ssh -p 2222 ubuntu@localhost

Attachments

VBox-Started-As-New.log (72.5 KB) - added by quater 18 months ago.
Log when VM was started from new
VBox-Started-From-Saved-State.log (68.9 KB) - added by quater 18 months ago.
Log when VM was started from saved state
port-forwarding-configuration.png (11.7 KB) - added by quater 18 months ago.
Shows port forward configuration
GuestVMConsole-Pressing-Enter-Makes-SSH-Work.png (37.5 KB) - added by quater 18 months ago.
SSH works after enter is pressed once on the UI
5.0.21-VBox.log (61.1 KB) - added by quater 18 months ago.
VBox.log when tested with dev build Virtualbox 5.0.21

Change History

comment:1 Changed 19 months ago by frank

Reading the summary it sounds like this was broken with VBox 5.0.20. Did this ever work? If so, which was the last version when it worked for you?

comment:2 Changed 19 months ago by vushakov

Please provide VBox.log files from the VM started anew and from the VM resumed from a saved state.

Does the guest use a static IP address?

Do you use a wildcard destination IP in your port-forwarding rule?

Changed 18 months ago by quater

Log when VM was started from new

Changed 18 months ago by quater

Log when VM was started from saved state

comment:3 Changed 18 months ago by quater

@frank The workaround is to revert back to Virtualbox 5.0.18 where it works. Please see  https://github.com/mitchellh/vagrant/issues/7306 for more details.

@vushakov

  1. Attached as VBox-Started-As-New.log and VBox-Started-From-Saved-State.log
  2. Guest VM has no static IP address
  3. Below is a copy of the only port-forwarding rule in place.

Name    Protocol  Host IP    Host Port  Guest IP  Guest Port
Rule 1  TCP       127.0.0.1  2222       (empty)   22

Changed 18 months ago by quater

Shows port forward configuration

Changed 18 months ago by quater

SSH works after enter is pressed once on the UI

comment:4 Changed 18 months ago by quater

  1. Since the forwarding rule outlined above is not easily readable, I have attached a screen shot.
  2. Also attached an OVA export of the sample VM. The username and password are "ubuntu".

Interesting finding: While collecting the logs for you I observed something interesting.

I started the VM with "Normal Start" (in foreground mode) and reproduced the error as usual. However, I then accidentally pressed "enter" in the VM guest console/UI and thought to try the SSH connection command again. Surprisingly, the SSH command worked!

Below is a suggested sequence of steps for reproducing it.

  1. Import the provided OVA
  2. Start the VM
  3. From the Virtualbox host run the command "ssh -p 2222 ubuntu@localhost" and enter the password "ubuntu". Observe that this works correctly!
  4. Save the state of the VM
  5. Start the VM from the saved state
  6. From the Virtualbox host run the command "ssh -p 2222 ubuntu@localhost". Observe that you encounter the error "ssh_exchange_identification: Connection closed by remote host".
  7. Press "enter" on the VM guest UI/ console once
  8. From the Virtualbox host run the command "ssh -p 2222 ubuntu@localhost" and enter the password "ubuntu". Observe that this works correctly!
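
For anyone who wants to time steps 4 to 8 instead of retrying by hand, here is a minimal sketch. It assumes bash (for /dev/tcp), the VM name used in the description, and the 127.0.0.1:2222 forwarding rule; the function names are made up for illustration. It checks for the SSH banner rather than just a TCP connect, since in step 6 the connection is accepted and then closed.

```shell
#!/usr/bin/env bash
# Sketch only: needs bash (/dev/tcp) and a VM named as in the description.

# Succeed only if host:port answers with an SSH banner ("SSH-2.0-...").
# Runs in a subshell so the temporary file descriptor never leaks.
ssh_banner_ok() (
    exec 3<>"/dev/tcp/$1/$2" || exit 1
    IFS= read -r -t 2 banner <&3 || exit 1
    case $banner in SSH-*) exit 0 ;; *) exit 1 ;; esac
) 2>/dev/null

# Run a command up to <tries> times, sleeping <delay> seconds between attempts.
retry_until() {
    local tries=$1 delay=$2; shift 2
    local n=1
    while [ "$n" -le "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "succeeded on attempt $n"
            return 0
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    echo "gave up after $tries attempts"
    return 1
}

# Save the VM, resume it, then poll the forwarded port once per second.
repro() {
    local vm=${1:-Ubuntu-Server-16.04}
    VBoxManage controlvm "$vm" savestate &&
    VBoxManage startvm "$vm" &&
    retry_until 120 1 ssh_banner_ok localhost 2222
}
```

Calling repro with no arguments uses the VM name from the description; the attempt count it prints is roughly the number of seconds until SSH became reachable.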

comment:5 Changed 18 months ago by quater

The OVA upload failed, therefore please use the below Dropbox link to download the OVA.  https://www.dropbox.com/s/mx4jgw4xxxd7h58/Ubuntu-Server-16.04.ova?dl=0

comment:6 Changed 18 months ago by vushakov

In 5.0.20 the handling of port-forwarding was changed to fix several long-standing bugs (e.g. #13570). That change affects port-forwarding rules with a wildcard guest address. NAT needs some sign of life from the guest, such as DHCP or a gratuitous ARP, to guess its address. E.g. in the cold boot log you can see

NAT: Guest address guess set to 10.0.2.15 by DHCP ACK

Pressing <Enter> was probably just a coincidence. If you check the log of the resumed VM (not the one you attached, but one taken after ssh works), you will see a similar log line. Since NAT flaps the ethernet link after resume, a DHCP guest will reacquire its DHCP lease, and that will tell NAT the IP address. (Note that we can't use just any packet to infer the guest's IP, since the guest may be a router for other VMs, etc.)
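
To check a VBox.log for these lines, something like the sketch below may help. It assumes the message format is exactly the one quoted above, possibly preceded by a timestamp; the function name is illustrative.

```shell
#!/usr/bin/env bash
# Print "<address> <source>" for every NAT guest-address guess in a VBox.log.
# The pattern is based only on the log line quoted in this ticket.
nat_guess_lines() {
    # $1: path to a VBox.log file
    sed -n 's/.*NAT: Guest address guess set to \([0-9.]*\) by \(.*\)$/\1 \2/p' "$1"
}
```

No output for the log of a resumed VM would mean NAT has not yet learned the guest address in that run.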

comment:7 Changed 18 months ago by quater

@vushakov Concerning the <Enter>, perhaps it was a coincidence, though I have tested a couple of times with the attached VM. However, when testing this with my self-built Vagrant boxes, I do not observe that <Enter> causes the VM to become reachable.

If I understand you correctly, with Virtualbox 5.0.20 you are clearing the NAT cache and thus Virtualbox needs to receive some sign of life in the shape of a gratuitous ARP or similar request to infer the IP address.

I have now tested Virtualbox 5.0.20 with Ubuntu 14.04 VMs and it works, though it takes almost a minute before SSH becomes available. As far as I know, the Ubuntu 14.04 VM I tested with is vanilla too. This could be the reason why not many people have reported it yet.

Based on your feedback I could establish a "workaround" for the Ubuntu 16.04 VM by configuring a script that fires an ARP request (i.e. arping -c 1 -A eth0 10.0.2.15) every 5 seconds. With this it's possible to establish an SSH connection to the VM after about 40 seconds. Since those VMs on Virtualbox are only for development and CI purposes I don't mind it too much, but I nonetheless find this approach far from desirable.
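
A minimal sketch of such a script, to be run as root inside the guest. It assumes the iputils version of arping, which takes the interface through -I (the command quoted above omits that flag, so adjust for your arping variant). The function name and the ARPING override variable are only for illustration.

```shell
#!/usr/bin/env bash
# Announce <addr> on <iface> with a gratuitous ARP every <interval> seconds.
# count=0 (the default) means "forever"; a positive count stops after that
# many announcements. Set ARPING to use a different arping binary.
announce() {
    local iface=$1 addr=$2 interval=${3:-5} count=${4:-0}
    local sent=0
    while [ "$count" -eq 0 ] || [ "$sent" -lt "$count" ]; do
        # -A: gratuitous ARP reply, -c 1: a single packet; failures are ignored
        "${ARPING:-arping}" -c 1 -A -I "$iface" "$addr" || true
        sent=$((sent + 1))
        sleep "$interval"
    done
}
```

With the values from this comment the call would be "announce eth0 10.0.2.15 5".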

Is there a different approach that lets you fix your long-standing issues without breaking things for Ubuntu 16.04?

Furthermore, as far as I know, correct me if I am wrong, the "Saved State" and the subsequent start of the VM is completely transparent to the VM itself? If this is not the case, I wonder if it would be possible to capture that and run the ARP request only once after the VM was started from the "Saved State"?

comment:8 Changed 18 months ago by vushakov

The easiest workaround is to not use wildcard (empty or 0.0.0.0) guest address in the forwarding rules.
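
From the command line this could look roughly like the sketch below. It assumes the rule lives on adapter 1 (hence --natpf1), is named "Rule 1" as in the reporter's table, and that the guest gets 10.0.2.15. The helper names and the VBOXMANAGE override variable are illustrative and not part of VirtualBox.

```shell
#!/usr/bin/env bash
# Build a NAT rule string: name,protocol,hostip,hostport,guestip,guestport
natpf_rule() {
    printf '%s,%s,%s,%s,%s,%s' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Replace the rule of the same name on adapter 1 with <rule>.
# Assumption: the VM is powered off, since modifyvm changes stored settings.
pin_guest_address() {
    local vm=$1 rule=$2
    "${VBOXMANAGE:-VBoxManage}" modifyvm "$vm" --natpf1 delete "${rule%%,*}"
    "${VBOXMANAGE:-VBoxManage}" modifyvm "$vm" --natpf1 "$rule"
}
```

For the VM in this ticket: pin_guest_address Ubuntu-Server-16.04 "$(natpf_rule 'Rule 1' tcp 127.0.0.1 2222 10.0.2.15 22)".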

As I said earlier, after restoring a VM, we disconnect its ethernet cable for 5 seconds. Normally, DHCP clients reacquire their DHCP lease after that.

comment:9 Changed 18 months ago by quater

Yes it works when the guest IP address is specified in the Forwarding Rule. However when using DHCP I cannot reliably know the IP address prior to starting the VM. Therefore I don't think it's an option.

You said that you disconnect the Ethernet cable for 5 seconds after restoring the VM, which should trigger DHClient to acquire a new DHCP lease. In light of this I am starting to wonder whether Ubuntu's DHClient has changed too, and whether that is why this problem is observed.

Judging by this Ubuntu DHClient bug, I have the feeling some changes were made there too.  https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1551351

comment:10 Changed 18 months ago by quater

I have now filed an Ubuntu bug concerning the isc-dhclient-4.3.3 package.

Bug:  https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1582163

comment:11 Changed 18 months ago by vushakov

I guess the link flap might be not long enough as seen in guest's time.

I think the path of least resistance here is to initialize the guess to the default (.15) guest address, since that's what DHCP will hand out by default anyway. If that guess is right (which it is most of the time), port-forwarding will work right away. If it's wrong, we are not worse off, since it wouldn't have worked anyway and will work as soon as there's a good packet to fix the guess.
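
The reasoning can be illustrated with a toy model. This is not VirtualBox code, and the function name and event labels are invented for the sketch. The guess starts from an initial value (empty for the 5.0.20 behaviour, 10.0.2.15 for the proposal) and is overwritten only by packets NAT trusts.

```shell
#!/usr/bin/env bash
# Toy model of the guest-address guess, NOT taken from VirtualBox sources.
# usage: nat_guess <initial-guess> [event...]
#   "dhcp-ack:ADDR" and "garp:ADDR" update the guess; other events are ignored,
#   mirroring the point that not just any packet can be used.
nat_guess() {
    local guess=$1; shift
    local ev
    for ev in "$@"; do
        case $ev in
            dhcp-ack:*|garp:*) guess=${ev#*:} ;;
        esac
    done
    printf '%s\n' "${guess:-unknown}"
}
```

With an empty initial guess and no trusted packet the result is "unknown", which is the state in which forwarding to a wildcard guest address cannot work; with the default as the initial guess the result is usable immediately and is still corrected by a later DHCP ACK.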

comment:12 Changed 18 months ago by quater

Yes, your suggestion concerning the default guess sounds good to me. It would decrease the chance of encountering a connection issue even for many other potential VM guest operating systems too.

I have the feeling a longer cable disconnect will not make it work either, at least in conjunction with Ubuntu 16.04. I ran a couple of manual tests with an Ubuntu 16.04 guest OS and had the virtual cable disconnected for about a minute or two, but the subsequent reconnect of the cable did not trigger Ubuntu 16.04 to request a new DHCP lease. In light of this I now believe that it is an Ubuntu 16.04 DHClient bug.

From what I have established so far, this problem encountered here is a result of two independent changes made in Virtualbox and Ubuntu (i.e. Virtualbox 5.0.20 and Ubuntu 16.04 DHClient).

  1. Virtualbox 5.0.20 would function in conjunction with an Ubuntu 16.04 guest OS if Ubuntu 16.04 fired a DHCP lease request after the cable was disconnected and reconnected. This will hopefully be addressed with  https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1582163
  2. Virtualbox 5.0.20 could be enhanced by performing an initial guest IP address guess (.15) as suggested by @vushakov

comment:13 Changed 18 months ago by vushakov

Can you give a 5.0.x test build a try?

comment:14 Changed 18 months ago by frank

If you prefer you could also install this Ubuntu 16.04 test build.

comment:15 Changed 18 months ago by quater

I have installed the Ubuntu 16.04 test build and tested in conjunction with Vagrant.

It worked like a charm!

Log says: "NAT: Guest address guess set to 10.0.2.15 by initialization"

I have also attached the "VBox.log" as "5.0.21-VBox.log" to this ticket.

Excellent work!


Changed 18 months ago by quater

VBox.log when tested with dev build Virtualbox 5.0.21

comment:16 Changed 18 months ago by vushakov

  • Summary changed from Virtualbox 5.0.20 Breaks SSH after VM was Saved and Re-Started if NIC is NAT to Virtualbox 5.0.20 Breaks SSH after VM was Saved and Re-Started if NIC is NAT => Fixed in SVN

comment:17 Changed 17 months ago by frank

  • Status changed from new to closed
  • Resolution set to fixed

Fix is part of VBox 5.0.22.
