VirtualBox

Opened 8 years ago

Closed 8 years ago

#15412 closed defect (fixed)

Virtualbox 5.0.20 Breaks SSH after VM was Saved and Re-Started if NIC is NAT => Fixed in SVN

Reported by: Hagen Kuehn
Owned by:
Component: network/NAT
Version: VirtualBox 5.0.20
Keywords: SSH savestate
Cc:
Guest type: Linux
Host type: Linux

Description

Affected: Virtualbox 5.0.20

Problem: It's not possible to SSH to the guest VM once the VM was saved and re-started from saved state, if the NIC is NAT.

This problem was initially observed while using Vagrant and was reported with https://github.com/mitchellh/vagrant/issues/7306 .

How to reproduce:

Create a vanilla Ubuntu 16.04 Server VM.

  1. Download Ubuntu 16.04 Server from http://www.ubuntu.com/download/server
  2. Install Ubuntu 16.04 Server as a new VM
    • Configure the VM to have one NIC as NAT
    • Only install the SSH server
  3. Start the VM
  4. Successfully establish an SSH connection
$ ssh -p 2222 ubuntu@localhost

  5. Save the state of the VM by using Virtualbox

$ VBoxManage controlvm Ubuntu-Server-16.04 savestate

  6. Start the VM from the saved state by using Virtualbox

$ VBoxManage startvm Ubuntu-Server-16.04

  7. Observe the failure: SSH to the VM now fails with the error: ssh: connect to host localhost port 2222: Connection refused

$ ssh -p 2222 ubuntu@localhost
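
The save, resume, and probe cycle above can be scripted for repeated runs. The following is a minimal sketch and not part of the original report: the function names save_and_resume and wait_for_ssh are hypothetical, the 5-second retry step and 120-second limit are arbitrary choices, and it assumes ssh-keyscan from OpenSSH is installed on the host.

```shell
# Save and resume the VM with the same two commands used in the steps above.
save_and_resume() {
    VBoxManage controlvm "$1" savestate &&
    VBoxManage startvm "$1"
}

# Succeed once an SSH server answers on the forwarded port.
# ssh-keyscan only prints a host key after a protocol exchange with the
# server, so no login is needed. The reported wait is approximate because
# the time spent inside ssh-keyscan itself is not counted.
wait_for_ssh() {
    port="${1:-2222}"
    limit="${2:-120}"
    waited=0
    while [ "$waited" -lt "$limit" ]; do
        if ssh-keyscan -T 5 -p "$port" localhost 2>/dev/null | grep -q .; then
            echo "SSH answered after roughly ${waited}s"
            return 0
        fi
        sleep 5
        waited=$((waited + 5))
    done
    echo "no SSH answer within ${limit}s" >&2
    return 1
}
```

A possible invocation is `save_and_resume Ubuntu-Server-16.04 && wait_for_ssh 2222 120`.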

Attachments (5)

VBox-Started-As-New.log (72.5 KB) - added by Hagen Kuehn 8 years ago.
Log when VM was started from new
VBox-Started-From-Saved-State.log (68.9 KB) - added by Hagen Kuehn 8 years ago.
Log when VM was started from saved state
port-forwarding-configuration.png (11.7 KB) - added by Hagen Kuehn 8 years ago.
Shows port forward configuration
GuestVMConsole-Pressing-Enter-Makes-SSH-Work.png (37.5 KB) - added by Hagen Kuehn 8 years ago.
SSH works after enter is pressed once on the UI
5.0.21-VBox.log (61.1 KB) - added by Hagen Kuehn 8 years ago.
VBox.log when tested with dev build Virtualbox 5.0.21

Change History (22)

comment:1 by Frank Mehnert, 8 years ago

Reading the summary it sounds like this was broken with VBox 5.0.20. Did this ever work? If so, which was the last version when it worked for you?

comment:2 by Valery Ushakov, 8 years ago

Please provide VBox.log files from the VM started anew and from the VM resumed from a saved state.

Does the guest use a static IP address?

Do you use a wildcard destination IP in your port-forwarding rule?

by Hagen Kuehn, 8 years ago

Attachment: VBox-Started-As-New.log added

Log when VM was started from new

by Hagen Kuehn, 8 years ago

Attachment: VBox-Started-From-Saved-State.log added

Log when VM was started from saved state

comment:3 by Hagen Kuehn, 8 years ago

@frank The workaround is to revert to Virtualbox 5.0.18, where it works. Please see https://github.com/mitchellh/vagrant/issues/7306 for more details.

@vushakov

  1. Attached as VBox-Started-As-New.log and VBox-Started-From-Saved-State.log
  2. Guest VM has no static IP address
  3. Below is a copy of the only port-forwarding rule in place.

Name    Protocol  Host IP    Host Port  Guest IP  Guest Port
Rule 1  TCP       127.0.0.1  2222       (empty)   22

by Hagen Kuehn, 8 years ago

Attachment: port-forwarding-configuration.png added

Shows port forward configuration

by Hagen Kuehn, 8 years ago

Attachment: GuestVMConsole-Pressing-Enter-Makes-SSH-Work.png added

SSH works after enter is pressed once on the UI

comment:4 by Hagen Kuehn, 8 years ago

  1. Since the forwarding rule outlined above is not easily readable, I have attached a screenshot.
  2. Also attached an OVA export of the sample VM. The username and password are "ubuntu".

Interesting finding: While collecting the logs for you I observed something interesting.

I started the VM with "Normal Start" (in foreground mode) and reproduced the error as usual. However, I then accidentally pressed "enter" in the VM guest console/UI and thought to try the SSH connection command again. Surprisingly, the SSH command worked!

Below is a suggested sequence of steps for reproducing this.

  1. Import the provided OVA
  2. Start the VM
  3. From the Virtualbox host run the command "ssh -p 2222 ubuntu@localhost" and enter the password "ubuntu". Observe that this works correctly!
  4. Save the state of the VM
  5. Start the VM from the saved state
  6. From the Virtualbox host run the command "ssh -p 2222 ubuntu@localhost". Observe that you encounter the error "ssh_exchange_identification: Connection closed by remote host".
  7. Press "enter" on the VM guest UI/ console once
  8. From the Virtualbox host run the command "ssh -p 2222 ubuntu@localhost" and enter the password "ubuntu". Observe that this works correctly!

comment:5 by Hagen Kuehn, 8 years ago

The OVA upload failed, therefore please use the below Dropbox link to download the OVA. https://www.dropbox.com/s/mx4jgw4xxxd7h58/Ubuntu-Server-16.04.ova?dl=0

comment:6 by Valery Ushakov, 8 years ago

In 5.0.20 the handling of port-forwarding was changed to fix several long-standing bugs (e.g. #13570). That change affects port-forwarding rules with a wildcard guest address. NAT needs some sign of life from the guest, such as DHCP or a gratuitous ARP, to guess its address. For example, in the cold boot log you can see:

NAT: Guest address guess set to 10.0.2.15 by DHCP ACK

Pressing <Enter> was probably just a coincidence. If you check the log of the resumed VM (not the one you attached, but one taken after ssh works), you will see a similar log line. Since NAT flaps the Ethernet link after resume, a DHCP guest will reacquire its DHCP lease, and that tells NAT the IP address. (Note that we can't use just any packet to infer the guest's IP, since the guest may be a router for other VMs, and so on.)
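
To see which event set the guess in a given log, the relevant lines can be pulled out of VBox.log. This is a small sketch, assuming every such line has the form quoted above ("NAT: Guest address guess set to <address> by <source>"); the function name nat_guess_history is made up for illustration.

```shell
# Print "<address> <source>" for each guest-address guess in a VBox.log,
# in the order the lines appear in the log.
# Assumes the line format: NAT: Guest address guess set to <address> by <source>
nat_guess_history() {
    sed -n 's/.*NAT: Guest address guess set to \([0-9.]*\) by \(.*\)$/\1 \2/p' "$1"
}
```

Running `nat_guess_history VBox.log` against the log of a resumed VM that prints nothing would suggest that NAT has not yet seen a packet it can use to learn the guest address.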

comment:7 by Hagen Kuehn, 8 years ago

@vushakov Concerning the <Enter>, perhaps it was a coincidence, though I have tested it a couple of times with the attached VM. However, when testing this with my self-built Vagrant boxes, I do not observe that <Enter> causes the VM to be discoverable.

If I understand you correctly, with Virtualbox 5.0.20 you are clearing the NAT cache and thus Virtualbox needs to receive some sign of life in the shape of a gratuitous ARP or similar request to infer the IP address.

I have now tested Virtualbox 5.0.20 with Ubuntu 14.04 VMs and it works, though it takes almost a minute before SSH becomes available. As far as I know, the Ubuntu 14.04 VM I tested with is vanilla too. This could be a reason why not many people have reported it yet.

Based on your feedback I could establish a "workaround" for the Ubuntu 16.04 VM by configuring a script that fires an ARP request (i.e. arping -c 1 -A eth0 10.0.2.15) every 5 seconds. With this it's possible to establish an SSH connection to the VM after about 40 seconds. Since those VMs on Virtualbox are only for development and CI purposes I don't mind it too much, but I nonetheless find this approach far from desirable.
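
A script of the kind described here might look like the following. It is a sketch under stated assumptions, not the script actually used: it assumes the iputils version of arping, which takes the interface through -I (the command quoted above omits that flag), it must run inside the guest with enough privileges to send raw ARP packets, and the function name announce_guest_address and the default of 12 rounds are invented for illustration.

```shell
# Send one gratuitous ARP per round so that NAT can learn the guest address.
# Arguments (all optional): interface, address, number of rounds, pause in seconds.
# Assumes iputils arping: -A sends ARP replies, -I selects the interface.
announce_guest_address() {
    iface="${1:-eth0}"
    addr="${2:-10.0.2.15}"
    rounds="${3:-12}"
    interval="${4:-5}"
    i=0
    while [ "$i" -lt "$rounds" ]; do
        arping -c 1 -A -I "$iface" "$addr" >/dev/null 2>&1
        i=$((i + 1))
        sleep "$interval"
    done
}
```

With the defaults, `announce_guest_address` announces 10.0.2.15 on eth0 twelve times, 5 seconds apart, which covers the roughly 40 seconds mentioned above.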

Is there a different approach that lets you fix your long-standing issues without breaking it for Ubuntu 16.04?

Furthermore, as far as I know, correct me if I am wrong, the "Saved State" and the subsequent start of the VM is completely transparent to the VM itself? If this is not the case, I wonder if it would be possible to capture that and run the ARP request only once after the VM was started from the "Saved State"?

comment:8 by Valery Ushakov, 8 years ago

The easiest workaround is to not use wildcard (empty or 0.0.0.0) guest address in the forwarding rules.
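
On the command line, this workaround could be applied roughly as follows. The sketch assumes the VM is powered off (modifyvm changes stored settings), that the rule lives on adapter 1, and that it is named "Rule 1" with host address 127.0.0.1 and ports 2222 to 22 as in the rule shown earlier in this ticket; pin_ssh_forward is a hypothetical helper name and 10.0.2.15 is only the usual default DHCP address. The rule string follows the --natpf1 format name,protocol,host IP,host port,guest IP,guest port.

```shell
# Replace a forwarding rule that has an empty (wildcard) guest address with
# one that names the guest address explicitly.
# Arguments: VM name, rule name (default "Rule 1"), guest IP (default 10.0.2.15).
pin_ssh_forward() {
    vm="$1"
    rule="${2:-Rule 1}"
    guest_ip="${3:-10.0.2.15}"
    VBoxManage modifyvm "$vm" --natpf1 delete "$rule" &&
    VBoxManage modifyvm "$vm" --natpf1 "$rule,tcp,127.0.0.1,2222,$guest_ip,22"
}
```

For example, `pin_ssh_forward Ubuntu-Server-16.04` rewrites the rule with the default guest address. This only helps when the guest address is known in advance.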

As I said earlier, after restoring a VM, we disconnect its ethernet cable for 5 seconds. Normally, DHCP clients reacquire their DHCP lease after that.

comment:9 by Hagen Kuehn, 8 years ago

Yes it works when the guest IP address is specified in the Forwarding Rule. However when using DHCP I cannot reliably know the IP address prior to starting the VM. Therefore I don't think it's an option.

As you said, you disconnect the Ethernet cable for 5 seconds after restoring the VM, which should trigger DHClient to request a new DHCP lease. In light of this, I am starting to wonder whether Ubuntu's DHClient has changed too, and whether that is why this problem is observed.

Judging by this Ubuntu DHClient bug, I have the feeling some changes were made there too. https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1551351

comment:10 by Hagen Kuehn, 8 years ago

I have now filed an Ubuntu bug concerning the isc-dhclient-4.3.3 package.

Bug: https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1582163

comment:11 by Valery Ushakov, 8 years ago

I guess the link flap might not be long enough as seen in the guest's time.

I think the path of least resistance here is to initialize the guess to the default (.15) guest address, since that's what DHCP will hand out by default anyway. If that guess is right (which it is most of the time), port-forwarding will work right away. If it's wrong, we are not worse off, since it wouldn't have worked anyway and will work as soon as there's a good packet to fix the guess.

comment:12 by Hagen Kuehn, 8 years ago

Yes, your suggestion concerning the default guess sounds good to me. It would decrease the chance of encountering a connection issue even for many other potential VM guest operating systems too.

I have the feeling a longer cable disconnect will not make it work either, at least in conjunction with Ubuntu 16.04. I ran a couple of manual tests with an Ubuntu 16.04 guest OS and had the virtual cable disconnected for about a minute or two, but the subsequent reconnect of the cable did not trigger Ubuntu 16.04 to request a new DHCP lease. In light of this I now believe that it is an Ubuntu 16.04 DHClient bug.

From what I have established so far, this problem encountered here is a result of two independent changes made in Virtualbox and Ubuntu (i.e. Virtualbox 5.0.20 and Ubuntu 16.04 DHClient).

  1. Virtualbox 5.0.20 should function in conjunction with an Ubuntu 16.04 guest OS if Ubuntu 16.04 fired a DHCP lease request after the cable was disconnected and reconnected. This will hopefully be addressed with https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1582163
  2. Virtualbox 5.0.20 could be enhanced by performing an initial guest IP address guess (.15) as suggested by @vushakov
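
The manual cable test described in this comment can be repeated from the host command line. This is a sketch, assuming the NAT adapter is adapter 1 and the VM is running; flap_link is a hypothetical helper name, and the 60-second default merely mirrors the "about a minute" mentioned above.

```shell
# Unplug the virtual cable of adapter 1, wait, and plug it back in.
# Arguments: VM name, seconds to keep the cable unplugged (default 60).
flap_link() {
    vm="$1"
    seconds="${2:-60}"
    VBoxManage controlvm "$vm" setlinkstate1 off || return 1
    sleep "$seconds"
    VBoxManage controlvm "$vm" setlinkstate1 on
}
```

After `flap_link Ubuntu-Server-16.04 90`, a guest that renews its lease on link-up should produce a fresh "NAT: Guest address guess set to" line in VBox.log.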

comment:13 by Valery Ushakov, 8 years ago

Can you give a 5.0.x test build a try?

comment:14 by Frank Mehnert, 8 years ago

If you prefer you could also install this Ubuntu 16.04 test build.

comment:15 by Hagen Kuehn, 8 years ago

I have installed the Ubuntu 16.04 test build and tested in conjunction with Vagrant.

It worked like a charm!

Log says: "NAT: Guest address guess set to 10.0.2.15 by initialization"

I have also attached the "VBox.log" as "5.0.21-VBox.log" to this ticket.

Excellent work!

Last edited 8 years ago by Hagen Kuehn

by Hagen Kuehn, 8 years ago

Attachment: 5.0.21-VBox.log added

VBox.log when tested with dev build Virtualbox 5.0.21

comment:16 by Valery Ushakov, 8 years ago

Summary: Virtualbox 5.0.20 Breaks SSH after VM was Saved and Re-Started if NIC is NAT → Virtualbox 5.0.20 Breaks SSH after VM was Saved and Re-Started if NIC is NAT => Fixed in SVN

comment:17 by Frank Mehnert, 8 years ago

Resolution: fixed
Status: new → closed

Fix is part of VBox 5.0.22.
