VirtualBox

Opened 15 years ago

Closed 8 years ago

#3619 closed defect (obsolete)

PXE problems with multiple DHCP servers

Reported by: Tobias Evert
Owned by:
Component: network
Version: VirtualBox 2.1.4
Keywords: PXE
Cc:
Guest type: Linux
Host type: Linux

Description (last modified by Valery Ushakov)

We've recently been having problems PXE booting VMs with VirtualBox 2.1.4. We're simulating a cluster by running many VMs connected through Internal Networks. The network has two servers, both of which provide DHCP and PXE. When a VM boots through PXE, the boot fails about 50% of the time.

The network layout:
Server 1 IP: 192.168.236.1
Server 2 IP: 192.168.236.2
Client N IP: 192.168.236.(N + 2)
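
To make the addressing scheme explicit, here is a minimal Python sketch of the client numbering. The subnet base `192.168.236.0` and the helper name `client_ip` are assumptions inferred from the addresses listed above, not part of the original setup.

```python
import ipaddress

# Assumption: the subnet base is inferred from the server addresses above.
SUBNET_BASE = ipaddress.IPv4Address("192.168.236.0")


def client_ip(n: int) -> ipaddress.IPv4Address:
    """Return the address of Client n, whose last octet is n + 2."""
    # Hosts .1 and .2 are the servers; .255 is excluded as the usual broadcast.
    if not 1 <= n <= 252:
        raise ValueError("client number must be between 1 and 252")
    return SUBNET_BASE + (n + 2)


if __name__ == "__main__":
    print(client_ip(1))
```

With this numbering, Client 1 gets 192.168.236.3, which matches the `Me:` address in the PXE output.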

The PXE output when the boot fails for Client 1:


Searching for server (DHCP)......
Me: 192.168.236.3, DHCP: 192.168.236.2
Loading 192.168.236.2:nodes/3/boot/boot.0 ...(PXE).......done

PXELINUX 3.11 0x4639e5ce Copyright (C) 1994-2005 H. Peter Anvin
UNDI data segment at: 0009E000
UNDI data segment size: 1000
UNDI code segment at: 0009F000
UNDI code segment size: 0B1D
PXE entry point found (we hope) at 9F00:0680
My IP address seems to be C0A8EC03 192.168.236.3
ip=192.168.236.3:192.168.236.1:0.0.0.0:255:255:255:0
TFTP prefix: nodes/3/boot/
Trying to load: pxelinux.conf
192.168.236.1 is not in my arp table!
192.168.236.1 is not in my arp table!
192.168.236.1 is not in my arp table!
192.168.236.1 is not in my arp table!
<And so on>


From what I gather the problem seems to start on the ip-line:
"ip=192.168.236.3:192.168.236.1:0.0.0.0:255:255:255:0"
It shouldn't even know about the 192.168.236.1 server, since it got its address from the other server. On the occasions when the boot actually works, the ip-line looks like:
"ip=192.168.236.3:192.168.236.2:0.0.0.0:255:255:255:0"

Packet analysis during the boot-up stage shows that Client 1 sends out a DHCP request (twice) and then receives replies from both servers (Server 2 first). It selects the address from the first reply (it actually gets the same address from both servers), advertises its choice, and gets ACKs back from both servers.
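
The mismatch described above, between the `DHCP:` address on the `Me:` line and the server field of the `ip=` line, can be checked mechanically. The following is a minimal Python sketch. It assumes the `ip=` line has the field order client:server:gateway:netmask, and it keeps the netmask field verbatim because the output above prints it with colons. All function names are hypothetical helpers.

```python
import ipaddress
import re


def parse_ip_line(line: str) -> dict:
    """Split 'ip=<client>:<server>:<gateway>:<netmask>' into named fields."""
    value = line.strip().strip('"')
    if not value.startswith("ip="):
        raise ValueError("not an ip= line")
    # Split at most three times so a colon-separated netmask stays intact.
    client, server, gateway, netmask = value[3:].split(":", 3)
    return {"client": client, "server": server,
            "gateway": gateway, "netmask": netmask}


def parse_dhcp_line(line: str) -> tuple:
    """Extract (client, dhcp_server) from the 'Me: ..., DHCP: ...' line."""
    match = re.search(r"Me: ([\d.]+), DHCP: ([\d.]+)", line)
    if match is None:
        raise ValueError("not a Me:/DHCP: line")
    return match.group(1), match.group(2)


def hex_to_ip(hex_word: str) -> str:
    """Convert the hex form printed by PXELINUX to dotted-quad notation."""
    return str(ipaddress.IPv4Address(int(hex_word, 16)))


def server_mismatch(dhcp_line: str, ip_line: str) -> bool:
    """True if the ip= line names a different server than the DHCP line."""
    _, dhcp_server = parse_dhcp_line(dhcp_line)
    return parse_ip_line(ip_line)["server"] != dhcp_server


if __name__ == "__main__":
    dhcp = "Me: 192.168.236.3, DHCP: 192.168.236.2"
    failed = "ip=192.168.236.3:192.168.236.1:0.0.0.0:255:255:255:0"
    print(server_mismatch(dhcp, failed))
    print(hex_to_ip("C0A8EC03"))
```

For the failed boot, the check reports a mismatch (DHCP server 192.168.236.2, `ip=` server 192.168.236.1). For the working boot, both name 192.168.236.2.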

Maybe the problem for Etherboot is that the replies carry exactly the same IP address and the same Transaction ID. I'm guessing there has to be something special about this setup, since multi-DHCP-server environments aren't exactly uncommon.
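
The condition suspected above, several servers answering the same transaction with the same address, can be spotted in a list of decoded replies. This is a minimal Python sketch and says nothing about what Etherboot actually does. The `DhcpReply` record, the function name, and the sample transaction ID are hypothetical. The sample values are not taken from the attached dumps.

```python
from collections import defaultdict
from typing import NamedTuple


class DhcpReply(NamedTuple):
    """One decoded DHCP reply (hypothetical record layout)."""
    xid: int          # transaction ID
    server: str       # address of the server that sent the reply
    offered_ip: str   # address handed to the client
    msg_type: str     # "OFFER" or "ACK"


def ambiguous_transactions(replies):
    """Map (xid, offered_ip) to the servers that answered it, if more than one."""
    servers = defaultdict(set)
    for reply in replies:
        servers[(reply.xid, reply.offered_ip)].add(reply.server)
    return {key: sorted(found) for key, found in servers.items()
            if len(found) > 1}


if __name__ == "__main__":
    # Made-up sample: both servers answer the same transaction identically.
    sample = [
        DhcpReply(0x1234ABCD, "192.168.236.2", "192.168.236.3", "OFFER"),
        DhcpReply(0x1234ABCD, "192.168.236.1", "192.168.236.3", "OFFER"),
        DhcpReply(0x1234ABCD, "192.168.236.2", "192.168.236.3", "ACK"),
        DhcpReply(0x1234ABCD, "192.168.236.1", "192.168.236.3", "ACK"),
    ]
    print(ambiguous_transactions(sample))
```

A non-empty result only means the client cannot tell the servers apart by transaction ID or offered address. Whether that is what triggers the failure remains a guess.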

I see that the version of Etherboot in VirtualBox 2.1.4 isn't the newest. Maybe there is a fix for this in a newer version. Are there plans to move to gPXE? I know that it's probably non-trivial and that you have some local patches against Etherboot in your source, but apart from maintenance, development on Etherboot has stopped, so the move to gPXE will have to happen at some point.

Attached are two dumps in tcpdump format, one from a boot that worked and one from a boot that failed.

Attachments (3)

failed.tcpdump (18.4 KB ) - added by Tobias Evert 15 years ago.
A failed boot
working.tcpdump (18.6 KB ) - added by Tobias Evert 15 years ago.
A working boot
Virtual LaboPC.png (11.1 KB ) - added by jonsy 11 years ago.
Capture of failed pxe boot with multiple DHCP servers


Change History (7)

by Tobias Evert, 15 years ago

Attachment: failed.tcpdump added

A failed boot

by Tobias Evert, 15 years ago

Attachment: working.tcpdump added

A working boot

comment:1 by Frank Mehnert, 15 years ago

Priority: blocker → major
Resolution: → fixed
Status: new → closed

Please reopen if this bug is still relevant. There were a lot of network-related changes in the meantime. Make sure to test the latest version, VBox 3.0.8.

comment:2 by jonsy, 11 years ago

Resolution: fixed (removed)
Status: closed → reopened

The same problem has reappeared in VirtualBox 4.1.12_Ubuntu r77245 on Ubuntu 12.04 amd64 (VM bridged to an interface attached to a network with multiple DHCP servers).

When the DHCP servers are configured so that only one of them responds to the VM client, everything works. When the multi-DHCP-server environment is restored, the VM client fails randomly with the same message: "xxxx is not in my arp table".

Linux jonsy 3.2.0-41-generic #66-Ubuntu SMP Thu Apr 25 03:27:11 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

comment:3 by jonsy, 11 years ago

Same problem with the latest version from the VirtualBox download page:

virtualbox-4.2_4.2.12-84980~Ubuntu~precise_amd64.deb

Screen capture attached.

by jonsy, 11 years ago

Attachment: Virtual LaboPC.png added

Capture of failed pxe boot with multiple DHCP servers

comment:4 by Valery Ushakov, 8 years ago

Description: modified (diff)
Resolution: → obsolete
Status: reopened → closed

VirtualBox uses iPXE now. I've verified that the scenario with two DHCP servers (with the same static assignments for the client) works as expected.
