VirtualBox

Ticket #7385 (reopened defect)

Opened 4 years ago

Last modified 6 months ago

NAT PXE TFTP download fails when using external tftp server -> fixed in svn

Reported by: luizluca
Owned by:
Priority: major
Component: network/NAT
Version: VirtualBox 3.2.8
Keywords:
Cc:
Guest type: other
Host type: Linux

Description (last modified by Hachiman)

Hello,

I defined an external TFTP server in my configuration:

VBoxManage modifyvm "Diskless NAT" --nattftpfile1 /pxelinux.0

VBoxManage modifyvm "Diskless NAT" --nattftpserver1 10.9.1.31

When it boots using PXE, it fails to ACK after the second TFTP packet is received. I tested with "3.2.6 OSE", compiled from source, and it passed this step.

With "3.2.6 OSE", I have another bug: pxelinux is unable to download its configuration and/or detect that a file is missing. Maybe it also fails to receive packets, but at a later step.

I'll attach the wireshark traffic for the first case.

Attachments

pxeloaderr.pcap (10.9 KB) - added by luizluca 4 years ago.
Traffic when PXE tries to download pxelinux.0 from the external server.
vbox-tftp-ok.pcap (17.9 KB) - added by luizluca 4 years ago.
Traffic when a Linux client inside the same VM downloads pxelinux.0 from the external server.
vbox-ose-pxe-problem.pcap (30.5 KB) - added by luizluca 4 years ago.
Full network traffic. No TFTP server message, error or not, arrives at the PXE client.
Diskless NAT-2010-10-27-19-57-54.log (51.7 KB) - added by luizluca 3 years ago.
VM log
vbox41-pxelinux.0_ok-configs-fail.pcap (40.9 KB) - added by luizluca 19 months ago.
Network traffic showing the problem. pxelinux.0 downloads and runs, but the following TFTP requests fail.
proxy-only.pcap.gz (283.7 KB) - added by luizluca 19 months ago.
The traffic when using proxy-only.
default.pcap (35.2 KB) - added by luizluca 19 months ago.
NAT with the default option, captured using nictracefile.
proxy-only.pcap (299.1 KB) - added by luizluca 19 months ago.
NAT with the proxyonly option, captured using nictracefile.

Change History

Changed 4 years ago by luizluca

traffic when pxe tries to download pxelinux.0 from external server

comment:1 Changed 4 years ago by Hachiman

Does the same happen when you try to download the same file to a Linux guest using any TFTP client, such as atftp?

comment:2 follow-up: ↓ 3 Changed 4 years ago by luizluca

No problem with tftp client.

I used:

tftp mytftpserver -m binary -c get /pxelinux.0

Just the same VM, same tftp server, same file. I used "-m binary" but binary transfer is not necessary in this case.

I'll attach the pcap for this working case. It seems that something is wrong in the BIOS TFTP client.

Changed 4 years ago by luizluca

traffic when a linux client inside the same vm downloads pxelinux.0 from external server.

comment:3 in reply to: ↑ 2 Changed 4 years ago by Hachiman

Replying to luizluca:

No problem with tftp client.

Thanks for the investigation. I've been able to reproduce the problem.

comment:4 Changed 4 years ago by Hachiman

I've reproduced it with a tftpd-hpa-5.0 server. When I increased the verbosity, I found the following message in the syslog:

# grep tftpd: /var/log/syslog
Sep 23 11:23:33 ubuntu in.tftpd[24575]: tftp: client does not accept options

That message corresponds to this code in tftpd.c:

if (ap_opcode == ERROR) {
    syslog(LOG_WARNING,
           "tftp: client does not accept options\n");
    goto abort;
}

So it looks like the failure happens in the PXE bootloader.
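For context, the check quoted above is reached when the server has sent an option acknowledgement (OACK) and the client answers with an ERROR packet instead of an ACK for block 0. The following is a minimal Python sketch of that decision, assuming the opcode numbering from RFC 1350 and RFC 2347. The function name `reply_to_oack` is hypothetical, and this is an illustration only, not tftpd-hpa's code.

```python
import struct

# TFTP opcodes (RFC 1350; OACK is defined in RFC 2347)
RRQ, WRQ, DATA, ACK, ERROR, OACK = 1, 2, 3, 4, 5, 6


def reply_to_oack(packet: bytes) -> str:
    """Classify a client's answer to an OACK sent for a read request.

    Hypothetical helper for illustration; not part of any real server.
    """
    if len(packet) < 4:
        return "malformed"
    # Both ACK and ERROR start with a 16-bit opcode followed by a 16-bit value
    # (block number for ACK, error code for ERROR), in network byte order.
    opcode, value = struct.unpack("!HH", packet[:4])
    if opcode == ACK and value == 0:
        return "options accepted"   # client ACKs block 0, transfer starts
    if opcode == ERROR:
        return "aborted by client"  # the case the server logs and aborts on
    return "unexpected"
```

A client may send ERROR here on purpose, for example to abort a request it only issued to learn the file size, so this branch alone does not prove a defect.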

comment:5 Changed 4 years ago by michaln

The second packet trace is meaningless because the client did not use any options (specifically blocksize), so the behavior is quite different.

The TFTP transfer abort is intentional and not a bug. The real problem seems to be in the NAT layer. Probably unrelated to PXE.
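To illustrate why the two traces are not comparable: a read request that carries options (RFC 2347) is answered with an OACK first, while a plain read request is answered directly with DATA block 1. The sketch below builds both kinds of request in Python. The helper name `build_rrq` and the block size value 1456 are assumptions chosen for illustration, not values taken from the captures.

```python
import struct
from typing import Dict, Optional

RRQ = 1  # read request opcode (RFC 1350)


def build_rrq(filename: str, mode: str = "octet",
              options: Optional[Dict[str, str]] = None) -> bytes:
    """Build a TFTP read request, optionally with RFC 2347 options."""
    packet = struct.pack("!H", RRQ)
    packet += filename.encode("ascii") + b"\0"
    packet += mode.encode("ascii") + b"\0"
    # Each option is a NUL-terminated name followed by a NUL-terminated value.
    for name, value in (options or {}).items():
        packet += name.encode("ascii") + b"\0"
        packet += str(value).encode("ascii") + b"\0"
    return packet


# Request without options, as a simple command-line client might send it.
plain = build_rrq("/pxelinux.0")
# Request with a block size option (value is hypothetical).
with_blksize = build_rrq("/pxelinux.0", options={"blksize": "1456"})
```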

comment:6 Changed 4 years ago by Hachiman

Could you please check whether r32745 fixes the issue for you?

comment:7 Changed 4 years ago by luizluca

Sorry, as I said, I was unable to reproduce it using OSE version and I cannot compile the proprietary version. I'll need to wait for an official release.

However, the problem I mentioned with OSE wasn't fixed by this patch. I'll attach the network traffic to illustrate my problem. It seems that "File not found" messages do not reach the PXE client.

BTW, I'm using OSE compiled with debug flag.

comment:8 follow-up: ↓ 9 Changed 4 years ago by luizluca

I forgot to say that I tested the patch against 3.2.8_OSE and not trunk as I was unable to compile it successfully (have I ever?).

comment:9 in reply to: ↑ 8 Changed 4 years ago by Hachiman

Replying to luizluca:

I forgot to say that I tested the patch against 3.2.8_OSE and not trunk as I was unable to compile it successfully (have I ever?).

Ah, so you need to apply two changesets, r32744 and r32745.

comment:10 Changed 4 years ago by luizluca

I got the same result in OSE (using both patches). This time, I didn't do a full rebuild (kmk clear/kmk) but just a simple kmk. It updated those files:

VirtualBox-3.2.8_OSE/out/linux.amd64/debug/lib/Drivers.a
VirtualBox-3.2.8_OSE/out/linux.amd64/debug/bin/VBoxDD.so

Was it enough for the test? I also updated the kernel modules.

Judging by the line numbers, socket.c has changed a lot. Maybe there is something more. I don't know PXE in depth, but I guess the TFTP client in use is pxelinux.0 once it is correctly loaded.

It does not receive any server answer after pxelinux.0 is loaded. Just like before the patch was applied.

Changed 4 years ago by luizluca

Full network traffic. No TFTP server message, error or not, arrives at the PXE client.

comment:11 Changed 4 years ago by Hachiman

Could you please upload your testcase (zip with your tftp data)? Please contact me vasily _dot_ levchenko _at_ oracle _dot_ com and I'll provide upload instructions to you.

comment:12 Changed 4 years ago by Hachiman

Using the pxelinux.0 you sent me, I was able to boot Linux from a remote TFTP server. Could you please retry with 3.2.10?

comment:13 Changed 3 years ago by luizluca

Sorry, it is still not working. After the first packets (request, option, etc.) it receives the first data packet. The VM ACKs it. The server sends the second packet, but this one is never ACKed. The server keeps trying to send the second packet every 5 seconds, without any answer.
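TFTP is a lock-step protocol: each DATA block must be acknowledged before the next one is sent, so a stall like this shows up as a repeated DATA block with no matching ACK. A small Python sketch of how one might spot the stalled block in a list of (opcode, block number) pairs extracted from a capture. The function name and the sample event list are hypothetical and only mirror the description above; they are not taken from the attached pcap files.

```python
from typing import Iterable, Optional, Tuple

DATA, ACK = 3, 4  # TFTP opcodes (RFC 1350)


def first_unacked_block(events: Iterable[Tuple[int, int]]) -> Optional[int]:
    """Return the lowest DATA block number that never got a matching ACK."""
    sent = set()
    acked = set()
    for opcode, block in events:
        if opcode == DATA:
            sent.add(block)
        elif opcode == ACK:
            acked.add(block)
    missing = sorted(sent - acked)
    return missing[0] if missing else None


# Hypothetical sequence: block 1 is ACKed, block 2 is retransmitted forever.
events = [(DATA, 1), (ACK, 1), (DATA, 2), (DATA, 2), (DATA, 2)]
stalled_at = first_unacked_block(events)
```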

Maybe this is related to some problem in vboxnet* and my host machine. I'm using unmodified opensuse11.3 x86_64. Generally, my host machine uses a firewall with masq rules but, for this test, I disabled it.

comment:14 Changed 3 years ago by Hachiman

Could you please attach the log? It might contain some hints for reproducing the issue.

Changed 3 years ago by luizluca

VM log

comment:15 Changed 3 years ago by Hachiman

Could you please try VBox4.0 b1?

comment:16 follow-up: ↓ 17 Changed 3 years ago by luizluca

It still does not work.

It downloaded the pxelinux.0 file and fails for the config file. It failed to receive any packet about the second file ("Error: file not found" or "Option Ack").

Is there anything I can do to help?

PS: I tried to answer by mail (trac@vi..) and it returned.

comment:17 in reply to: ↑ 16 Changed 3 years ago by Hachiman

Replying to luizluca:

It still does not work.

Does it make any difference whether you select the E1k or PCNET adapter? Are you able to fetch this config file from a Linux guest using any Linux TFTP client?

comment:18 Changed 3 years ago by luizluca

I tested with all NIC options. Only virtio-net does not load PXE (that is expected). With the others, the problem is the same.

I booted this machine using a bridge connection into a rescue linux livecd. I copied tftp client into it. All tests I did with this tftp worked. I hot-switched net to NAT, got a new IP and repeated the tests. All of them worked. I tested the download process in the same command call, in isolated ones, with or without a missing file. Everything worked.

Only the PXE TFTP client does not work with NAT.

comment:19 Changed 22 months ago by Hachiman

  • Description modified (diff)
  • Summary changed from NAT PXE TFTP download fails when using external tftp server to NAT PXE TFTP download fails when using external tftp server -> fixed in svn

comment:20 Changed 19 months ago by frank

  • Status changed from new to closed
  • Resolution set to fixed

Should be fixed in VBox 4.2.

comment:21 Changed 19 months ago by luizluca

  • Status changed from closed to reopened
  • Resolution fixed deleted

Sorry, this problem still persists on Vbox 4.2.

I tested with all NIC types. It can download pxelinux.0, pxelinux.0 runs and asks for a file, and the server answers, but the VM never receives the answer.

When I use the same VM in "bridged mode" on a fake NIC (tap0) and NAT through Linux, it works. However, as expected, I need to load the tftp modules (nf_nat_tftp, nf_conntrack_tftp).

FYI: I disabled the Linux firewall and unloaded the tftp modules before any test with VBox TFTP through NAT.

I'll attach the network traffic.

Changed 19 months ago by luizluca

Network traffic showing the problem. pxelinux.0 downloads and runs, but the following TFTP requests fail.

comment:22 Changed 19 months ago by luizluca

Checking bug #10286, I discovered a workaround for my problem:

VBoxManage modifyvm "Diskless NAT" --nataliasmode1 proxyonly

While:

VBoxManage modifyvm "Diskless NAT" --nataliasmode1 default

Brings the problem back.

comment:23 Changed 19 months ago by Hachiman

Hm I see a lot of attempts to fetch file /pxelinux.cfg/01-08-00-27-6a-b3-f3. Does this file exist? And could you please attach pcap with --nataliasmode1 proxyonly.

Changed 19 months ago by luizluca

The traffic when using proxy-only.

comment:24 Changed 19 months ago by luizluca

Hello Hachiman,

The file does not exist, and the server answers accordingly. However, the PXE client does not seem to receive that answer. I have already attached the requested pcap.

comment:25 Changed 19 months ago by Hachiman

Hmm, comparing both pcaps I'm a bit confused. Did you collect them from the host interface, or did you use Network_tips? I'd prefer the latter because I'd like to see the communication between NAT and the guest.

comment:26 follow-up: ↓ 27 Changed 19 months ago by luizluca

Hello,

I have now done the capture as you requested. What I have already seen is that the VBox NAT engine changes the TFTP source address from 10.9.1.31 to 192.168.8.2 (which might be the internal TFTP server). I don't know why this does not happen with the first file, pxelinux.0. It looks like it only happens with control packets. Well, you'll see for yourself.
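One way to check for this kind of rewrite is to list every reply whose source address differs from the configured TFTP server. A minimal Python sketch follows; the function name and the frame numbers in the sample data are hypothetical, and only the two addresses come from the observation above.

```python
from typing import Iterable, List, Tuple


def unexpected_sources(replies: Iterable[Tuple[int, str]],
                       expected_server: str) -> List[Tuple[int, str]]:
    """Return (frame, source) pairs whose source is not the expected server."""
    return [(frame, src) for frame, src in replies if src != expected_server]


# Hypothetical guest-side view: frame numbers are made up for illustration.
replies = [
    (10, "10.9.1.31"),    # data for pxelinux.0, source left untouched
    (42, "192.168.8.2"),  # control packet with a rewritten source
]
suspicious = unexpected_sources(replies, expected_server="10.9.1.31")
```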

Changed 19 months ago by luizluca

NAT with default option capture using nictracefile

Changed 19 months ago by luizluca

NAT with proxyonly option capture using nictracefile

comment:27 in reply to: ↑ 26 Changed 19 months ago by Hachiman

Replying to luizluca:

Hello,

I have now done the capture as you requested. What I have already seen is that the VBox NAT engine changes the TFTP source address from 10.9.1.31 to 192.168.8.2 (which might be the internal TFTP server). I don't know why this does not happen with the first file, pxelinux.0. It looks like it only happens with control packets. Well, you'll see for yourself.

Thank you, I'll take a look.

comment:28 Changed 19 months ago by Hachiman

It looks like, in default mode, de-aliasing does not find the right alias, so NAT drops the TFTP packet with opcode 5 (ERROR) and error code 1 (file not found). The client then requests the file again, e.g. at frame.number = 53 (default.pcap).
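For reference, the dropped packet described here is a TFTP ERROR packet: a 16-bit opcode (5), a 16-bit error code, and a NUL-terminated message (RFC 1350). A small Python sketch of decoding one; the function name `parse_error` and the sample bytes are assumptions for illustration, not data copied from default.pcap.

```python
import struct
from typing import Tuple

ERROR = 5  # TFTP ERROR opcode (RFC 1350)

# Error codes defined by RFC 1350 (code 8 comes from RFC 2347).
ERROR_CODES = {
    0: "Not defined",
    1: "File not found",
    2: "Access violation",
    3: "Disk full or allocation exceeded",
    4: "Illegal TFTP operation",
    5: "Unknown transfer ID",
    6: "File already exists",
    7: "No such user",
    8: "Option negotiation failed",
}


def parse_error(packet: bytes) -> Tuple[int, str]:
    """Decode a TFTP ERROR packet into (error code, message)."""
    if len(packet) < 4:
        raise ValueError("packet too short for a TFTP ERROR")
    opcode, code = struct.unpack("!HH", packet[:4])
    if opcode != ERROR:
        raise ValueError("not a TFTP ERROR packet")
    # The message is ASCII text terminated by a single NUL byte.
    message = packet[4:].split(b"\0", 1)[0].decode("ascii", errors="replace")
    return code, message


# Hypothetical "file not found" reply as a server might send it.
sample = struct.pack("!HH", ERROR, 1) + b"File not found\0"
code, message = parse_error(sample)
```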

comment:29 Changed 6 months ago by Otheus

I can confirm this bug still persists in version 4.3.2r90405

  • Extpack with PXE ROM support installed
  • Both AM and EM nictypes tried
  • TFTP works as expected but stops on load-request for .cfg file
  • Enabling natproxy as workaround succeeds

I suspect the problem is somehow in the interaction between the pxeboot and NAT module. Packet traces show that the VM is requesting TFTP packets from a 10.x address instead of the configured external TFTP server. It's as if it gets the correct IP address for the initial TFTP request, but not subsequent to that.
