Opened 16 years ago
Closed 16 years ago
#2885 closed defect (fixed)
vboxnetflt kernel: Oops: 0000 [1] SMP
Reported by: | Blaine Palmer | Owned by: | |
---|---|---|---|
Component: | network/hostif | Version: | VirtualBox 2.1.0 |
Keywords: | vboxnetflt oops ffffffffffffff58 | Cc: | |
Guest type: | Windows | Host type: | Linux |
Description
The server is running Debian Etch 4, debian 2.6.18-6-amd64 SMP kernel. Processor type is Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz.
Install progressed normally, modules loaded as per lsmod, and no errors until Guest VM is started. Upon starting the Guest VM (WS2k8 Standard) a console alert appears.
Message from syslogd@over9000 at Wed Dec 24 03:34:53 2008 ...
over9000 kernel: Oops: 0000 [1] SMP
Message from syslogd@over9000 at Wed Dec 24 03:34:53 2008 ...
over9000 kernel: CR2: ffffffffffffff58
Message from syslogd@over9000 at Wed Dec 24 03:34:53 2008 ...
over9000 kernel: Oops: 0000 [2] SMP
Message from syslogd@over9000 at Wed Dec 24 03:34:53 2008 ...
over9000 kernel: CR2: ffffffffffffff58
Following this alert, general system stability appears unaffected and the VM continues to boot as normal. Once inside the VM, it becomes obvious that networking through the Host Interface does not work at all. VM configuration will be attached.
Further investigation into the kernel Oops reveals that this appears to be related to the vboxnetflt module.
---snip---
Dec 24 03:34:53 over9000 kernel: RIP: 0010:[<ffffffff8834772d>] [<ffffffff8834772d>] :vboxnetflt:vboxNetFltLinuxXmitTask+0x1c/0x18e
---snip---
Dec 24 03:34:53 over9000 kernel: <1>Unable to handle kernel paging request at ffffffffffffff58 RIP:
---snip---
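The faulting address ffffffffffffff58 sits just below the top of the 64-bit address space, which is what a small offset subtracted from a NULL pointer looks like. The sketch below only does that arithmetic. Reading the address as "NULL minus a structure offset" is an inference from the value, not something the log states, and `implied_offset` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * If a fault address was produced by subtracting a small offset from
 * address 0, return that offset. Unsigned arithmetic wraps modulo 2^64,
 * so (0 - fault_addr) recovers the amount that was subtracted.
 */
static uint64_t implied_offset(uint64_t fault_addr)
{
    /* Only meaningful for addresses in the upper half of the address space. */
    assert(fault_addr > (uint64_t)INT64_MAX);
    return (uint64_t)0 - fault_addr;
}
```

For the address in the log, `implied_offset(0xffffffffffffff58ULL)` gives 0xa8 (168 bytes), a plausible distance between the start of a driver structure and one of its members.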
The full kernel log error will be attached.
PLEASE NOTE: This error occurs regardless of the guest operating system, or host interface chosen. It occurs as soon as you start a VM with -nic1 hostif via VBoxHeadless (I assume any method of starting the VM but as this machine does not even have a console/monitor there is no way to check X11 or SDL.)
Attachments (5)
Change History (23)
by , 16 years ago
Attachment: | kernel.log added |
---|
comment:1 by , 16 years ago
Forgot to include that there is no difference between the General Linux AMD64 download and the Debian Etch 4 AMD64 download. I have tried uninstalling and re-installing both.
comment:2 by , 16 years ago
The first error seen in the kernel log is:
Dec 24 03:28:53 over9000 kernel: conftest[6062]: segfault at 00002b4e23c9204c rip 00002b4e2358fadb rsp 00007fff87752b70 error 4
This occurs after attempting a reboot following the kernel Oops; it appears that the kernel Oops destabilizes the entire system.
follow-up: 4 comment:3 by , 16 years ago
I'm also seeing this OOPS. HOST = Fedora 10 x86_64 2.6.27.7, GUEST = Windows XP SP3
It only occurs when I'm attempting to transfer large files between the HOST and GUEST through a windows share. Web browsing or small files do not cause any OOPS.
After the OOPS occurs, the HOST and GUEST systems become unstable. Networking on the GUEST system stops completely. If I try to shut down the GUEST it will sometimes lock up the HOST as well. A hard reset is required.
comment:4 by , 16 years ago
I need to mention that large files are anything above about 2 megs. I do not know an exact size. Also, I am using Host Interface networking on eth0.
This doesn't only happen between host and guest. I attempted to transfer a large file (600 megs) from one computer to the guest computer and the OOPS occurred a few megabytes into the transfer. After that, the guest network connection is broken.
comment:5 by , 16 years ago
I'll also point out that there is a forum post on the debian forums concerning the same problem, http://forums.debian.net/viewtopic.php?p=196943
It seems to be specific to x86_64 systems. If anyone has this issue (same Oops error, linked to vboxNetFltLinuxXmitTask) on a non-x86_64 kernel, PLEASE COMMENT.
comment:6 by , 16 years ago
I get the same OOPS:
Debian etch running a 2.6.18-6-amd64 kernel on an Intel P4 64-bit CPU @ 2.66GHz (cpu family 15, model 4). Yes, that's an SMP kernel on a single-CPU machine. (Debian sets it up that way, and I'm too busy to worry about it...until now?)
VirtualBox 2.1.0-41146_Debian_etch installed via deb package.
Guest OS: Windows 2000 SP4. Settings: ACPI on, IO APIC off, VT-x/AMD-V unavailable, PAE/NS on, 3D on. (Same results observed on different virtual machine with ACPI off.) Network adapter: PCnet-FAST III(NAT)
NAT networking works fine.
If host networking is enabled, host has kernel OOPS and keyboard ceases to function. Can shut down guest OS via GUI. Shutting down host OS after OOPS (either using GUI, or via timed shutdown process started before the OOPS) hangs on trying to unload vboxnetflt.
Additionally, under NAT networking, VirtualBox/W2000 is very finicky. Shared folders MUST be mapped to a drive letter, or guest machine will freeze up trying to access share. Guest machine will also freeze up when browsing for shared folders in network neighborhood. Note that with a drive letter mapped shared folders work fine (installed Visual Studio from a shared folder).
Once or twice, the guest machine has aborted during heavy network use. (SQL Server Management Studio accessing remote database to which host is connected via PPTP.)
Also: Numerous failures to install W2000 directly from CD. Must make ISO file and mount that for install to succeed (too much latency on CD???).
comment:7 by , 16 years ago
The problem appears with pre-2.6.20 kernels only. The fix will be included in the next maintenance release. Since the netflt driver always comes with source code, those who urgently need the fix may apply the following patch to src/vboxnetflt/linux/VBoxNetFlt-linux.c (the path is relative to your vbox installation directory):
1031c1031
< INIT_WORK(&pThis->u.s.XmitTask, vboxNetFltLinuxXmitTask, NULL);
---
> INIT_WORK(&pThis->u.s.XmitTask, vboxNetFltLinuxXmitTask, &pThis->u.s.XmitTask);
Then, kernel modules need to be re-built with
/etc/init.d/vboxdrv setup
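Why the third argument matters: before 2.6.20, INIT_WORK took a work item, a handler, and a data pointer, and the handler was later called with that data pointer. The user-space model below is a sketch of that calling convention, not kernel or VirtualBox code. Every name in it (`work_model`, `instance_model`, `xmit_task_handler`) is hypothetical. The 168-byte padding is an assumption chosen so the model reproduces the logged fault address, and the assumption that the handler recovers its instance from the pointer it receives is inferred from the patch rather than shown in the ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a pre-2.6.20 work item: the handler receives the stored data pointer. */
struct work_model {
    void (*func)(void *data);
    void *data;
};

/* Hypothetical driver instance; the padding puts xmit_task at offset 0xa8 (assumed). */
struct instance_model {
    char padding[168];
    struct work_model xmit_task;
};

/* Address the handler computed for its enclosing instance on the last run. */
static uintptr_t last_instance_addr;

/* Handler: treats its argument as a pointer to the embedded work item and
 * steps back to the start of the enclosing instance (container_of style). */
static void xmit_task_handler(void *data)
{
    last_instance_addr =
        (uintptr_t)data - offsetof(struct instance_model, xmit_task);
}

/* Model of the three-argument INIT_WORK. */
static void init_work_model(struct work_model *w, void (*func)(void *), void *data)
{
    assert(func != NULL);
    w->func = func;
    w->data = data;
}

/* Model of the workqueue running the item: the handler gets the stored data. */
static void run_work_model(struct work_model *w)
{
    w->func(w->data);
}
```

With NULL as the data argument, the handler computes an instance address of 0 minus 168, which on a typical 64-bit build is 0xffffffffffffff58, the address in the Oops. With the address of the work item passed instead, as in the patch, the handler recovers the real instance address.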
comment:8 by , 16 years ago
aleksey, please read my comment. I'm running kernel 2.6.27 and I'm seeing an OOPS. I don't think this is only pre-2.6.20 kernels.
comment:9 by , 16 years ago
mooninite, right, but your problem is different and my fix won't solve it. In your case the fault happens in skb_put (looks like allocated skb is too small to fit the packet). I will look into this issue as well. What are the MTU sizes on both host and guest btw?
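For readers unfamiliar with this failure mode: skb_put appends bytes to a socket buffer's data area, and the kernel panics (skb_over_panic) if the new tail would pass the end of the allocated buffer. The sketch below is a simplified user-space model of that bounds check, not the kernel's implementation. `skb_model` and `skb_put_model` are hypothetical names, and it returns an error where the kernel would panic.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified socket buffer: offsets into a fixed-size allocation. */
struct skb_model {
    size_t len;   /* bytes of payload currently in the buffer */
    size_t tail;  /* offset one past the last payload byte */
    size_t end;   /* offset one past the end of the allocation */
};

/*
 * Append n bytes. Returns 0 on success, or -1 if the buffer is too small,
 * which is the condition under which the real kernel calls skb_over_panic.
 */
static int skb_put_model(struct skb_model *skb, size_t n)
{
    assert(skb->tail <= skb->end);  /* invariant of a well-formed buffer */
    if (n > skb->end - skb->tail)
        return -1;
    skb->tail += n;
    skb->len += n;
    return 0;
}
```

As an illustration with made-up sizes: a buffer allocated with `end = 1514` accepts a 1514-byte frame, but any further `skb_put_model` call on it fails. That matches the "allocated skb is too small to fit the packet" diagnosis above.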
comment:10 by , 16 years ago
The MTU of the host is 1500. The guest is 1480 (windows maximum).
I have even tried switching from the Fast-III adapter to the Intel 1000 MT adapter. Same result.
comment:11 by , 16 years ago
I'm triggering this regularly (the skb_put() panic; my module is already patched with the fix for the NULL dereference issue) on RHEL-5 (2.6.18-based kernel) running on x86_64 architecture.
I am hitting the problem ~ once a day, have a kdump and debugging symbols, so I could theoretically help with tracing this down (if I were familiar with the code :) In case there's some piece of information I could provide to help debugging this, just let me know.
comment:12 by , 16 years ago
Sure. If you have a core dump, please contact me at frank _dot_ mehnert _at_ sun _dot_ com. Thank you!
by , 16 years ago
Attachment: | VBoxNetFlt-linux.c added |
---|
The latest and greatest version. It hopefully solves both OOPS issues: NULL dereference and skb_over_panic.
follow-up: 16 comment:13 by , 16 years ago
mooninite, can you try the patched version of VBoxNetFlt-linux.c I attached to the ticket? Assuming you've installed a standard package, you'll need to copy the patched file to
<vbox_installation_dir>/src/vboxnetflt/linux/
(you may want to backup the original one just in case) and rebuild modules with
sudo /etc/init.d/vboxdrv setup
comment:14 by , 16 years ago
Component: | network → network/hostif |
---|
comment:15 by , 16 years ago
OK, will do. It will be Monday until I can test it on the original problem machine. I took today off.
follow-up: 17 comment:16 by , 16 years ago
Replying to aleksey:
mooninite, can try the patched version of VBoxNetFlt-linux.c I attached to the ticket? Assuming you've installed a standard package you'll need to copy the patched file to
That worked. I transferred a 700 meg file, 70 meg file, and three files totalling 2.1 gigs. It would have never worked on any of those files before your fix. Thanks for the fix.
comment:17 by , 16 years ago
Replying to mooninite:
Thanks for trying out the fix! 2.1.2 is out, could you try it and confirm that the problem does not appear anymore?
comment:18 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
The fix is included in 2.1.2. Please re-open the ticket against 2.1.2 if the problem persists.
Kernel Log of the Oops