Opened 15 years ago
Closed 14 years ago
#5260 closed defect (fixed)
page allocation failure ... vboxNetFltLinuxPacketHandler
Reported by: | Tobias Oetiker | Owned by: | |
---|---|---|---|
Component: | network | Version: | VirtualBox 3.0.8 |
Keywords: | Cc: | tobi@… | |
Guest type: | Windows | Host type: | Linux |
Description
I am running VirtualBox 3.0.8 on Linux 2.6.31.4 with bridged Ethernet interfaces. I am seeing several "page allocation failure" warnings from the kernel every day. vboxNetFltLinuxPacketHandler shows up in the call trace every time. There is currently some talk about "page allocation failures" on the kernel mailing list, but the bug is proving rather elusive. I wonder if my instance could somehow be related to the vboxnetflt driver?
I have put up a few of my traces on http://tobi.oetiker.ch/cluster-2009-10-20-08-31.txt
Attachments (5)
Change History (31)
comment:1 by , 15 years ago
comment:2 by , 15 years ago
I'm running VirtualBox 3.1.2 on Linux and see the same sort of page allocation problems, reported from other programs (both our homebrewed ASF.exe and normal programs like firefox, thunderbird, vino-server) with vboxNetFltLinuxPacketHandler in the call trace. However, it is with another process (vino-server in my case), and I have quite some TCP load between the Linux host and the ms-windows guest, plus some load from a perl script within the ms-windows guest. Next I'll attach my trace.
My ms-windows VM is configured with 384 MB internal memory and without VT-x and nested paging on (but greyed out). According to top, VirtualBox claims 750m virtual (output from top is also in the trace to give an idea of the load in my system).
When such a page allocation failure occurs, sometimes several in a row, a Linux window (probably one of the programs reported above) freezes for >10 seconds.
by , 15 years ago
Attachment: | vboxNetFltLinuxPacketHandler.crash added |
---|
multiple page allocation failures with vboxNetFltLinuxPacketHandler
comment:3 by , 15 years ago
By the way, I'm running ubuntu 9.04. If any additional info is needed, please let me know.
comment:4 by , 15 years ago
I have updated to Ubuntu 9.10 and still experience the same problem; it even became worse. Ubuntu is sometimes almost unusable. Currently I get these page allocation errors every minute or so, and multiple applications freeze at each of them.
comment:6 by , 15 years ago
I am currently using VirtualBox 3.1.4. To get a working system again, I just disabled networking and removed the vbox modules vboxnetflt and vboxnetadp. My system is now much more responsive and I have not seen the page allocation error for a few hours, but I still experience application freezes.
by , 15 years ago
Attachment: | vbox-3.1.4.dmesg added |
---|
recent dmesg output of multiple page allocation errors in ubuntu 9.10 with vbox-3.1.4
comment:7 by , 15 years ago
we see this problem on one of our systems, it is a 'Dual Core AMD Opteron(tm) Processor 265' with 16 GB Ram. (The cpu has no virtualization capability) ... our errors are 'order:5' ... we are running 2.6.32.8 with vbox 3.1.4.
I have seen mention on the LKML of other Linux network drivers causing similar problems. It seems to be triggered by some changes in the way the kernel allocates memory ...
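For context on the 'order:5' figure: an order-n request asks the kernel for 2^n physically contiguous pages, so order:5 means 32 pages, or 128 KiB with 4 KiB pages. Such a request can fail on a fragmented host even when plenty of memory is free. The following is only a rough sketch for checking how many free blocks of a given order remain, assuming the usual /proc/buddyinfo layout; the helper name free_blocks_of_order is made up for illustration and is not part of VirtualBox or the kernel.

```shell
#!/bin/sh
# Hypothetical helper (not from the ticket): sum the free blocks of a given
# order across all zones. Reads /proc/buddyinfo-style lines on stdin, e.g.
#   Node 0, zone   Normal   12   7   3   1   0   0 ...
# Fields 1-4 are "Node", "0,", "zone" and the zone name, so the count for
# order 0 is field 5 and the count for order N is field 5+N.
free_blocks_of_order() {
    awk -v order="$1" '{ total += $(5 + order) } END { print total + 0 }'
}

# Usage on a live host:
#   free_blocks_of_order 5 < /proc/buddyinfo
```

A result at or near zero for order 5 would be consistent with the allocation failures described here, although it does not by itself show which driver is making the request.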
comment:8 by , 15 years ago
Please could you append the configuration of your host kernel to this ticket?
comment:9 by , 15 years ago
Some additional info about my system: I'm running a Dual Intel(R) Core(TM)2 Duo CPU E8400 @3GHz with 2GB internal memory and 6GB swap. I've started the vbox network modules again today after upgrading to vbox-3.1.6, until now it is not misbehaving.
comment:10 by , 15 years ago
It took some time, but the page allocation failures returned yesterday and are getting more frequent. The load on my system has not been very high since I restarted the vbox network modules for my virtual machine 8 days ago.
So, it looks like this bug is triggered after multiple days of running the vbox network modules and a virtual machine.
comment:11 by , 15 years ago
I have upgraded to 2.6.33.3 ... a day after reboot, the page allocation failures (with vbox 3.1.6) are back ... always order:5 ...
comment:12 by , 15 years ago
I found a workaround ... it seems the problem only occurs in connection with GSO enabled on a tg3 network card ... with
ethtool -K eth0 gso off
the problem goes away. On debian/ubuntu I put the following into /etc/network/if-up.d:
#!/bin/sh
ETHTOOL=/usr/sbin/ethtool
if [ ! -f $ETHTOOL ]; then
  exit 0
fi
# vbox triggers page allocation failures when tg3 generic segmentation offloading is on
if $ETHTOOL -i "${IFACE}" | grep -q tg3; then
  echo "turn off gso on $IFACE"
  $ETHTOOL -K "${IFACE}" gso off
fi
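To confirm that the workaround actually took effect, the GSO state can be read back with `ethtool -k` (lowercase k lists the offload settings). A minimal sketch, assuming the feature is listed as `generic-segmentation-offload:` in that output; the helper name gso_state and the interface name eth0 are only examples.

```shell
#!/bin/sh
# Hypothetical helper (not from the ticket): print the GSO state ("on" or
# "off") found in `ethtool -k` output read on stdin.
gso_state() {
    awk -F': *' '/^[ \t]*generic-segmentation-offload:/ {
        split($2, words, " ")   # keep only the state, drop suffixes such as "[fixed]"
        print words[1]
        exit
    }'
}

# Usage on a live host (interface name is an example):
#   /usr/sbin/ethtool -k eth0 | gso_state
```

If this still prints "on" after the interface comes up, the if-up.d hook probably did not run for that interface or did not match its driver.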
comment:13 by , 15 years ago
Looking through logs, I found the same problem on another box with the e1000e driver ...
comment:15 by , 15 years ago
thanks for linking me here...
i am still testing with the gso trick...
i have another card though:
driver: 3c59x
version:
firmware-version:
bus-info: 0000:04:00.0
if the gso trick fixes this one too, ill let you all know.
comment:16 by , 15 years ago
I've just attached a patch for a possible memory leak if GSO is enabled. This fix is for a rarely used error path, so it is unlikely to fix your problems, but you could try it anyway. Do the following as root on your host (first make sure that no VM is running):
cd /usr/src/vboxnetflt
patch -p0 < ~/diff_vboxnetflt_linux
/etc/init.d/vboxdrv setup
It would be interesting to know if this makes any difference for you.
comment:17 by , 15 years ago
Hey, looks like #5675 is also solved by this (the symptoms appear quite similar, and the ethtool trick worked).
Thanks,
Dave
comment:20 by , 15 years ago
hello frank
the gso off seems to have fixed my problem. if it stays without a mem/cpu hog for another 5 days, i'll try your patch and see if this fixes it for good...
greetings Oliver
comment:21 by , 15 years ago
Thanks Oliver. Please make sure to enable GSO again when you test the patch.
comment:22 by , 15 years ago
sad news, the patch did not fix the problem.
but since there's a new virtualbox version out today, and a few days ago a new linux kernel got packaged for ubuntu lucid lynx, i'm giving that combination a try, maybe the bug fixed "itself" somewhere in between.
however, i have to agree that gso off fixes the thing.
comment:23 by , 14 years ago
I have exactly the same problem! http://forums.virtualbox.org/viewtopic.php?f=7&t=33023
comment:24 by , 14 years ago
Hi, I'm very interested in this thread, especially in the ethtool gso trick.
But... I'm a little bit confused here. Does the trick apply to the host OS or the guest OS?
The host Linux server on which the vboxNetFlt allocmem error occurs runs OpenSuSe 11.2 x86_64, kernel 2.6.31-12, with Vbox 3.2.10-109.3.x86_64 running in headless mode. The guest system is Ubuntu 9.10 x86_64, kernel 2.6.31-20. The host system doesn't have much real memory: 2GB, backed by 6GB of swap space.
The host network card is a bonding of two Broadcom BCM5780 Gigabit. The guest network card is bridged on this bonding with the virtual Intel 1000e driver.
The host OS doesn't do much, but as a backup server it has intensive activity at night when the rsync script starts. It is mostly at this time that the vboxNetFlt allocmem error occurs. The rsynced files are on an XFS filesystem sitting on a 3Ware RAID6 volume.
The guest OS, by contrast, has continuous intensive network activity: access to shared files, file sharing itself, and a master network service for a computing dispatcher.
By the way, I noticed that when the guest OS starts, it does something (wrong?) to the bonding: one of the network cards goes into promiscuous mode, as if the virtual card insists on binding to a physical card.
I first thought that the vboxNetFlt allocmem error occurs because of the particular case of bridging over bonding, but I have another system (pretty much identical) with the same network config which doesn't cause any allocmem error.
Thanks for any clue.
we just found the same failures on Ubuntu Jaunty