Ticket #8861 (closed defect: obsolete)
vbox netfilter spamming log - memory allocation failures
|Reported by:||tlhackque||Owned by:|
|Version:||VirtualBox 4.0.6||Keywords:||vboxnetflt page allocation failure|
Description (last modified by frank)
2P quad-core (8 logical processors) 2 GHz 64-bit VT-enabled server with 8GB, mostly idle, running Fedora Core 14.
Installed first VM - Windows Server 2008 SP1 (eval kit). Used an .iso image.
VM network is bridged (using the VirtualBox bridge) - a 1 GbE interface to a 1 GbE switch. The OS sees the interface normally (e.g. there's no br* device). There is an extensive iptables filter.
The syslog is spammed with memory allocation failures, tracebacks and statistics.
Common thread seems to be a call to write() that sends a TCP packet that hits vboxNetFltLinuxPacketHandler [vboxnetflt], which calls skb_copy where alloc_skb fails.
Without debugging - I would guess that alloc_skb should be blocking, not complaining when it can't get memory.
Attached syslog extract is mostly a single event at 08:11:56 yesterday morning. You'll note that it continues into 08:11:57. Then there's almost a 10 sec gap before it starts again.
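To make the burst-then-gap pattern easier to see in a large extract, something like the following could count failure messages per second. This is only a rough sketch: it assumes the traditional "Mmm dd hh:mm:ss" syslog timestamp prefix and that each failure line contains the text "page allocation failure" (the ticket keyword). The function name and the sample lines are made up for illustration and are not taken from the attached log.

```python
import re
from collections import Counter

# Traditional syslog timestamp prefix, e.g. "May  3 08:11:56".
# Assumption: the extract uses this format; adjust the pattern if it does not.
TIMESTAMP = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2})")


def count_failures_per_second(lines, marker="page allocation failure"):
    """Return a Counter mapping each timestamp to the number of lines
    containing `marker` that were logged in that second."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, NOT copied from the attached syslog.
sample = [
    "May  3 08:11:56 host kernel: proc: page allocation failure",
    "May  3 08:11:56 host kernel: proc: page allocation failure",
    "May  3 08:11:57 host kernel: proc: page allocation failure",
    "May  3 08:12:06 host kernel: unrelated message",
]

result = count_failures_per_second(sample)
for stamp, n in result.items():
    print(stamp, n)
```

For a real file, the lines could be passed in with `count_failures_per_second(open(path))`, where `path` is wherever the extract was saved.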
Log also shows removal of VirtualBox 3.2 - I never actually ran a VM with it, so I can't say if this behavior is new. You can see 4.0.6 being installed.
This did continue intermittently; I stopped the syslog extract when eth0 left promiscuous mode for the first reboot.
Later in the day, I did create another 2 VMs (Fedora); only one has been run.
Attached also are the VBox.log files for the Windows Server VM. Note that the oldest surviving VBox.log is somewhat later than this event, and it doesn't contain these crashes. I didn't include the other VBox.log since the first failure happened hours before the second VM was defined.
I have also seen tracebacks where CIFSSMBWrite2 is calling kernel_sendmsg, so this doesn't seem limited to user mode write()s.
I suspect that this is the reason that later in the day, other processes got unhappy enough that the entire server went unresponsive and had to be rebooted. (Unfortunately, there are no logs or dumps.)