Ticket #8891 (closed defect: fixed)
Reproducible Fedora 6 SMP guest crashes
Reported by: tlhackque | Owned by:
I virtualized a server that had been running on dedicated hardware for several years by cloning the original disk to a disk image on a larger server. I have made only minimal configuration changes since, so differences between "before" and "after" behavior are attributable to virtualization.
The original server has 2 CPUs according to Linux, but it is a single hyperthreaded CPU (Intel P4 630 (P) HT 3.0 GHz with 915GV chipset).
The virtual machine crashes simply by running the backup if 2 CPUs are configured. The crash typically occurs after about 3-4GB have been written to the (gzipped) .tar file.
The virtual machine completes the backup successfully if 1 CPU is configured (leaving the IOAPIC enabled). The full tar file is about 64GB.
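For anyone trying to reproduce this, the two configurations can be switched from the host with VBoxManage while the VM is powered off. The following is only a sketch under stated assumptions, not the commands actually used for this report: the VM name `fc6-guest` is a placeholder, and the script prints the commands rather than running them unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: switch a VirtualBox VM between the 1-CPU and 2-CPU configurations.
# "fc6-guest" is a hypothetical VM name; substitute the real one.
VM="${VM:-fc6-guest}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set the number of virtual CPUs, leaving the I/O APIC enabled.
# The VM must be powered off for modifyvm to apply.
set_cpus() {
    run VBoxManage modifyvm "$VM" --cpus "$1" --ioapic on
}

# 1 CPU is the configuration in which the backup completes.
set_cpus 1
```

Calling `set_cpus 2` instead gives the configuration in which the crash was seen.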
I can't rule out the possibility that this is a Linux bug. But since Linux treats hyperthreads the same as physical processors, and the original machine ran for so long without error, my first assumption is that something is wrong with the SMP virtualization.
The backup is written to a remote fileserver (NFS-mounted from the VM). This is the same fileserver that was used for backing up the original server. (The backup script is unchanged.)
Guest is Fedora Core 6 (220.127.116.11-72); host is Fedora Core 14. The host has 2 quad-core CPUs and is running 3 VMs. This VM is the only one configured with more than one CPU.
Failing command (on the virtual machine, via SSH):
tar --totals --one-file-system --xattrs \
    --exclude var/www/servers/PhotoGallery/FileCache \
    --exclude dev --exclude proc --exclude sys \
    --exclude mnt --exclude tmp --exclude var/cache \
    -czpf /mnt/backup/overkill/Sat.tar.z * \
  | tee /mnt/backup/overkill/Sat.log \
  | grep -v ': socket ignored' \
  | grep -v 'Total bytes written:' \
  | grep -v ': file changed as we read it' \
  | grep -v '/.beagle/' \
  | grep -v ': Error exit delayed from previous errors'
I have seen these failures both running headless (normal for me) and under VirtualBox (for debugging this issue).
I saw the same failures with the disk chipset set to PIIX4 and to ICH6. (The original server has an ICH6 according to the Linux device manager, so that's what I picked for the VM.)
I configured a serial port on the VM and captured a log of the panic, which is in the attached tar file. The vbox.log files are also in the file.
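One way to capture a guest panic over a virtual serial port is to point the guest's COM1 at a file on the host. This is a hedged sketch, not necessarily how the attached log was produced: the VM name and log path are placeholders, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: attach guest COM1 to a host file so a kernel panic can be captured.
VM="${VM:-fc6-guest}"                     # hypothetical VM name
LOG="${LOG:-/tmp/fc6-guest-serial.log}"   # hypothetical host path
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

enable_serial_log() {
    # Enable COM1 at the standard I/O base 0x3F8, IRQ 4.
    run VBoxManage modifyvm "$VM" --uart1 0x3F8 4
    # Write everything the guest sends to COM1 into a file on the host.
    run VBoxManage modifyvm "$VM" --uartmode1 file "$LOG"
}

enable_serial_log
```

The guest kernel also has to log to the serial port. On a typical Linux guest this is done by adding something like `console=ttyS0,115200 console=tty0` to the kernel command line, which is an assumption about the guest setup rather than something stated in this report.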