VirtualBox

Ticket #12982 (closed defect: obsolete)

Opened 5 years ago

Last modified 3 years ago

OpenSuse 12.3 Linux Host - FreeBSD 9.2 client - Reset initiated by ACPI problem

Reported by: p166
Owned by:
Component: guest control
Version: VirtualBox 4.3.10
Keywords: ACPI reset
Cc:
Guest type: BSD
Host type: Linux

Description

Dear All!

I have a big problem that I cannot get past. I have a machine (the processor type is shown below) with 32GB of memory, running Opensuse 12.3 with kernel 3.13.0 ( uname -r : 3.13.0-1.g4b6e17a-desktop ). I have installed four FreeBSD 9.2 systems on it, each virtual machine with 2 processors and 2, 3 or 4GB of RAM. The host uses an Intel Corporation 82574L Gigabit Network Connection; the FreeBSD guests all have bridged Intel Pro/1000 MT Desktop (82540EM) virtual devices.

My problem is this: the virtual machines work well for approximately 5-9 hours (sometimes less, sometimes 1-2 days), and then one resets and starts again. It works for hours (sometimes 1-2 days) and resets again. The VBox.log for the corresponding machine contains these rows:

02:05:10.166051 Reset initiated by ACPI
02:05:10.166100 Changing the VM state from 'RUNNING' to 'RESETTING'.
02:05:10.166365 CPUM: SetGuestCpuIdFeature: Enabled APIC
02:05:10.166376 PIT: mode=3 count=0x10000 (65536) - 18.20 Hz (ch=0)
02:05:10.166998 PIIX3 ATA: Ctl#0: finished processing RESET
02:05:10.167009 PDMR3Reset: after 0 ms, 1 loops: 1 async tasks - piix3ide/0
02:05:10.167022 PIIX3 ATA: Ctl#1: finished processing RESET
02:05:10.196435 Changing the VM state from 'RESETTING' to 'RUNNING'.
02:05:10.199088 Guest Log: BIOS: VirtualBox 4.3.4
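A minimal sketch of how the reset events could be pulled out of a VBox.log for comparison across guests. It assumes the leading stamp on each line is the time elapsed since the VM process started; the function name `find_acpi_resets` and the shortened sample text are illustrative and not part of the ticket.

```python
import re

# Each log line is assumed to start with an HH:MM:SS.microseconds stamp.
LINE_RE = re.compile(r"^(\d+):(\d{2}):(\d{2})\.(\d+)\s+(.*)$")


def find_acpi_resets(log_text):
    """Return the elapsed time in seconds at each 'Reset initiated by ACPI' line."""
    uptimes = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a timestamped log line
        hours, minutes, seconds, fraction, message = m.groups()
        if message.startswith("Reset initiated by ACPI"):
            whole = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            uptimes.append(whole + float("0." + fraction))
    return uptimes


# Shortened copy of the rows quoted above (hypothetical input for the sketch).
sample = """\
02:05:10.166051 Reset initiated by ACPI
02:05:10.166100 Changing the VM state from 'RUNNING' to 'RESETTING'.
02:05:10.196435 Changing the VM state from 'RESETTING' to 'RUNNING'.
"""

resets = find_acpi_resets(sample)
print(len(resets), "reset(s); first at", int(resets[0]), "seconds")
```

Running the same scan over the logs of all guests would show whether the resets cluster at particular times.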

And everything starts again. I have tried VirtualBox 4.3.4, 4.3.6 and 4.3.10, but none of them helped. What makes it more difficult is that there are no errors at all in the host's or the guests' log files. I really do not know what is happening.

I have two other machines with the same hardware and software configuration, and they work absolutely fine. The only difference between them is the kernel: one has 3.12.0, the second has 3.12.6, and the one that resets has 3.13.0. Both of the other machines also run OpenSuse 12.3, VirtualBox 4.3.4 and FreeBSD 9.2 guests.

I have googled a lot and found only one answer relating to this problem, which says that there is not enough memory left for the host machine. The host always has more than 10 GB of free memory: it has 32GB of physical RAM and my four virtual guests use 4 x 4GB of memory. I do not think this is the problem.

The VirtualBox forums had no answer. Maybe nobody there has encountered this problem.

The network traffic is permanently high on these FreeBSD guests and on the host too. The peak is about 40-50MBit/s. The ACPI resets tend to happen at night, when the network load is low, so we think this is NOT the cause.

First we ran an 8-hour memtest, then we replaced the power supply and the mainboard. After that change the machine ran for more than 44 days without any problem. After this period the nightmare began again, and here we are.

I attached the full VBox.log file to this post which contains the rows above.

Here is a snippet of the output of the 'cat /proc/cpuinfo' command:

vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping : 9
microcode : 0x15
cpu MHz : 3709.187
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips : 6784.51
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

Attachments

VBox.log (59.5 KB) - added by p166 5 years ago.
Reset initiated by ACPI problem

Change History

Changed 5 years ago by p166

Reset initiated by ACPI problem

comment:1 Changed 5 years ago by p166

Can anyone help on this issue?

comment:2 Changed 5 years ago by frank

You should really check your guest kernel logs. "Reset initiated by ACPI" means that the guest intended to reboot. The reason for the reboot is unknown but it's definitely guest-initiated. Perhaps some kind of watchdog? Also make sure that if you run 3 VMs with 2 VCPUs per VM, don't run all three VMs with 100% in parallel, otherwise you intend to run 6 activities on 4 host cores and this wouldn't work without timing problems.
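The arithmetic in this comment can be sketched as follows. The helper name `vcpu_load` is illustrative, and treating every vCPU as fully busy is the worst case the comment describes, not a measurement from the ticket.

```python
def vcpu_load(vm_count, vcpus_per_vm, host_cores):
    """Return (total vCPUs, vCPUs per host core) if every vCPU were 100% busy."""
    total_vcpus = vm_count * vcpus_per_vm
    return total_vcpus, total_vcpus / host_cores


# Figures from the comment: 3 VMs with 2 vCPUs each on 4 host cores.
total, ratio = vcpu_load(vm_count=3, vcpus_per_vm=2, host_cores=4)
print(f"{total} busy vCPUs on 4 cores -> {ratio:.1f} vCPUs per core")
```

A ratio above 1.0 only matters when the guests are busy at the same time; mostly idle guests can share cores without the timing problems mentioned above.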

comment:3 Changed 5 years ago by p166

Dear Frank!

Thanks for your reply, but I do not understand what you are saying. I'm sure it is my fault.

We have 7 VMs with 2 vCPUs each. Even at the highest daily usage, the cumulative processor usage does not reach the load of 3 physical processors. Each machine uses approximately 5-10 percent CPU whenever I look at it. If I have 4 cores with Hyper-threading, how many vCPUs can I allocate to these VMs?

We do not have anything in the guest logs. None of them has any information relating to the issue. We just see normal operating log lines and then their "starting state" messages. And it starts again.

We have two other machines with ABSOLUTELY the same hardware layout but a different Linux kernel on the host (others: 3.12.0, 3.12.6 with VBox 4.3.4; this machine: 3.13.3 with 4.3.10). They run about the same number of VMs with the same "hardware" layout (controllers, number of CPUs, etc.). They work absolutely fine.

  • Is it possible that this problem is caused by the "Use host IO cache" option being enabled? These machines read a lot from disk but write almost no data.
  • Is it possible that the physical processor is faulty and needs to be replaced?
  • These machines generate a high volume of network traffic. Is it possible that high network traffic causes an ACPI reset on the guest without any messages in the log? Just a reset?

We are absolutely helpless now. We have spent five months searching for the problem and found nothing in the end. Thank you for your kindness and helpful answers.

comment:4 Changed 5 years ago by p166

Hi!

There are no overloaded processors, and the reboots continue day after day. We have replaced ALL the hardware in the machine, from the mainboard to the processor. No success.

Is this a VirtualBox bug or problem? Can VirtualBox not cope with high network traffic? The two other machines have now passed 150 days of uptime, but they have only a moderate CPU load and a very low network load. This one also has a moderate CPU load, but a high (100-150 MBit/s) network load.

Has anyone encountered a similar problem, or does anyone run servers with high network traffic under VirtualBox?

Thanks in advance

comment:5 Changed 5 years ago by mhanor

You have to approach the problem as you would with a real machine. You have to first determine if it's a panic of the guest kernel. From what I can see, FreeBSD kernels have a reboot-on-panic feature, depending on the compile-time options/kernel parameters set by sysctl.

 http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-options.html

Version 1, edited 5 years ago by mhanor

comment:6 Changed 5 years ago by p166

Yes, there is an option. But it restarts the machine without any clue, without any panic. The only thing I can see in the machine's VBox.log file is "Reset initiated by ACPI" nothing more. There is nothing in the host's log, there is nothing in the guest's log at all.

It is also weird that the virtual machines can crash late at night, when they usually do absolutely nothing! They have all finished their backup processes, and most people are sound asleep, so nobody is visiting our site.

They can also reset in the morning or afternoon, when the network load is very high. I can trigger the reset by rsyncing or copying files from or to the virtual machine. Sometimes it takes only a minute, sometimes an hour. The nightly backup runs rsync with the --bwlimit=3000 option, because with a higher bwlimit value the machine resets...... WITHOUT ANY REASON. No error message, no warning or notice. The next messages I can see after the reset are the syslog and kernel startup messages.
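For comparison with the MBit/s figures quoted elsewhere in this ticket, the rsync throttle can be converted as in this rough sketch. It assumes rsync's --bwlimit value without a suffix is counted in units of 1024 bytes per second; the function name `bwlimit_to_mbit` is illustrative.

```python
def bwlimit_to_mbit(bwlimit, unit_bytes=1024):
    """Convert an rsync --bwlimit value to megabits per second.

    unit_bytes=1024 is an assumption about how rsync counts a bare number.
    """
    bits_per_second = bwlimit * unit_bytes * 8
    return bits_per_second / 1_000_000


rate = bwlimit_to_mbit(3000)
print(f"--bwlimit=3000 is roughly {rate:.1f} Mbit/s")
```

Under that assumption the throttled backup stays below the 40-50MBit/s peak mentioned in the description.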

How can I make FreeBSD more "verbose" on this issue? How can I get more information on what happens in FreeBSD's mind and soul?

Or do you know of anything in Linux kernel 3.13 relating to this problem that could cause such behaviour? Is it reasonable to upgrade the host Linux kernel?

comment:7 Changed 5 years ago by mhanor

Don't think about the host. You have to debug the guest OS kernel. Ignore the fact that it's running under a VM. You'll have to do some research on the guest OS, on how to do kernel debugging, enable kernel crash dumps, enable the kernel debugger, and so on. Read the Kernel Debugging chapter from the FreeBSD Developers' Handbook, starting here:

 http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html

comment:8 Changed 5 years ago by p166

Thank you for your efforts. I turned on dumpdev and dumpdir in rc.conf and restarted the machine. I am waiting for the "results" and if I have something valuable I'll be back.
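A small sketch for checking that the crash dump settings are actually present in a guest's rc.conf. The ticket does not say which values were set, so the sample text (including the hostname and the "AUTO" and "/var/crash" values) is hypothetical, as is the helper name `read_rc_conf`.

```python
import shlex


def read_rc_conf(text):
    """Parse simple name="value" lines from rc.conf-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and anything that is not an assignment
        name, _, value = line.partition("=")
        parts = shlex.split(value, comments=True)  # strips quotes and trailing comments
        settings[name.strip()] = parts[0] if parts else ""
    return settings


# Hypothetical rc.conf content; the real values are not given in the ticket.
sample_rc_conf = '''
hostname="guest1"
dumpdev="AUTO"          # device used for kernel crash dumps
dumpdir="/var/crash"    # where the saved dump is written after reboot
'''

settings = read_rc_conf(sample_rc_conf)
# In this sketch a missing dumpdev is treated the same as "NO".
dumps_enabled = settings.get("dumpdev", "NO").upper() != "NO"
print("crash dumps enabled:", dumps_enabled, "| dump directory:", settings.get("dumpdir"))
```

On a real guest, the text would come from reading /etc/rc.conf instead of the inline sample.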

comment:9 Changed 3 years ago by aeichner

  • Status changed from new to closed
  • Resolution set to obsolete

Please reopen if still relevant with a recent VirtualBox release.
