VirtualBox

Opened 8 years ago

Last modified 4 years ago

#15374 new defect

Virtual HDD becomes unavailable for guest : with AHCI#0: Port x reset

Reported by: sylvain Gplservice
Owned by:
Component: other
Version: VirtualBox 5.0.14
Keywords: AHCI port reset hdd unavailable
Cc:
Guest type: Linux
Host type: Linux

Description

The symptoms look like #9975, but with a Linux host and a Linux guest and no Windows involved. The error logs are also much less verbose.

The guest runs fine for a seemingly random duration, ranging from a few weeks to a year. Then, without any clear cause, one virtual disk becomes unavailable, producing sda I/O errors in the guest and only a single line in the VBox.log file:

1900:51:41.454153 AHCI#0: Port 2 reset

A kill $pid followed by VBoxManage startvm restores everything to normal until the next time.
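The restart described above can be sketched as a small shell function. This is only an illustration: it swaps the plain kill for VBoxManage controlvm ... poweroff (which may itself hang if the VM process is badly wedged), and the function name, the 5-second pause and the VM name are assumptions, not something taken from this report.

```shell
#!/bin/sh
# Sketch only: force-stop a VM whose disk has gone away, then start it headless.
# restart_vm is a hypothetical helper name; pass the VM name as the first argument.
restart_vm() {
    vm="$1"
    # Hard power-off, roughly equivalent to pulling the plug on the guest.
    VBoxManage controlvm "$vm" poweroff || return 1
    # Give the VM process a moment to exit before starting it again.
    sleep 5
    VBoxManage startvm "$vm" --type headless
}
```

Usage would be along the lines of restart_vm "my-vm". If poweroff does not return, falling back to kill $pid as described above remains the known-good path.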

I have seen this behavior on VirtualBox versions from 4.3.6r91406 up to 5.0.14. When the crash occurs, the system load on the host isn't particularly high, and I/O isn't particularly high on either the host or the guest. I've seen it on rotational SATA disks in RAID 1 and RAID 5, and on SATA SSDs in RAID 1.

Guest Linux kernels range from 2.6.18 up to 3.2.x. The hosts run Debian 6.0, 7.0 or 8.0. All VDI files are attached as SATA drives.

I keep a log of crashed VMs to try to find common criteria and narrow it down, but nothing really stands out besides:

  • low-load/low-I/O VMs don't seem to crash
  • On some host servers with different hardware I have never had a crash

I know my report is quite poor in details, but I'll try to keep it updated. In the meantime, is there a way to make those "virtual disk resets" more verbose?

Attachments (4)

debian-start (1.3 KB ) - added by sylvain Gplservice 8 years ago.
VBox.log (75.2 KB ) - added by sylvain Gplservice 8 years ago.
vbox log file with AHCI port reset
info-ahci (1.1 KB ) - added by sylvain Gplservice 8 years ago.
output of VBoxManage debugvm <VM name> info ahci0
ahci-virtualbox-disconnect.txt (1.6 KB ) - added by sylvain Gplservice 8 years ago.


Change History (12)

by sylvain Gplservice, 8 years ago

Attachment: debian-start added

by sylvain Gplservice, 8 years ago

Attachment: VBox.log added

vbox log file with AHCI port reset

comment:1 by sylvain Gplservice, 8 years ago

Please ignore or remove the "debian-start" attachment; it was added by mistake and I can't seem to revert it myself.

comment:2 by Frank Mehnert, 8 years ago

Could you provide a core dump of the VM process when this happens?

comment:3 by aeichner, 8 years ago

Doing a "VBoxManage debugvm <VM name> info ahci0" might be helpful too as it gives a quick overview about the state of the AHCI controller emulation.
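Since the failure is rare, it may help to save that output to a file the moment the disk disappears, before the VM is restarted. A minimal sketch, assuming a hypothetical helper name and a timestamped default file name (neither comes from this ticket):

```shell
#!/bin/sh
# Sketch only: save the AHCI controller state of a running VM to a file.
# snapshot_ahci_state is a hypothetical helper; $1 is the VM name,
# $2 an optional output path (defaults to a timestamped file in the current directory).
snapshot_ahci_state() {
    vm="$1"
    out="${2:-ahci-info-$(date +%Y%m%d-%H%M%S).txt}"
    # "info ahci0" is the debugvm item named in the comment above.
    VBoxManage debugvm "$vm" info ahci0 > "$out" && echo "$out"
}
```

The function prints the path of the file it wrote, so the result can be attached to the ticket directly.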

comment:4 by sylvain Gplservice, 8 years ago

I have restarted my VMs after issuing :

  $ echo -n 1 > /proc/sys/fs/suid_dumpable 

I'll wait (it might take a few months!) until it happens again, and then I'll provide a core dump and the output of "VBoxManage debugvm <VM name> info ahci0".

by sylvain Gplservice, 8 years ago

Attachment: info-ahci added

output of VBoxManage debugvm <VM name> info ahci0

comment:5 by sylvain Gplservice, 8 years ago

The crash happened on one of the VMs, and here is the output of VBoxManage debugvm <VM name> info ahci0 (see the info-ahci attachment).

However, no "core" file was dumped anywhere. The procedure at https://www.virtualbox.org/wiki/Core_dump explains how to create one when the VM is run with the command $ "virtualbox -startvm <vm name>", and I assumed it would work the same way with $ VBoxManage startvm <vm name> --type headless

but that doesn't seem to be the case. (The core file seems to be created only when running virtualbox.)

I'm using VirtualBox in a headless environment, where it is impractical to have the VM output displayed. Is there anything else I can do to get a core dump with $ VBoxManage startvm <vm name> --type headless ?
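One possible approach, not confirmed anywhere in this ticket, is to dump the still-running VM process with gcore (shipped with gdb) at the moment the disk becomes unavailable, instead of waiting for the process to crash. The sketch below assumes the headless VM shows up in the process list as "VBoxHeadless --comment <VM name> ...", that the VM name contains no regular-expression metacharacters, and that the caller has sufficient privileges (root may be required); the helper name and the default output prefix are hypothetical.

```shell
#!/bin/sh
# Sketch only: write a core file of a running headless VM without killing it.
# dump_headless_vm is a hypothetical helper; $1 is the VM name,
# $2 an optional output prefix (gcore appends ".<pid>" to it).
dump_headless_vm() {
    vm="$1"
    prefix="${2:-/tmp/vbox-core}"
    # Assumption: the VM process command line contains "VBoxHeadless --comment <VM name> ".
    pid=$(pgrep -f "VBoxHeadless --comment $vm " | head -n 1)
    if [ -z "$pid" ]; then
        echo "no VBoxHeadless process found for $vm" >&2
        return 1
    fi
    # gcore attaches to the process, writes <prefix>.<pid>, then detaches.
    gcore -o "$prefix" "$pid"
}
```

The guest is paused while gcore runs, and the resulting file is roughly the size of the VM process memory, so the output prefix should point at a filesystem with enough free space.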

by sylvain Gplservice, 8 years ago

comment:6 by sylvain Gplservice, 8 years ago

A new output of VBoxManage debugvm <VM name> info ahci0

VirtualBox was upgraded to 5.0.20r106931 and the bug still occurs (rarely).

Any news on how to get a core dump on a headless setup?

comment:7 by sylvain Gplservice, 4 years ago

For the record, the problem still persists in VirtualBox 5.1 (at least); I haven't tested with 6.0 or 6.1 yet. I also have (fewer) Windows guests on Linux hosts, and using SATA disk drives triggers the problem there as well, which makes me think the problem is not guest related.

On a Windows guest, using an old VirtualBox:

  VirtualBox VM 4.2.10 r84104 linux.amd64 (Mar 5 2013 13:37:15) release log

the log shows:

  783:09:48.334496 AHCI#0P1: Cancelled task 6
  783:11:20.549345 AHCI#0: Port 1 reset
  783:11:20.564215 AHCI#1: Canceled write at offset 3093704704 (4096 bytes left) returned rc=VINF_SUCCESS
  783:17:56.028435 AHCI#0: Port 1 reset
  783:18:56.528014 AHCI#0: Port 1 reset
  783:24:51.371946 AHCI#0: Port 1 reset
  783:28:51.559754 AHCI#0: Port 1 reset
  783:29:52.059510 AHCI#0: Port 1 reset
  783:31:00.809397 AHCI#0: Port 1 reset
  783:38:04.715720 AHCI#0: Port 1 reset
  783:39:25.590470 AHCI#0: Port 1 reset
  783:40:26.090818 AHCI#0: Port 1 reset
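To compare affected VMs, the reset lines in a log like this can be tallied per controller and port. The following is a minimal sketch that assumes the "<timestamp> AHCI#<n>: Port <m> reset" line format seen in this ticket; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: count "Port N reset" lines per AHCI controller and port in a VBox.log.
# count_ahci_resets is a hypothetical helper; $1 is the path to the log file.
count_ahci_resets() {
    awk '/AHCI#[0-9]+: Port [0-9]+ reset/ {
             # Fields: $1 timestamp, $2 "AHCI#0:", $3 "Port", $4 port number, $5 "reset"
             ctl = $2
             sub(/:$/, "", ctl)          # drop the trailing colon from "AHCI#0:"
             count[ctl " port " $4]++
         }
         END { for (k in count) print k ": " count[k] }' "$1" | sort
}
```

Running count_ahci_resets VBox.log prints one line per controller/port pair that logged at least one reset, which makes it easier to see whether the same port is always the one affected.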

But the good news is that I found a workaround: assuming the bug is indeed in the SATA/AHCI code, I switched to a SAS disk controller, and now the problem is gone.

Fortunately, Linux has support for such controllers and switching works without modification. Unfortunately, that isn't the case for my Windows guests.
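For reference, the controller switch can be done from the command line while the VM is powered off. This is a sketch under assumptions: the existing controller is assumed to be named "SATA" with the disk on port 0, the new controller name "SAS" is arbitrary, and the helper name is hypothetical; check the real names with VBoxManage showvminfo first.

```shell
#!/bin/sh
# Sketch only: move one disk image from a SATA controller to a new LSI Logic SAS controller.
# move_disk_to_sas is a hypothetical helper. The VM must be powered off.
# $1 VM name, $2 disk image path, $3 existing SATA controller name, $4 SATA port of the disk.
move_disk_to_sas() {
    vm="$1"; disk="$2"; sata_ctl="${3:-SATA}"; port="${4:-0}"
    # Add a SAS controller named "SAS" to the VM.
    VBoxManage storagectl "$vm" --name "SAS" --add sas --controller LSILogicSAS || return 1
    # Detach the disk from its current SATA port.
    VBoxManage storageattach "$vm" --storagectl "$sata_ctl" --port "$port" --device 0 --medium none || return 1
    # Attach the same image to the first SAS port.
    VBoxManage storageattach "$vm" --storagectl "SAS" --port 0 --device 0 --type hdd --medium "$disk"
}
```

A guest that boots from this disk needs a driver for the SAS controller before the switch, which is the obstacle mentioned above for Windows guests.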

comment:8 by Feline, 4 years ago

I have also had a (very) similar problem with my Windows 7 guest(s). For me, it only occurred on my ThinkCentre machines. I have other machines with the same host/guest combination but no errors. In my case, whilst the external symptom was the same, the behaviour within the Windows 7 guest depended on which SATA driver I had installed in the guest. With the default Microsoft driver, the guest would hang completely. Using the latest Intel drivers (11.2.0.1006) for that virtual SATA controller type, the drives would go offline within the guest but there was no hang. I too changed to using the LSI SAS 1068 controller in the guest and I never had any further problems. I have been running with that configuration since March 2018.

The LSI SAS 1068 drivers are still available for Windows 7 from the Broadcom website. Use the following URL to search for the Legacy products section. The drivers are actually listed under the LSI SAS 3800X section. For Windows 7 the latest driver is version 1.34.3.0

https://www.broadcom.com/support/download-search?pg=Legacy+Products&pf=Legacy+Products&pn=LSI+SAS+3800X&pa=&po=&dk=&pl=


© 2023 Oracle