Opened 10 years ago
Closed 7 years ago
#13956 closed defect (fixed)
VM hangs in AHCI when host busy since 4.3.x
Reported by: | Bernd Hohman | Owned by: | |
---|---|---|---|
Component: | virtual disk | Version: | VirtualBox 4.3.24 |
Keywords: | AHCI, Host I/O Cache | Cc: | |
Guest type: | Linux | Host type: | Linux |
Description
When the host disk (RAID-5 with slow HDs) is moderately busy, several Debian Guests with AHCI and Host I/O Cache enabled stop working.
Today it happened between 11:05 and 11:15am GMT to 6 Guests in parallel. Other Guests on the same Host with AHCI and Host I/O Cache disabled complained about AsyncCompletion but continued running.
No information in the Guest logfiles (syslog, kern.log), only VBox.log. No information in the Host logfiles (syslog, kern.log).
More information on the non-working VMs: 'VBoxManage list runningvms' still shows the VM as running. 'VBoxManage controlvm acpipowerbutton' has no effect. 'VBoxManage controlvm poweroff' is logged in VBox.log as having stopped the VM, but the command hangs displaying "0%...10%...20%...30%...40%..." and must be interrupted with Ctrl+C; the VBoxHeadless process then has to be killed with "pkill".
All Guests were restarted around 02:30 GMT tonight and crashed around 11:10 GMT. Hanging VMs were restarted around 13:30 GMT.
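The hanging poweroff described above can be guarded with a timeout, so a script notices the hang instead of blocking forever. This is only a minimal sketch: the function name `poweroff_with_timeout`, the 60-second default, and the injectable `run` parameter are assumptions for illustration. Only the `VBoxManage controlvm <vm> poweroff` command itself comes from this report.

```python
import subprocess


def poweroff_with_timeout(vm_name, timeout_s=60, run=subprocess.run):
    """Try to power off a VM and report 'stopped', 'failed', or 'hung'.

    'hung' means VBoxManage did not return within timeout_s seconds,
    which matches the stuck progress display described in this ticket.
    """
    cmd = ["VBoxManage", "controlvm", vm_name, "poweroff"]
    try:
        # subprocess.run kills the VBoxManage child if the timeout expires;
        # it does not touch the VBoxHeadless process of the VM itself.
        result = run(cmd, timeout=timeout_s, capture_output=True, text=True)
    except subprocess.TimeoutExpired:
        return "hung"
    return "stopped" if result.returncode == 0 else "failed"
```

A caller that gets "hung" back would still have to kill the VBoxHeadless process by hand, as was done here with pkill.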
Description of attached files (renamed VBox.log files):
1) bad_blog.log: Guest with AHCI + Host I/O Cache problem. Hang after 08:33.32 runtime, then you can see the poweroff request.
2) bad_dns1.log: Guest with AHCI + Host I/O Cache problem. Hang after 08:33.59 runtime. Guest process was stopped with 'pkill', that's why it doesn't show the last log line with "AHCI#0P0: Canceled write at offset" as in bad_blog.log.
3) good_oso.log: Guest with AHCI + no Host I/O Cache. Shows some AsyncCompletion problems but continues working.
4) good_test.log: Same as (3).
5) Host version info
We disabled Host I/O Cache on all Guests in the past, which simply turned the hangs into AsyncCompletion problems and crashes (and caused very bad memory usage too).
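For anyone who wants to repeat the Host I/O Cache comparison, the setting can be switched per storage controller with `VBoxManage storagectl ... --hostiocache on|off`. The helper below only builds the command line. The function name and the controller name "SATA" in the usage note are assumptions, so check the real controller name with `VBoxManage showvminfo` first. As far as I know the VM usually has to be powered off for the change to be accepted.

```python
def hostiocache_cmd(vm_name, controller, enabled):
    """Build the VBoxManage command that sets Host I/O Cache on one controller."""
    return [
        "VBoxManage", "storagectl", vm_name,
        "--name", controller,               # controller name as shown by showvminfo
        "--hostiocache", "on" if enabled else "off",
    ]
```

For example, `hostiocache_cmd("blog", "SATA", False)` gives the argument list for disabling the cache on a controller named "SATA" of a VM named "blog"; it can be passed to `subprocess.run`.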
Attachments (5)
Change History (11)
by , 10 years ago
Attachment: | bad_blog.log added |
---|
comment:1 by , 10 years ago
This happens on all 3 of our Hosts after we upgraded from 4.2.x to 4.3.x, so it seems to be a regression. I'm switching to the IDE controller wherever it's possible and will report the result.
comment:3 by , 10 years ago
comment:4 by , 10 years ago
The cause is the same, regardless of whether it's a bad sector or a timeout due to overload.
comment:5 by , 10 years ago
Well, at least changing to IDE doesn't help:
18:30:26.476015 PIIX3 ATA: Ctl#0: RESET, DevSel=0 AIOIf=0 CmdIf0=0xca (87557414 usec ago) CmdIf1=0xa0 (-1 usec ago)
18:30:30.050297 PIIX3 ATA LUN#0: Async I/O thread probably stuck in operation, interrupting
18:30:31.588167 PIIX3 ATA LUN#0: Async I/O thread probably stuck in operation, interrupting
18:30:31.598047 PIIX3 ATA LUN#0: Async I/O thread probably stuck in operation, interrupting
18:30:54.313827 PIIX3 ATA: execution time for ATA command 0xca was 115 seconds
18:30:54.313846 PIIX3 ATA: Ctl#0: finished processing RESET
(Guest disk went to read-only after this I guess)
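To find such stalls in other VBox.log files, the "execution time for ATA command" lines can be filtered out with a short script. This is a rough sketch that assumes the line format is exactly as in the excerpt above; the function name `slow_ata_commands` and the 30-second threshold are made up for illustration.

```python
import re

# Matches lines such as:
# 18:30:54.313827 PIIX3 ATA: execution time for ATA command 0xca was 115 seconds
SLOW_CMD = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) PIIX3 ATA: execution time for ATA command "
    r"(?P<cmd>0x[0-9a-fA-F]+) was (?P<secs>\d+) seconds"
)


def slow_ata_commands(lines, threshold_s=30):
    """Return (timestamp, command code, seconds) for every slow ATA command."""
    hits = []
    for line in lines:
        m = SLOW_CMD.match(line.strip())
        if m and int(m.group("secs")) >= threshold_s:
            hits.append((m.group("ts"), int(m.group("cmd"), 16), int(m.group("secs"))))
    return hits
```

It can be run over a whole log with `slow_ata_commands(open("VBox.log"))`; the threshold should be adjusted to whatever the guest's disk timeout is.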
comment:6 by , 7 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Please reopen if still relevant with VBox 5.1.22.