VirtualBox

Opened 5 years ago

Closed 4 years ago

#18365 closed defect (invalid)

exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen

Reported by: VirtualBarista
Owned by:
Component: virtual disk
Version: VirtualBox 6.0.2
Keywords:
Cc:
Guest type: Linux
Host type: Linux

Description

Hello,

I am using Fedora 29 as the host, running 5 VMs (a mix of Linux and FreeBSD). I never had any problems with VirtualBox 5.x; I recently upgraded to 6.0.2.

Yesterday, while working on one of the VMs, something strange happened that I've never seen before.

ALL the VM consoles suddenly printed SATA errors at the same time.

Here are the relevant parts of /var/log/messages from three of those VMs.

VM1

Jan 26 03:03:07 vm1 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 26 03:03:07 vm1 kernel: ata1.00: failed command: FLUSH CACHE
Jan 26 03:03:07 vm1 kernel: ata1.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 7#012         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 26 03:03:07 vm1 kernel: ata1.00: status: { DRDY }
Jan 26 03:03:07 vm1 kernel: ata1: hard resetting link
Jan 26 03:03:07 vm1 kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 26 03:03:07 vm1 kernel: ata1.00: configured for UDMA/133
Jan 26 03:03:07 vm1 kernel: ata1.00: retrying FLUSH 0xe7 Emask 0x4
Jan 26 03:03:08 vm1 kernel: ata1.00: device reported invalid CHS sector 0
Jan 26 03:03:08 vm1 kernel: ata1: EH complete

VM2

Jan 26 03:09:10 vm2 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 26 03:09:10 vm2 kernel: ata1.00: failed command: FLUSH CACHE
Jan 26 03:09:10 vm2 kernel: ata1.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0#012         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 26 03:09:10 vm2 kernel: ata1.00: status: { DRDY }
Jan 26 03:09:10 vm2 kernel: ata1: hard resetting link
Jan 26 03:09:11 vm2 kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 26 03:09:11 vm2 kernel: ata1.00: configured for UDMA/133
Jan 26 03:09:11 vm2 kernel: ata1.00: retrying FLUSH 0xe7 Emask 0x4
Jan 26 03:09:11 vm2 kernel: ata1.00: device reported invalid CHS sector 0
Jan 26 03:09:11 vm2 kernel: ata1: EH complete

VM3

Jan 26 03:03:08 vm3 kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 26 03:03:08 vm3 kernel: ata1.00: failed command: FLUSH CACHE
Jan 26 03:03:08 vm3 kernel: ata1.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 17#012         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 26 03:03:08 vm3 kernel: ata1.00: status: { DRDY }
Jan 26 03:03:08 vm3 kernel: ata1: hard resetting link
Jan 26 03:03:08 vm3 kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 26 03:03:08 vm3 kernel: ata1.00: configured for UDMA/133
Jan 26 03:03:08 vm3 kernel: ata1.00: retrying FLUSH 0xe7 Emask 0x4
Jan 26 03:03:08 vm3 kernel: ata1.00: device reported invalid CHS sector 0
Jan 26 03:03:08 vm3 kernel: ata1: EH complete
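If it helps for a future occurrence, these kernel messages can also be watched live on the Linux guests with standard tools (nothing VirtualBox-specific), for example:

journalctl -kf | grep -i 'ata1'     # follow the kernel log on systemd guests (e.g. Fedora)
dmesg -w | grep -i 'ata1'           # alternative on guests without systemd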

I tried to send a reboot command via ssh, and out of the five VMs only two responded; the other three wouldn't accept connections. I tried the "ACPI Shutdown" option directly, but they still wouldn't respond or reboot. Eventually I had to forcefully turn them off.
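For the record, "forcefully turn them off" means a hard power-off issued from the host, i.e. something like the following (the VM name here is just a placeholder):

VBoxManage controlvm "vm1" poweroff
VBoxManage list runningvms          # confirm nothing is left running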

On reboot, everything seems back to normal.

Here is part of the VBox log:

00:00:01.048370 VD#0: Cancelling all active requests
00:00:01.048446 VD#0: Cancelling all active requests
00:00:10.824386 VD#0: Cancelling all active requests
00:00:10.828057 VD#0: Cancelling all active requests
74:34:19.969034 VD#0: Flush request was active for 29 seconds
74:40:22.518388 VD#0: Cancelling all active requests
74:40:22.518414 VD#0: Request{0x007fcddc19d740}:
74:40:22.970057 VD#0: Flush request was active for 61 seconds
74:40:22.970084 VD#0: Aborted flush returned rc=VERR_PDM_MEDIAEX_IOREQ_CANCELED
91:03:11.731269 FPUIP=00000000 CS=0000 Rsrvd1=0000  FPUDP=00000000 DS=0000 Rsvrd2=0000
91:03:11.731485 FPUIP=00000000 CS=0000 Rsrvd1=0000  FPUDP=00000000 DS=0000 Rsvrd2=0000

All VM disk images are stored on the host on a RAID1 array (ext4).
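For completeness, the host-side array state and I/O load can be checked with standard Linux tools; these are listed only for reference and are not VirtualBox-specific:

cat /proc/mdstat                    # array state, if this is Linux software RAID (md)
iostat -x 5                         # per-device utilization and wait times (sysstat package)
dmesg | grep -iE 'md[0-9]|ext4'     # host kernel messages about the array / filesystem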

Change History (1)

comment:1 by aeichner, 4 years ago

Resolution: invalid
Status: new → closed

Please reopen if still relevant and attach the full VBox.log files from your VMs, not just some small excerpts. To me this looks like the host OS couldn't cope with the I/O load induced by the VMs, resulting in long-running I/O requests which triggered timeouts in the guests.
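(For reference: on a Linux host with the default machine folder, the full logs are typically found at the path below; VBox.log is the most recent run, and VBox.log.1 through VBox.log.3 are the previous ones.)

~/VirtualBox VMs/<VM name>/Logs/VBox.log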

