VirtualBox

Ticket #12648 (new defect)

Opened 5 years ago

Last modified 4 years ago

Using multiple virtual disks leads to timeouts and crash in client

Reported by: wonko1953 Owned by:
Priority: major Component: virtual disk
Version: VirtualBox 4.3.6 Keywords: dma timeout
Cc: Guest type: BSD
Host type: all

Description (last modified by frank)

When using multiple virtual disks, I inevitably get DMA timeouts in the client, eventually followed by the client freezing. The only escape then is to kill the client emulation process.

Environment

Host:

  • FreeBSD 9.2.0 amd64 running VirtualBox OSE 4.3.6 (from FreeBSD ports; however, the same behavior has happened for a very long time also with older VB releases) or alternatively
  • Windows 7 (but only tested a while ago with an older VB release showing the same problem)

Devices:

  • 7 raw devices are exported by the FreeBSD host, either attached directly (if VB runs on the FreeBSD host), or via iscsi (if VB runs on Windows 7)
  • 1 x 20 GB attached as IDE0:0 in the VB client
  • 6 x 896 GB attached as SATA0:1-6 in the VB client (the same happens if a SAS instead of an SATA controller is emulated)

Client:

  • FreeBSD 10.0.0 amd64, UFS on 20 GB device, zfs raidz2 on 6 x 896 GB devices. However, the same behavior has been seen with older FreeBSD releases in the client.

Symptoms

After some disk I/O, a DMA timeout is announced in the client OS. This cannot be recovered from. Using "Machine->Reset" from the Client Menu does not work any more. The client process must ultimately be killed.

The problem does not occur at all if only one emulated disk is used.

VBox.log

It seems that the following lines in the logfile hint at the problem:

00:32:14.790849 AIOMgr: I/O manager 0x0000080e8eb920 encountered a critical error (rc=VERR_FILE_AIO_NO_REQUEST) during operation. Falling back to failsafe mode. Expect reduced performance
00:32:14.790990 AIOMgr: Error happened in /usr/tmp/z/SRC/FreeBSD-ports/head/emulators/virtualbox-ose/work/VirtualBox-4.3.6/src/VBox/VMM/VMMR3/PDMAsyncCompletionFileNormal.cpp:(1664){int pdmacFileAioMgrNormal(RTTHREADINT*, void*)}
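
A minimal sketch for finding such lines in a VBox.log file. The pattern is modelled only on the AIOMgr line quoted above; the function name is hypothetical, and other VirtualBox versions may word the message differently.

```python
import re

# Pattern modelled on the AIOMgr line quoted in this ticket (assumption:
# the wording is stable across the log files being scanned).
CRITICAL_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) AIOMgr: I/O manager (?P<mgr>0x[0-9a-fA-F]+) "
    r"encountered a critical error \(rc=(?P<rc>[A-Z_]+)\)"
)


def find_critical_errors(lines):
    """Return (timestamp, manager address, status code) for each matching line."""
    hits = []
    for line in lines:
        match = CRITICAL_RE.match(line.strip())
        if match:
            hits.append((match.group("ts"), match.group("mgr"), match.group("rc")))
    return hits


# Sample input: the first line is copied from the description above,
# the second is an unrelated line that must not match.
SAMPLE = [
    "00:32:14.790849 AIOMgr: I/O manager 0x0000080e8eb920 encountered a critical "
    "error (rc=VERR_FILE_AIO_NO_REQUEST) during operation. Falling back to "
    "failsafe mode. Expect reduced performance",
    "00:03:21.279572 AIOMgr: Please contact the product vendor",
]

if __name__ == "__main__":
    for hit in find_critical_errors(SAMPLE):
        print(hit)
```

To scan a real log, pass an open file object instead of SAMPLE, since the function only iterates over lines.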

Attachments

VBox.log (59.0 KB) - added by wonko1953 5 years ago.
VBox.log showing defect symptoms
VBox.log.1 (99.1 KB) - added by wonko1953 5 years ago.
VBox.log *not* showing defect symptoms
VBox.2.log (114.2 KB) - added by wonko1953 5 years ago.
VBox.log from running the same client setup under a Windows 7 host, attaching disks via iSCSI
VBox.log.AIO+VMDK-no_AIO+VMDK (12.3 KB) - added by wonko1953 4 years ago.
diff of VBox.log: VMDK + AIO vs. VMDK, no AIO
VBox.log.AIO+VMDK-iSCSI (48.9 KB) - added by wonko1953 4 years ago.
diff of VBox.log: VMDK + AIO vs. iSCSI

Change History

comment:1 Changed 5 years ago by frank

Please attach the complete VBox.log file of such a VM session.

comment:2 Changed 5 years ago by frank

  • Description modified

Changed 5 years ago by wonko1953

VBox.log showing defect symptoms

Changed 5 years ago by wonko1953

VBox.log *not* showing defect symptoms

comment:3 Changed 5 years ago by wonko1953

I started a test run to generate a new VBox.log (I could have used yesterday's but wanted to make sure to be able to reproduce the error) and... no problems even after copying 80 GB of data to the zpool in the VB instance. This produced *VBox.log.1*.

Looking at that file, I noticed these lines in it:

01:37:23.559389 AioMgr0-N: Request 0x0000080be567e0 failed with rc=VERR_TRY_AGAIN, migrating endpoint /dev/ada4p4 to failsafe manager.
01:37:23.625716 AioMgr0-N: Request 0x0000080be5a060 failed with rc=VERR_TRY_AGAIN, migrating endpoint /srcs/test/.VirtualBox/HardDisks/disk35p4.vmdk to failsafe manager.
01:38:26.235968 AioMgr0-N: Request 0x0000080efbd860 failed with rc=VERR_TRY_AGAIN, migrating endpoint /dev/ada2p4 to failsafe manager.
01:38:26.236371 AioMgr0-N: Request 0x0000080ef8e5a0 failed with rc=VERR_TRY_AGAIN, migrating endpoint /dev/ada1p4 to failsafe manager.
01:38:26.463286 AioMgr0-N: Request 0x0000080ec75ca0 failed with rc=VERR_TRY_AGAIN, migrating endpoint /dev/ada5p4 to failsafe manager.
01:38:26.463881 AioMgr0-N: Request 0x0000080ec5e4a0 failed with rc=VERR_TRY_AGAIN, migrating endpoint /dev/ada3p4 to failsafe manager.
01:38:26.661930 AioMgr0-N: Request 0x00000811c635a0 failed with rc=VERR_TRY_AGAIN, migrating endpoint /dev/ada0p4 to failsafe manager.

which are not present in yesterday's log files with the erroneous runs.

So, restart, producing *VBox.log*.

This led to the symptoms described in the original issue nearly immediately - the system hangs with SATA timeouts, and the VB process ultimately has to be killed.

comment:4 Changed 5 years ago by wonko1953

Some notes on the disk setup:

host disk partition | VMDK mapping file | client attachment  | client disk | size   | usage in client
/dev/ada4s12        | disk35s12.vmdk    | IDE primary master | ada0        | 20 GB  | / and /usr UFS filesystems
/dev/ada0p4         | disk31p4.vmdk     | SATA port 0        | ada1        | 896 GB | zfs raidz2
/dev/ada1p4         | disk32p4.vmdk     | SATA port 1        | ada2        | 896 GB | zfs raidz2
/dev/ada2p4         | disk33p4.vmdk     | SATA port 2        | ada3        | 896 GB | zfs raidz2
/dev/ada3p4         | disk34p4.vmdk     | SATA port 3        | ada4        | 896 GB | zfs raidz2
/dev/ada4p4         | disk35p4.vmdk     | SATA port 4        | ada5        | 896 GB | zfs raidz2
/dev/ada5p4         | disk36p4.vmdk     | SATA port 5        | ada6        | 896 GB | zfs raidz2

Therefore, it is interesting to note that the failure messages in VBox.log.1 (remember, this is from the run where there were no problems) reference one host disk twice (ada4p4 == disk35p4.vmdk, FreeBSD in the VB client sees that as ada5). Conversely, the host's ada4s12 (which is seen by the client as ada0 on IDE 0:0) does not exhibit the failure message.
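
The cross-referencing described here can be sketched as follows. This is only an illustration: the mapping is copied from the disk setup listed in this comment, the three sample lines are copied from VBox.log.1 as quoted in comment 3, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# VMDK mapping file -> host partition, copied from the disk setup in this comment.
VMDK_TO_PARTITION = {
    "disk35s12.vmdk": "/dev/ada4s12",
    "disk31p4.vmdk": "/dev/ada0p4",
    "disk32p4.vmdk": "/dev/ada1p4",
    "disk33p4.vmdk": "/dev/ada2p4",
    "disk34p4.vmdk": "/dev/ada3p4",
    "disk35p4.vmdk": "/dev/ada4p4",
    "disk36p4.vmdk": "/dev/ada5p4",
}

# Pattern modelled on the "migrating endpoint" lines quoted in comment 3.
MIGRATE_RE = re.compile(
    r"failed with rc=(?P<rc>[A-Z_]+), migrating endpoint (?P<ep>\S+) to failsafe manager"
)


def migrations_by_partition(lines):
    """Group migrated endpoints by the host partition they resolve to."""
    grouped = defaultdict(list)
    for line in lines:
        match = MIGRATE_RE.search(line)
        if not match:
            continue
        endpoint = match.group("ep")
        file_name = endpoint.rsplit("/", 1)[-1]
        # A raw device path is not in the mapping and is used as-is.
        partition = VMDK_TO_PARTITION.get(file_name, endpoint)
        grouped[partition].append(endpoint)
    return dict(grouped)


def partitions_hit_more_than_once(lines):
    """Return host partitions referenced by more than one migrated endpoint."""
    grouped = migrations_by_partition(lines)
    return [part for part, endpoints in grouped.items() if len(endpoints) > 1]


SAMPLE = [
    "01:37:23.559389 AioMgr0-N: Request 0x0000080be567e0 failed with rc=VERR_TRY_AGAIN, "
    "migrating endpoint /dev/ada4p4 to failsafe manager.",
    "01:37:23.625716 AioMgr0-N: Request 0x0000080be5a060 failed with rc=VERR_TRY_AGAIN, "
    "migrating endpoint /srcs/test/.VirtualBox/HardDisks/disk35p4.vmdk to failsafe manager.",
    "01:38:26.235968 AioMgr0-N: Request 0x0000080efbd860 failed with rc=VERR_TRY_AGAIN, "
    "migrating endpoint /dev/ada2p4 to failsafe manager.",
]

if __name__ == "__main__":
    print(partitions_hit_more_than_once(SAMPLE))
```

On the three sample lines, only /dev/ada4p4 is reported, once as the raw device and once through its mapping file.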

Last edited 5 years ago by wonko1953

comment:5 Changed 5 years ago by wonko1953

But, going back to the failed run, one again sees the following lines in VBox.log:

00:03:20.492586 AioMgr0-N: Request 0x00000811c652e0 failed with rc=VERR_TRY_AGAIN, migrating endpoint /dev/ada0p4 to failsafe manager.
00:03:20.493040 AioMgr0-N: Request 0x00000811c656a0 failed with rc=VERR_TRY_AGAIN, migrating endpoint /srcs/test/.VirtualBox/HardDisks/disk31p4.vmdk to failsafe manager.
00:03:21.279444 AIOMgr: I/O manager 0x0000081189f920 encountered a critical error (rc=VERR_FILE_AIO_NO_REQUEST) during operation. Falling back to failsafe mode. Expect reduced performance
00:03:21.279496 AIOMgr: Error happened in /usr/tmp/z/SRC/FreeBSD-ports/head/emulators/virtualbox-ose/work/VirtualBox-4.3.6/src/VBox/VMM/VMMR3/PDMAsyncCompletionFileNormal.cpp:(1664){int pdmacFileAioMgrNormal(RTTHREADINT*, void*)}
00:03:21.279572 AIOMgr: Please contact the product vendor

Hopefully this gives an idea about what is wrong.

comment:6 Changed 5 years ago by Hachiman

Does disabling the disk cache work around the issue for you?

comment:7 Changed 5 years ago by wonko1953

The error report is for a disabled disk cache both on the emulated IDE as well as the SATA.

As a side note, the issue also occurs with disk cache enabled for IDE but disabled for SATA.

comment:8 Changed 5 years ago by wonko1953

I just ran the same client under a Windows 7 host, with only the attachment changed to use iSCSI to access the disk partitions exported by the FreeBSD 9 host (i.e., there is an iSCSI target daemon running on the FreeBSD 9 host, exporting these disk partitions to the Windows 7 host where the VB client now runs).

I did this twice, and without problems.

So I may have to take back my initial statement that this also happens under Windows 7 as host, although further testing might be in order (I am positive that the problem happened several months ago under this setup).

I am attaching one of the two good VBox.log files for comparison.

Changed 5 years ago by wonko1953

VBox.log from running the same client setup under a Windows 7 host, attaching disks via iSCSI

comment:9 Changed 4 years ago by petr.io

I am seeing this as well:

I am running an Ubuntu 14.04 guest on a FreeBSD host, and there is hardly any I/O at the time of the error. The guest is as simple as it could be: a single SATA drive. I have now enabled host caching to see whether it stops the trouble. Another VM on the same host is set up with a SCSI drive, and that one seems to run well for a while.

22:13:17.752113 AIOMgr: I/O manager 0x00000808e12420 encountered a critical error (rc=VERR_FILE_AIO_NO_REQUEST) during operation. Falling back to failsafe mode. Expect reduced performance
22:13:17.767509 AIOMgr: Error happened in /wrkdirs/usr/ports/emulators/virtualbox-ose/work/VirtualBox-4.3.12/src/VBox/VMM/VMMR3/PDMAsyncCompletionFileNormal.cpp:(1664){int pdmacFileAioMgrNormal(RTTHREADINT*, void*)}
22:13:17.767523 AIOMgr: Please contact the product vendor
22:13:48.120506 AHCI#0: Port 0 reset
22:13:48.120560 AHCI#0P0: Cancelled task 0
22:13:48.120586 AHCI#0P0: Cancelled task 1
22:13:48.120597 AHCI#0P0: Cancelled task 2
22:13:48.120605 AHCI#0P0: Cancelled task 3
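
A small sketch for measuring the delay between two log entries, here the AIOMgr critical error and the AHCI port reset in the excerpt above. The helper names are hypothetical. For these two timestamps the gap is about 30 seconds; whether that corresponds to a guest-side command timeout is an assumption that this ticket does not confirm.

```python
def to_microseconds(stamp):
    """Convert an 'HH:MM:SS.ffffff' log timestamp to integer microseconds."""
    clock, fraction = stamp.split(".")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    # Pad or truncate the fractional part to exactly six digits.
    micros = int(fraction.ljust(6, "0")[:6])
    return ((hours * 60 + minutes) * 60 + seconds) * 1_000_000 + micros


def gap_seconds(earlier, later):
    """Return the time between two log timestamps in seconds."""
    return (to_microseconds(later) - to_microseconds(earlier)) / 1_000_000


# Timestamps copied from the log excerpt in this comment.
AIO_ERROR_AT = "22:13:17.752113"
PORT_RESET_AT = "22:13:48.120506"

if __name__ == "__main__":
    print(gap_seconds(AIO_ERROR_AT, PORT_RESET_AT))
```

Integer microseconds are used for the subtraction so that the result does not depend on floating-point rounding of the individual timestamps.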

comment:10 Changed 4 years ago by wonko1953

FreeBSD bug report https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=168298 might also be caused by this.

comment:11 Changed 4 years ago by mhanor

This bug is related to #13105 (at least the crash part)

comment:12 Changed 4 years ago by wonko1953

Referring to comment 4 above, if I attach the six disks disk31p4..disk36p4 using iSCSI (instead of directly via .vmdk files), there are no problems (and aio is not used except for disk35s12 as can be seen from the log file). This corroborates what I have written in comment 8 above.

The environment is now

  • host: FreeBSD 10 release amd64 (includes a kernel iSCSI implementation)
  • virtualbox: 4.3.20 from FreeBSD ports
  • client: FreeBSD 10 release amd64

Note: I merged

svn diff -r30237:30238 http://www.virtualbox.org/svn/vbox

to http://www.virtualbox.org/svn/vbox/trunk/src/VBox/Runtime/r3/freebsd/fileaio-freebsd.cpp (which does not yet have this change), but unfortunately this did not help. I am moderately sure that the problem lies in this file, but I have yet to find out where exactly.

Regarding comment 11: That may be related to part of FreeBSD's bug report 168298, but not to this one.

Summarizing, for me AIO on a FreeBSD host works only if just a single disk is attached using AIO.

comment:13 Changed 4 years ago by wonko1953

Another datum: Using vmdks, but not loading FreeBSD's aio kernel module, works.

Again summarizing:

  • vmdks + aio: hangs
  • vmdks, no aio: works
  • iSCSI: works

I'll attach the following diffs of VBox.log:

  • vmdk + aio <-> vmdk, no aio
  • vmdk + aio <-> iSCSI

Changed 4 years ago by wonko1953

diff of VBox.log: VMDK + AIO vs. VMDK, no AIO

Changed 4 years ago by wonko1953

diff of VBox.log: VMDK + AIO vs. iSCSI
