VirtualBox

Opened 7 years ago

Last modified 3 months ago

#16450 new defect

VM locks up when setting --discard and --nonrotational and using fstrim

Reported by: Motlib
Owned by:
Component: virtual disk
Version: VirtualBox 5.1.14
Keywords: SSD fstrim discard
Cc:
Guest type: Linux
Host type: Linux

Description

I changed the settings of my Xubuntu VM to emulate an SSD:

VBoxManage storageattach Xubuntu --storagectl SATA --port 0 --discard on --nonrotational on

After that I started the VM. Everything runs fine until I run fstrim in the VM. First the VDI file shrinks a bit, then VirtualBox starts continuously reading from the host filesystem (I guess from the VDI file). After some time the guest OS reports storage problems (timeouts) in the dmesg output. No access to the emulated storage is possible after that, but the VM continues to run (e.g. scrolling in windows, moving the mouse, ...). See the attached VM screenshots and VBox.log for details.

I already observed the same behavior with a Windows 10 VM, which locked up in the same way, but at that time I did not record any logs or error information.

The problem seems to be similar to ticket #15509.

Host info: Kubuntu 16.10 64 Bit, VirtualBox 5.1.6 (installed from Kubuntu repository)

Guest info: Xubuntu 16.10 64 Bit, 1 SATA controller, 1 VDA file with dynamic size

Please let me know if any more information is needed.

Attachments (5)

VirtualBox_Xubuntu_04_02_2017_10_01_14.png (228.3 KB ) - added by Motlib 7 years ago.
dmesg output in VM, part 1
VirtualBox_Xubuntu_04_02_2017_10_02_11.png (205.2 KB ) - added by Motlib 7 years ago.
dmesg output in VM, part 2
VirtualBox_Xubuntu_04_02_2017_10_02_27.png (250.7 KB ) - added by Motlib 7 years ago.
dmesg output in VM, part 3
VBox.log (80.6 KB ) - added by Motlib 7 years ago.
VBox.log
VBox.2.log (81.7 KB ) - added by Mihai Hanor 6 years ago.
Windows 10 64-bit 1709 as host


Change History (10)

by Motlib, 7 years ago

dmesg output in VM, part 1

by Motlib, 7 years ago

dmesg output in VM, part 2

by Motlib, 7 years ago

dmesg output in VM, part 3

by Motlib, 7 years ago

Attachment: VBox.log added

VBox.log

comment:1 by Motlib, 7 years ago

There is a typo in the guest info above. Of course it's a VDI file.

I have now also seen the same behavior with Windows 7 64 Bit as host OS, Kubuntu 16.10 64 Bit as guest OS and VirtualBox version 5.1.14r112924. VirtualBox again starts continuously reading the VDI file and the guest OS becomes unresponsive.

comment:2 by rui.godinho.lopes, 6 years ago

I'm also seeing this lockup when using a Windows 2016 guest (https://github.com/rgl/windows-2016-vagrant) running on an Ubuntu 18.04 host and on a Windows 10 host. Not using the --discard argument makes things work again.

comment:3 by Mihai Hanor, 6 years ago

This issue is very easy to reproduce. Just try to install Windows 10 or Debian 9 on such a VM, using a medium that's attached to a SATA controller, with the nonrotational and discard parameters. The VirtualBox VM process stops responding to controls.

Last edited 6 years ago by Mihai Hanor

by Mihai Hanor, 6 years ago

Attachment: VBox.2.log added

Windows 10 64-bit 1709 as host

comment:4 by yuhp, 4 years ago

I can avoid the hang with either of the following:

  1. Use the PIIX4 storage controller type.
  2. Enable the host I/O cache.

This is on a Windows 10 host. I think the problem comes from concurrent access to the VDI file: the IDE controller does not support multiple outstanding commands, and the Windows cache hides or fixes the concurrent access to the VDI file.
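
For reference, a minimal sketch of how two of the workarounds reported in this ticket could be applied with VBoxManage: turning discard off again (comment 2) and enabling the host I/O cache (comment 4). The helper names `discard_off_cmd` and `hostiocache_on_cmd` are hypothetical, and the VM name `Xubuntu`, controller name `SATA` and port 0 are taken from the original report, so they are assumptions for any other setup. The helpers only print the commands so they can be reviewed first. The storageattach line mirrors the form used in the description, and it is assumed that the VM is powered off when the printed commands are run.

```shell
#!/bin/sh
# Hypothetical helpers that only print VBoxManage commands for review;
# nothing is changed until a printed command is run by hand.
# Names containing spaces would need extra quoting, which is not handled.

# Comment 2: turn discard off again for one attachment.
# Arguments: VM name, controller name, port number.
discard_off_cmd() {
  printf 'VBoxManage storageattach %s --storagectl %s --port %s --discard off\n' \
    "$1" "$2" "$3"
}

# Comment 4, option 2: enable the host I/O cache on a controller.
# Arguments: VM name, controller name.
hostiocache_on_cmd() {
  printf 'VBoxManage storagectl %s --name %s --hostiocache on\n' "$1" "$2"
}
```

For the VM from the description, `discard_off_cmd Xubuntu SATA 0` and `hostiocache_on_cmd Xubuntu SATA` print the two commands. Moving the disk to a PIIX4 (IDE) controller is not sketched here because it needs the path of the medium, which the ticket does not give.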

comment:5 by soruk, 3 months ago

I've been having this issue, irrespective of host platform and I/O settings. Under Linux, lsblk gives this:

[root@NetBox ~]# lsblk -D
NAME               DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda                       0      512B       2G         0
├─sda1                    0      512B       2G         0
└─sda2                    0      512B       2G         0
  ├─netbox-root           0      512B       2G         0
  ├─netbox-swap           0      512B       2G         0
  ├─netbox-opt            0      512B       2G         0
  ├─netbox-var            0      512B       2G         0
  ├─netbox-var_log        0      512B       2G         0
  └─netbox-var_lib        0      512B       2G         0
sr0                       0        0B       0B         0
[root@NetBox ~]# _

The disc driver times out while trying to discard such large blocks, the default size when using the SATA controller.

Newer Linux kernels allow this to be changed within the guest, so I use this script to lower the DISC-MAX size to 4 MiB:

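#!/bin/bash
# Lower the maximum discard size of a disk and of the device-mapper
# volumes stacked on it. Usage (as root): <script> sda
MAXDISCARD=4194304  # 4 MiB
cd "/sys/block/${1:?usage: $0 <disk>}" || exit 1
echo "$MAXDISCARD" > queue/discard_max_bytes
# Stacked volumes show up as dm-* entries below the disk's directory.
for X in $(find . -name 'dm-*') ; do
  echo "$MAXDISCARD" > "$X/queue/discard_max_bytes"
done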

At this lower limit the TRIM calls do not time out. I suspect that, with smaller blocks, the guest OS SATA driver gets responses back from the emulated hardware much sooner, so timeouts do not occur. Indeed, with this lowered limit I find TRIM to be reliable. But this requires the guest OS to allow the value to be changed; the Linux 3.x.x kernel of CentOS 7 does not allow this, so I do not enable TRIM on older OSes.
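
A rough sketch of how one might check, before enabling TRIM, whether a given guest kernel lets the limit be lowered. It reads `queue/discard_max_bytes` (and `queue/discard_max_hw_bytes`, where the kernel provides it) for a disk and looks at the owner write bit of the attribute. The function name `discard_limits` and the `SYSFS_ROOT` override are hypothetical and only there to make the check easy to try out. It is also an assumption that a kernel which does not allow the change exposes the attribute without a write bit.

```shell
#!/bin/sh
# Hypothetical helper: report the discard limit of a disk and whether the
# guest kernel marks it as changeable. Argument: disk name, e.g. sda.
# SYSFS_ROOT is an assumption for trying this out; it defaults to /sys.
discard_limits() {
  q="${SYSFS_ROOT:-/sys}/block/$1/queue"
  if [ ! -r "$q/discard_max_bytes" ]; then
    echo "$1: no discard_max_bytes attribute"
    return 1
  fi
  echo "$1: discard_max_bytes=$(cat "$q/discard_max_bytes")"
  # Limit reported by the (emulated) hardware, if the kernel exposes it.
  if [ -r "$q/discard_max_hw_bytes" ]; then
    echo "$1: discard_max_hw_bytes=$(cat "$q/discard_max_hw_bytes")"
  fi
  # Check the owner write bit in the mode string instead of using
  # [ -w ], which reports success for root regardless of the mode.
  case "$(ls -ld "$q/discard_max_bytes")" in
    ??w*) echo "$1: limit is writable" ;;
    *)    echo "$1: limit is read-only" ;;
  esac
}
```

Calling `discard_limits sda` in the guest prints the current value and whether it can be changed. For scale, the 2G shown by lsblk is 512 times the 4 MiB that the script above writes.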

Is there any way to make VirtualBox's SATA emulation report a lower maximum discard size?
