VirtualBox

Ticket #17573 (closed defect: fixed)

Opened 9 months ago

Last modified 4 weeks ago

Regression: "Use Host I/O cache" setting seems to be inverted in 5.2 => fixed in SVN/next maintenance

Reported by: Michi
Owned by:
Priority: critical
Component: virtual disk
Version: VirtualBox 5.2.6
Keywords:
Cc:
Guest type: Linux
Host type: Linux

Description

I believe that between 5.1 and 5.2 some change led to an inversion of the "Use Host I/O cache" setting.

I mark this bug as critical because people who explicitly disable host I/O caching to ensure data integrity in case of a power outage may now actually have host I/O caching enabled, and are in danger of losing data in power outage/reset situations.

Using VirtualBox for testing system installations, I activate the Host I/O caching setting to increase performance when installing a lot of RPMs in the virtual machine, particularly because the VDI file on the host resides on a mechanical HDD. When switching from VirtualBox 5.1 to 5.2, I noticed a considerable slowdown.

I measured the slowdown with an Ansible playbook that brings a minimal system (installed with Fedora kickstart) to a production-level system. The playbook spends most of its time installing RPMs.

Some stats of the RPM installation:

Number of RPM packages before: 324
Number of RPM packages after: 1490
Total file size of all RPMs before: 734 MiB
Total file size of all RPMs after: 3.9 GiB

It is Fedora (guest) on Fedora (host).

This is with the old VirtualBox 5.1, where everything worked as expected.

Installation run on VirtualBox 5.1.32, "Use Host I/O cache" activated:

real 7m38.936s
user 0m19.646s
sys 0m3.937s

Going to VirtualBox 5.2, the effect of the setting seems inverted.

Installation run on VirtualBox 5.2.6, "Use Host I/O cache" activated:

real 52m17.548s
user 1m22.884s
sys 0m17.828s

Installation run on VirtualBox 5.2.6, "Use Host I/O cache" deactivated:

real 10m3.771s
user 0m22.089s
sys 0m5.119s

And here I confirm that the test version is still affected.

Installation run on VirtualBox 5.2.7, "Use Host I/O cache" activated:

real 37m57.730s
user 1m13.114s
sys 0m15.521s

Installation run on VirtualBox 5.2.7, "Use Host I/O cache" deactivated:

real 11m1.805s
user 0m29.377s
sys 0m5.790s
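
To make the timings above easier to compare, here is a minimal sketch that converts the "real" values into seconds and expresses each run as a multiple of the 5.1.32 cached run. The helper name parse_real_time and the dictionary layout are illustrative assumptions, not part of the original benchmark; only the timing strings are taken from the report.

```python
import re


def parse_real_time(text: str) -> float:
    """Return the 'real' wall-clock time in seconds from shell `time` output."""
    match = re.search(r"real\s+(\d+)m([\d.]+)s", text)
    if match is None:
        raise ValueError("no 'real' line found")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Timings copied from the report; keys are (VirtualBox version, cache setting).
runs = {
    ("5.1.32", "on"): "real 7m38.936s user 0m19.646s sys 0m3.937s",
    ("5.2.6", "on"): "real 52m17.548s user 1m22.884s sys 0m17.828s",
    ("5.2.6", "off"): "real 10m3.771s user 0m22.089s sys 0m5.119s",
    ("5.2.7", "on"): "real 37m57.730s user 1m13.114s sys 0m15.521s",
    ("5.2.7", "off"): "real 11m1.805s user 0m29.377s sys 0m5.790s",
}

baseline = parse_real_time(runs[("5.1.32", "on")])
slowdowns = {key: parse_real_time(out) / baseline for key, out in runs.items()}

for (version, cache), factor in slowdowns.items():
    print(f"{version} cache {cache}: {factor:.1f}x the 5.1.32 cached run")
```

By this measure the 5.2.x runs with the cache activated take roughly 5 to 7 times as long as the 5.1.32 baseline, while the runs with the cache deactivated stay within about 1.5 times of it.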

Attachments

MichiClubFedora-2018-02-27-18-52-18.log (127.9 KB) - added by Michi 9 months ago.
Last Benchmark run on VirtualBox 5.2.7 with host I/O cache disabled

Change History

Changed 9 months ago by Michi

Last Benchmark run on VirtualBox 5.2.7 with host I/O cache disabled

comment:1 Changed 8 months ago by Michi

Bug still present in VirtualBox 5.2.8 - UEFI

Ansible playbook runtime (the playbook is a bit modified compared to the one above, but is still dominated by installing RPMs):

Use Host I/O Cache off

real 19m58.398s
user 0m44.042s
sys 0m8.213s


Use Host I/O Cache on

real 48m44.685s
user 1m18.772s
sys 0m17.870s

So it still seems the setting is inverted.

comment:2 Changed 4 months ago by simono74

Copy-pasting my comment from bug #17746, since both bugs seem to reference the same issue:

I also noticed this issue with VirtualBox 5.2.14. I have a couple of CentOS servers running in VirtualBox on a Windows 10 host for testing purposes. I use a script to automatically configure them all. The configuration can be pretty disk-intensive. In VirtualBox 5.1.38, the script takes about 4 minutes to complete. With VirtualBox 5.2.14, it takes 12 minutes. This seems like a major performance regression in VirtualBox 5.2. I also have "Use Host I/O Cache" switched on. Even if I turn off this setting, the VMs are still slower than when I was running on 5.1.38 without "Use Host I/O Cache". I am guessing that this regression is caused by the "first milestone of the I/O stack redesign" that was introduced in 5.2.0.

comment:3 Changed 4 months ago by pupertest

This is not an inversion. In my benchmarking I found the default uncached I/O in 5.2.x to have about the same performance as before (in 5.1.x). It's the "Use Host I/O cache" option that is broken. I'll link my forum thread here: https://forums.virtualbox.org/viewtopic.php?f=6&t=87459&p=421483#p421675

comment:4 Changed 4 months ago by socratis

@Michi, @pupertest

I'm going to relabel pupertest's thread to include the ticket number, so that people in the forums can link to the ticket.

comment:5 Changed 2 months ago by hwertz

I've been digging through the VirtualBox source code (5.2.18 and 5.1.38). I do believe I've found the root cause -- IgnoreFlush (which defaults to being on) does not work in 5.2, so any IDE or SATA flush results in fsync() on the VirtualBox VDI or VMDK file.

In src/VBox/Devices/Storage/DrvVD.cpp, IgnoreFlush is checked in drvvdFlush() (if IgnoreFlush is true, the rest of the drvvdFlush() function is skipped and VINF_SUCCESS is returned). drvvdFlush() is (was!) called via a call to "pThis->IMedia.pfnFlush".

However, this is no longer used! (Or at least not used for the typical async I/O case.)

Instead, pThis->IMediaEx.pfnIoReqFlush is called, resulting in use of the drvvdIoReqFlush function. No IgnoreFlush checks here! It calls drvvdMediaExIoReqFlushWrapper, which uses one of three methods to write any internal buffers out to the OS cache as needed and then generates an fsync().

I think it'd be appropriate to put an IgnoreFlush check in either drvvdIoReqFlush or drvvdMediaExIoReqFlushWrapper.
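
The control flow described above can be illustrated with a toy model. This is only a sketch of the analysis in this comment, written in Python for brevity: the class ToyDiskDriver, its methods, and count_fsyncs are hypothetical names, none of this is VirtualBox source code, and it does not claim to show the fix that was eventually committed.

```python
class ToyDiskDriver:
    """Toy stand-in for the disk driver state; not VirtualBox code."""

    def __init__(self, ignore_flush: bool = True):
        self.ignore_flush = ignore_flush  # models the IgnoreFlush setting
        self.fsync_calls = 0  # counts simulated fsync() calls on the image

    def _fsync_image(self) -> None:
        self.fsync_calls += 1

    def flush_legacy(self) -> None:
        """Models drvvdFlush() as described: returns early if IgnoreFlush is set."""
        if self.ignore_flush:
            return
        self._fsync_image()

    def flush_ioreq_unchecked(self) -> None:
        """Models the 5.2 request path as described: no IgnoreFlush check."""
        self._fsync_image()

    def flush_ioreq_checked(self) -> None:
        """Models the suggested change: the same check on the request path."""
        if self.ignore_flush:
            return
        self._fsync_image()


def count_fsyncs(method_name: str, guest_flushes: int = 1000) -> int:
    """Issue guest flushes through one path and count the resulting fsyncs."""
    driver = ToyDiskDriver(ignore_flush=True)
    for _ in range(guest_flushes):
        getattr(driver, method_name)()
    return driver.fsync_calls


for name in ("flush_legacy", "flush_ioreq_unchecked", "flush_ioreq_checked"):
    print(name, count_fsyncs(name))
```

With IgnoreFlush on, only the unchecked path turns every guest flush into a host-side fsync, which would match the heavy write activity reported in this ticket.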

Keep up the good work! -Henry

Last edited 2 months ago by hwertz

comment:6 Changed 2 months ago by aeichner

Great catch, thanks a lot for digging into this! Can you please try the most recent testbuild from here? It should fix the issue.

comment:7 Changed 2 months ago by hwertz

Looks good! Installed, I booted up a few VMs (Ubuntu 18.04 VMs in this case). With 5.2.18 I could see (on the host) with gkrellm almost continuous disk writes and with /proc/meminfo dirty blocks would get up to like 4-10MB then get flushed. With 5.2.19 r125117 I see no discernible write activity in gkrellm and dirty blocks going up and up (as it should, I recently popped 16GB of RAM into this system so I'm sure the OS disk caches are quite large.) Thanks!

--Henry
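
For anyone who wants to repeat the dirty-block observation on a Linux host, here is a minimal sketch that reads the "Dirty" value from /proc/meminfo. The function name dirty_kib and the sample text are assumptions made for illustration; the sample numbers are invented and are not measurements from this ticket.

```python
import re
from pathlib import Path


def dirty_kib(meminfo_text: str) -> int:
    """Return the 'Dirty' value (in kB as reported) from /proc/meminfo text."""
    match = re.search(r"^Dirty:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Dirty:' line found")
    return int(match.group(1))


# Invented sample in the /proc/meminfo format, so the sketch runs anywhere.
sample = (
    "MemTotal:       16314512 kB\n"
    "Dirty:              8192 kB\n"
    "Writeback:             0 kB\n"
)
print(dirty_kib(sample))

meminfo = Path("/proc/meminfo")
if meminfo.exists():  # only present on Linux hosts
    print(dirty_kib(meminfo.read_text()))
```

Polling this value while a VM is busy should show it staying small if every guest flush forces an fsync, and growing if flushes are ignored and writes accumulate in the host cache.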

comment:8 Changed 8 weeks ago by aeichner

  • Summary changed from Regression: "Use Host I/O cache" setting seems to be inverted in 5.2 to Regression: "Use Host I/O cache" setting seems to be inverted in 5.2 => fixed in SVN/next maintenance

Thanks for testing, glad that it works for you now! This will be part of the next official maintenance release.

comment:9 Changed 4 weeks ago by Michi

Just tested 5.2.19-125724 and it also fixes the problem on my side. Lots of thanks!

comment:10 Changed 4 weeks ago by michael

  • Status changed from new to closed
  • Resolution set to fixed