VirtualBox

Opened 6 years ago

Last modified 5 years ago

#17477 new defect

Compacting VDI file resulting in corrupt file system on High Sierra (APFS)

Reported by: DanielH
Owned by:
Component: VM control
Version: VirtualBox 5.2.6
Keywords: compact, corrupt, file, vdi, mac, high sierra
Cc:
Guest type: Windows
Host type: Mac OS X

Description (last modified by janitor)

I tried to compact a VirtualBox VDI file (Windows 10 guest) with these commands on High Sierra 10.13.2 (17C205):

  1. VBoxManage modifymedium disk XXX.vdi --compact
  2. VBoxManage modifyhd XXX.vdi --compact

Neither command finishes, and both result in a corrupt file system (APFS) on the host.

After that I'm not able to show the VDI file or navigate to the VM folder which contains the VDI file. Finder crashes if I try to do that, Terminal also crashes, and Time Machine backups can't finish once Time Machine reaches that folder...

I had to boot into recovery mode to repair the file system of my main Macintosh HD; after that I was able to remove the corrupt VDI file...

This is the second time I have had this issue. Fortunately, I was able to repair my file system this time. The last time, with macOS High Sierra 10.13.0 and VirtualBox 5.1.30, I ended up with a corrupt file system that I was not able to repair, which meant I had to completely restore my Mac...

Change History (5)

comment:1 by aeichner, 6 years ago

I don't see how we can be held responsible for bugs in APFS. The filesystem should never get into an inconsistent state through operations done via the standard file APIs, which we use here. You should blame Apple for forcing APFS down users' throats...

comment:2 by 6pac, 5 years ago

FWIW, I'm also getting this problem regularly. I'm on a laptop with limited SSD drive space, and not being able to shrink is causing real problems.

I acknowledge that APFS bugs are not your problem, but I suspect the problem may be triggered by the VBoxManage tool 'not finishing' because it takes a very long time to complete.

For example, on my main VM, I have three VDI files. The smallest one (10GB) shrank in a few minutes. The next largest (40GB) got to 90% on the command line output in about 4 minutes, but then stayed there for about an hour before actually completing. I'm currently trying to shrink the main VDI (150GB). It got to 90% in about 10 minutes, and it's been sitting there for around 8 hours now. I'm just hoping that it does eventually complete, because I know that if it doesn't, reformatting my drive and recovering from Time Machine is the next step, and that's going to take about 4 hours and there goes most of my working day :-(

Realistically I can't dedicate more than about 12 hours to this process.

comment:3 by ChrisJ60, 5 years ago

This problem still exists in VirtualBox 6.0.4 running on macOS Mojave 10.14.4. While I agree that the VirtualBox development team is not responsible for APFS bugs, I do think they need to take ownership of this, report it to Apple and work on a solution with Apple. It's very annoying to repeatedly corrupt one's boot disk (necessitating a format and full restore more often than not) when trying to shrink a VDI file, and most users will blame this on VirtualBox, not Apple, regardless of where the problem really lies.

comment:4 by janitor, 5 years ago

Description: modified (diff)

comment:5 by moehrmoehr, 5 years ago

Hi, another user here who ran into this. (Technical details below this paragraph.) Things will probably end well in my case, but I am quite disappointed with how this issue has been handled by the VirtualBox devs. I completely understand that implementing APFS properly is Apple's responsibility, and that you are not to blame if Apple's code corrupts the filesystem. But if you are aware that your tool reproducibly causes host filesystem corruption that cannot always be repaired, it's not appropriate or responsible to say "it's not our bug" and do nothing else *for 20 months*. You could add a warning to the command's documentation that points out this known problem. You could add a check to modifymedium --compact to error/warn when trying to use it with a VDI on an APFS volume. You could report this to Apple - they would probably be interested in hearing about FS corruption bugs caused solely by standard userspace file operations! Just help us users to not corrupt our filesystems using your software.
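The check suggested above could be approximated outside VirtualBox with a small wrapper. The following is only a sketch, not anything VirtualBox ships: the function names (`parse_mount_table`, `filesystem_type`, `compact_if_safe`) are hypothetical, and it assumes the macOS `mount` command prints lines of the form `/dev/disk1s1 on / (apfs, local, journaled)`. It refuses to run `VBoxManage modifymedium disk ... --compact` when the VDI sits on an APFS volume and runs it unchanged otherwise.

```python
import re
import subprocess
import sys
from pathlib import Path, PurePosixPath

# Assumed shape of a macOS `mount` line: "<device> on <mount point> (<fstype>, ...)"
MOUNT_LINE = re.compile(r"^(?P<dev>.+?) on (?P<mnt>.+) \((?P<fstype>[^,)]+)")


def parse_mount_table(mount_output):
    """Return {mount_point: fs_type} parsed from macOS-style `mount` output."""
    table = {}
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match:
            table[match.group("mnt")] = match.group("fstype").lower()
    return table


def filesystem_type(path, mount_table):
    """Return the fs type of the deepest mount point containing `path`, or None."""
    target = PurePosixPath(path)
    best = None
    for mount_point, fstype in mount_table.items():
        candidate = PurePosixPath(mount_point)
        if target == candidate or candidate in target.parents:
            if best is None or len(candidate.parts) > len(best[0].parts):
                best = (candidate, fstype)
    return best[1] if best else None


def compact_if_safe(vdi_path):
    """Hypothetical wrapper: skip --compact when the VDI lives on APFS."""
    mount_output = subprocess.run(
        ["mount"], capture_output=True, text=True, check=True
    ).stdout
    resolved = str(Path(vdi_path).resolve())
    fstype = filesystem_type(resolved, parse_mount_table(mount_output))
    if fstype == "apfs":
        print(f"Refusing to compact {resolved}: it is on an APFS volume "
              "(see ticket #17477).", file=sys.stderr)
        return 1
    # Any other (or unrecognised) file system: run the command from the ticket as-is.
    return subprocess.run(
        ["VBoxManage", "modifymedium", "disk", resolved, "--compact"]
    ).returncode


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(compact_if_safe(sys.argv[1]))
```

Saved under a name of your choosing, it would be called with the path to the VDI as its only argument. It only avoids triggering the problem; it does nothing to fix the underlying corruption, and users on APFS would still need another way to reclaim the space.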

Anyway, here's the detailed description of what happened, in case it is of use to anyone.

The host is a MacBook Pro 13" Mid 2012 running macOS 10.14.5 and VirtualBox 6.0.12. I was in the same situation as the OP - I had a Windows 10 VDI that I tried to compact, and the progress stopped after 80% (or at least it took unusually long). At some point after that, I was no longer able to run ls -l in the directory with the VDI - the command hung forever and could not be stopped with Ctrl+C, or even with a SIGKILL. The --compact operation could not be stopped either, so I tried to shut down my host system - which also hung, so I had to force shutdown by long pressing the power button. (Even Ctrl+Cmd+Power did not work, which normally functions as a slightly softer reset than long-pressing Power.)

After that I booted into recovery mode and ran a repair on the APFS volume. The repair process showed a few errors ("error: directory valence check: directory (old 0x13): nchildren (1) does not match drec count (0)", and a second error which I unfortunately did not write down before closing the log). I'm not sure if it actually performed any repairs, but the repair process claimed to be successful. After rebooting, the problem persisted: trying to ls the directory in question hung with the same behavior as before. Repairing the volume a second time fixed the problem for some reason.

(Not sure if this matters, but when running the repairs, I set Disk Utility to also display physical drives and APFS containers, rather than the default setting that only displays volumes. I ran the first repair on the APFS container that contained the volume with the broken VDI, and the second repair on the specific APFS volume inside the container. Maybe that causes the tool to run slightly different repairs, which would explain why the second repair was necessary.)

After the second repair, my APFS volume *seems* to be fine again based on some quick tests, but there's no way for me to tell if everything is actually okay again now. For all I know, the filesystem might still be corrupted in some non-obvious way that I will only notice months later. My only safe option here is to go through the long process of reformatting the volume and restoring my system - which I thankfully can, as I have multiple backups, and even the possibly-corrupted volume is still mostly readable.

Last edited 5 years ago by moehrmoehr
