VirtualBox

Ticket #6023 (closed defect: fixed)

Opened 4 years ago

Last modified 3 years ago

VHD Delete Snapshot results in corrupt virtual hard disk or double snapshot revert

Reported by: MarkCranness
Owned by:
Priority: critical
Component: virtual disk
Version: VirtualBox 3.1.2
Keywords: Restore Delete Snapshot
Cc:
Guest type: other
Host type: other

Description

1) VHD double snapshot - Restore:

Restore Snapshot (snap2) immediately followed by Delete Snapshot (snap2) has the same effect as Delete Snapshot (snap2) followed by Restore Snapshot to the parent Snapshot (snap1). Any changes made between snap1 and snap2 are lost.

Repro steps:

  • Start with a VM having a VHD file attached that has no snapshots. (I used the Windows XP Mode VHD extracted from the 32-bit Windows XP Mode download: MD5sum= 5189623a8e5c6ff518cdd4759037f109, newly attached to a new VM and not yet booted.)
  • Take a snapshot 'Snapshot 1'
  • Boot the VM and make a change (perhaps create a file).
  • Shutdown VM
  • Take a snapshot 'Snapshot 2'
  • Select 'Snapshot 2' and 'Restore Snapshot' immediately followed by 'Delete Snapshot'
  • Boot VM and note that your change has gone: The VM has been restored back to 'Snapshot 1'
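The expected outcome of case (1) can be illustrated with a toy model of a differencing-disk chain. This is only a sketch under stated assumptions: the `Layer` class, the block contents, and `delete_snapshot_layer` are hypothetical names invented for illustration and are not VirtualBox's VHD code. It assumes that deleting a snapshot folds that snapshot's image into its child, so the change made between 'Snapshot 1' and 'Snapshot 2' should still be readable from the current state afterwards.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Layer:
    """One image in a differencing chain; it stores only blocks written to it."""
    name: str
    parent: Optional["Layer"] = None
    blocks: Dict[int, bytes] = field(default_factory=dict)

    def read(self, block: int) -> Optional[bytes]:
        """Return the newest copy of a block, walking up the parent chain."""
        layer = self
        while layer is not None:
            if block in layer.blocks:
                return layer.blocks[block]
            layer = layer.parent
        return None


def delete_snapshot_layer(snapshot: Layer, child: Layer) -> Layer:
    """Fold the deleted snapshot's blocks into its child; the child's blocks win."""
    merged = dict(snapshot.blocks)
    merged.update(child.blocks)
    return Layer(child.name, snapshot.parent, merged)


# 'Snapshot 1' freezes the base image; the guest change lands in the first diff.
base = Layer("base", blocks={0: b"original"})
diff1 = Layer("diff1", parent=base, blocks={1: b"new file"})
# 'Snapshot 2' freezes diff1; restoring it yields an empty current-state diff.
current = Layer("current", parent=diff1)
# Deleting 'Snapshot 2' should keep the change visible in the current state.
current = delete_snapshot_layer(diff1, current)
print(current.read(1))
```

In this model the reported behaviour corresponds to `current.read(1)` returning nothing after the delete, that is, the current state silently falling back to the contents of 'Snapshot 1'.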

2) Virtual hard disk corruption causes BSOD and/or registry repair messages at boot.

Repro steps:

  • Start with a VM having a VHD file attached that has no snapshots. (I used the Windows XP Mode VHD extracted from the 32-bit Windows XP Mode download: MD5sum= 5189623a8e5c6ff518cdd4759037f109, newly attached to a new VM and not yet booted.)
  • Take a snapshot 'Snapshot 1'
  • Boot the VM and make a change (perhaps create a file).
  • Shutdown VM
  • Take a snapshot 'Snapshot 2'
  • Boot the VM then shutdown.
  • Select 'Snapshot 2' and 'Delete Snapshot'
  • Boot VM and note that it BSOD's or has other corruption problems.
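Both repro lists start from an image identified by an MD5 sum. Anyone trying to reproduce from the same starting point could check their file with something like the sketch below. The helper names are hypothetical, and the report does not make clear whether the sum was computed over the download or over the extracted VHD, so treat a mismatch as inconclusive.

```python
import hashlib
from pathlib import Path

# Value quoted in the report; which file it was computed over is an assumption.
EXPECTED_MD5 = "5189623a8e5c6ff518cdd4759037f109"


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a multi-gigabyte image is never loaded whole."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reported_image(path: Path) -> bool:
    """True if the file's MD5 equals the sum given in the ticket."""
    return md5_of_file(path) == EXPECTED_MD5
```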

Attachments

  • Windows XP Mode.xml (20.1 KB) - added by MarkCranness 4 years ago: WindowsXPMode after second 'Take Snapshot', before Delete Snapshot 2
  • VBox.log (66.7 KB) - added by MarkCranness 4 years ago: Boot - install - GA added, second snapshot taken
  • Snapshots.dir (582 bytes) - added by MarkCranness 4 years ago: Directory listing of Snapshots folder after Snapshot 2
  • Windows XP Mode.2.xml (11.9 KB) - added by MarkCranness 4 years ago: After delete of Snapshot 2
  • Snapshots.after_delete.dir (566 bytes) - added by MarkCranness 4 years ago: After delete of Snapshot 2 (Oops! why is the VHD so small => bug!)

Change History

comment:1 Changed 4 years ago by MarkCranness

In the step 'Boot the VM and make a change (perhaps create a file)' above, I actually proceeded thru the initial XPMode setup and then installed the Guest Additions. Another ticket says Delete Snapshot works OK if the changes are small, and installing GA is not small.

comment:2 Changed 4 years ago by marcoben73

Same problem for me. I deleted a snapshot and lost my Windows guest. My version is 3.1.2.r56127.

comment:3 Changed 4 years ago by marcoben73

I switched back to 3.0.12.r54655 and there is no problem discarding snapshots there.

comment:4 Changed 4 years ago by aeichner

The second problem looks like #5981 to me. Unfortunately I wasn't able to reproduce this issue so far. Is this 100% reproducible for you or does it work sometimes?

comment:5 Changed 4 years ago by MarkCranness

100% reproducible. I reproduced both 1 and 2 three times each, and it never worked correctly: it always failed. (2) looks like #5981 to me also.

comment:6 Changed 4 years ago by joolsr

I'm afraid I've had the same thing happen. Using v3.1.2 r56127 on Ubuntu Karmic, after removing snapshots and moving to another, I've twice had massive amounts of disk corruption on my Linux guests.

I know that snapshots have been reworked, and I'm pleased they have been, but they currently seem too easily broken, with dire consequences.

I think there should have been more testing ... disk corruption is a pretty major problem ...

comment:7 Changed 4 years ago by brendanl79

Just writing to confirm I too observed corruption after deleting snapshots. It's a 32-bit WinXP guest on 64-bit Fedora12 host. I had a straight line of snapshots about a dozen long, deleted five or so from within this line, then upon booting the "current state" got some registry errors and broken applications. Have not yet found any ruined data, but I'm going to lose some time reinstalling those apps. And have learned a hard lesson that VM snapshots, at least in their current form, do not obviate the need for backups.

comment:8 Changed 4 years ago by joolsr

I'm surprised that Sun haven't really looked into this any more than they have. It may well be that it's only a problem for some, but either way having mass corruption on your virtual machines is pretty bad ...

comment:9 Changed 4 years ago by frank

As aeichner wrote, Sun actually did look into this problem but was not yet able to reproduce it.

comment:10 Changed 4 years ago by frank

  • Component changed from other to virtual disk

comment:11 Changed 4 years ago by aeichner

Unfortunately I'm still not able to reproduce the corruption issue. I tried it with trunk and 3.1.2 on Linux and OS X. Can all of you please attach the XML configuration file and a VBox.log of the affected VMs? I want to replicate the VM configuration as closely as possible.

comment:12 Changed 4 years ago by joolsr

I can send them to you, but would rather not attach the files in a public area. Please let me know if there is somewhere else I can send them.

comment:13 Changed 4 years ago by brendanl79

I too would prefer a private submission. But in the meantime I am happy to report how the corruption manifested itself:

Again, this was on the "current state" after deleting about half the snapshots in a straight line of descent. I have not dared try booting an earlier snapshot at this point.

A number of executables (Firefox, Textpad) gave a popup saying "not a valid Windows image" or something similar when I tried to run them. Reinstalled Firefox and Textpad, problem went away.

MSVCRTD.DLL was corrupt, also I think some executables of Visual Studio 6.0 and 2008 C++ Express. Reinstalled these apps, problem went away.

On login to my Windows domain account I get a message about how registry settings had to be recovered from a backup and it was successful. Tried multiple registry repair utilities, none could find anything.

Oddly, have not yet found any of my data files to be corrupt. But this has been a frightening experience to say the least.

comment:14 Changed 4 years ago by aeichner

You can send them to Alexander (dot) Eichner (at) sun (dot) com. Thanks for your help.

comment:15 Changed 4 years ago by joolsr

I have mailed the logs and xml configs as requested.

comment:16 Changed 4 years ago by MarkCranness

User 'Winchester' reports in http://www.virtualbox.org/ticket/6097: "I could also reproduce the problem from #6023 on my systems."

comment:17 Changed 4 years ago by zapolsky

I have the same or a similar bug.
The VHD is corrupted after an attempt to delete a snapshot.

Steps to reproduce:
1) Install any system into VirtualBox (I used OpenSolaris)
2) Make first snapshot
3) Make a lot of changes inside just installed OS (I updated OpenSolaris snv_129(?) to snv_132 and removed several zones)
4) Make second snapshot
5) Fill free space on root filesystem of your real OS. For instance, leave 1Gb of free space
6) Try to delete first snapshot

Actual Result: VHD is corrupted and cannot be used later.

Expected Result: VirtualBox should check the amount of free space required for a delete/merge or similar operation and refuse to proceed if there is not enough free space.
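The kind of pre-flight check asked for here might look like the sketch below. It is not how VirtualBox estimates merge space. The function names and the safety margin are hypothetical, and the estimate (the combined size of every image taking part in the merge) is a deliberately pessimistic assumption.

```python
import os
import shutil
from typing import Iterable


def required_bytes(image_paths: Iterable[str]) -> int:
    """Pessimistic estimate: assume every byte of every image may be rewritten."""
    return sum(os.path.getsize(path) for path in image_paths)


def has_room_for_merge(image_paths: Iterable[str], target_dir: str,
                       margin: int = 64 * 1024 * 1024) -> bool:
    """True if target_dir's filesystem has the estimated space plus a margin free."""
    free = shutil.disk_usage(target_dir).free
    return free >= required_bytes(image_paths) + margin
```

A caller would refuse to start the delete when `has_room_for_merge(...)` returns `False`, rather than discovering the shortage halfway through the merge.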

comment:18 Changed 4 years ago by atpetro

I have downloaded 3.1.4 and confirmed the same problem with it. I can reproduce this every time. The easiest way to build the same snapshots and witness the corruption:

1. Download the ready-to-go Windows 2008 VHD from MS: http://www.microsoft.com/downloads/details.aspx?FamilyID=764b531e-4526-4329-80b5-921fd3297883&displaylang=en
2. Create a snapshot and install something big - the latest Visual Studio 2010 RC is a good candidate: http://msdn.microsoft.com/en-us/vstudio/dd582936.aspx
3. Delete/Merge the snapshot
4. Reboot and start to see errors about missing files for IIS and other system files.  You will likely also see a check disk on boot and it will find lots of missing files.

comment:19 Changed 4 years ago by hankhero

I just had a fatal corruption after deleting several snapshots for a .VMDK file (I previously used VMware), so perhaps it is not VHD specific.

Host: VirtualBox 3.1.4 r57640, Mac Snow Leopard
Guest: Ubuntu 9.10

comment:20 Changed 4 years ago by jimmykang

I too notice this on .vmdk files.

comment:21 Changed 4 years ago by jdimatteo

This just occurred to me this morning after deleting two snapshots. (Note that I had plenty of free disk space available when deleting the snapshots.)

comment:22 Changed 4 years ago by joolsr

So how many of us have had broken, corrupted VMs now? I hope Oracle (Sun) are doing something about this. Surely you must be able to replicate the issue with the notes in the reports here. It needs fixing quickly. I dare not remove any snapshots nowadays, and have lost some confidence in what is otherwise a good product.

comment:23 Changed 4 years ago by sandervl73

The Windows 2008 VHD is marked read-only after extraction. That might lead to failures to delete the snapshot. Make sure the read-only attribute is cleared and try again.
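A quick way to check and clear the read-only attribute before retrying is sketched below. The helper names are hypothetical. The code inspects only the owner-write bit, which is also the bit Python maps to the Windows read-only attribute, and it does not look at ACLs or at whether the filesystem itself is mounted read-only.

```python
import os
import stat


def is_read_only(path: str) -> bool:
    """True if the owner-write bit is not set on the file."""
    return not (os.stat(path).st_mode & stat.S_IWUSR)


def clear_read_only(path: str) -> None:
    """Add the owner-write bit, leaving all other permission bits unchanged."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IWUSR)
```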

comment:24 Changed 4 years ago by atpetro

The VHD readonly flag was cleared immediately and definitely was not set before merge. The merge finished in the UI (which it will not do when marked read only). The resulting VHD is corrupt. Please continue your testing and create a snapshot that is very big by installing something big. Then try to merge. Then you will definitely be corrupted. It happens every time for every single person on my team. This is not hard to replicate if you follow the steps by the many posters here. I have a full VHD snapshot tree that I'm happy to FedEx to you. It really is that serious of an issue.

comment:25 Changed 4 years ago by sandervl73

I have downloaded the Windows 2008 VHD as you suggested, but that one doesn't boot here, apparently due to some VHD spec change. A colleague of mine is looking at it right now.

comment:26 Changed 4 years ago by thetechguide

I experience the same VHD corruption after doing a merge & delete, and I can replicate the problem. My host is Win7 Ultimate 64-bit, fully patched; the guest is XP SP3 32-bit. I replicated the problem on 3.1.2 and also on 3.1.4. After a merge & delete that completes successfully according to VBox (no errors), there are many corrupt files when I start the virtual machine. The corrupted files differ between runs. The first time, with 3.1.2, hal.dll and numerous Microsoft Office DLL files were corrupt or missing. The guest would start and get to the XP desktop, and I could try to start various applications if I moved quickly, but they all gave some generic error that files were missing or corrupt. Then the "generic Win32 process error" appeared and XP hung permanently. After forcing a shutdown and booting again, I would receive the "hal.dll is missing or corrupt; please re-install" message.

I restored my system from backups back to before I did the merge/delete; verified my VHD was running properly (opened programs, rebooted it, etc.). All is well. I then upgraded to VBox 3.1.4, and tried the merge/delete again.

This time the XP VHD won't even start: it bluescreens. I forced a shutdown, and upon booting it again there is an immediate "hal.dll is missing or corrupt" error.

I restored my system again back to before the merge/deletes & the vhd is running fine & booting without the corrupt hal error. I am now using 3.1.4 with my snapshots running fine as long as I don't merge & delete.

My VHD is approximately 9Gb; I have 200Gb free on my hard drive. I have privately emailed my log files to the person listed above: Alexander (dot) Eichner (at) sun (dot) com. Thank you.

comment:27 Changed 4 years ago by MarkCranness

Steps were as follows (3.1.2):

  • Start with a VM having a VHD file attached that has no snapshots. (I used the Windows XP Mode VHD extracted from the 32-bit Windows XP Mode download: MD5sum= 5189623a8e5c6ff518cdd4759037f109, newly attached to a new VM and not yet booted. This was extracted from the MS download using 7Zip and was not / is not read-only. NOTE custom set ExtraDataItem VBoxInternal/Devices/pcbios/..., which I have emailed separately to Alexander (dot) Eichner (at) sun (dot) com.)
  • Take a snapshot 'Snapshot 1'
  • Boot the VM and let XP configure itself. It reboots
  • Ignore all "New hardware found" messages and install Guest Additions. Reboot. GA working correctly
  • Shutdown VM
  • Take a snapshot 'Snapshot 2'
  • Select 'Snapshot 2' and 'Restore Snapshot'
  • At this point I uploaded the machine xml, VBox.log and Snapshots.dir
  • Select Snapshot 2 and 'Delete Snapshot'
  • I get a message dialog: Merging ... {ea3a78e7-bbdc-4f73-aee8-79f885bb5c26}.vhd ... Not good!
  • Oops, Snapshots.after_delete.dir does not look good. The VM Current State differencing disk is too small
  • I upload Windows XP Mode.2.xml and Snapshots.after_delete.dir
  • Boot VM and note that all changes have gone: The VM has been restored back to 'Snapshot 1'
  • Boot VM and XP re-configures itself from scratch.
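The "too small" observation from the two directory listings can be turned into a rough automated check. The sketch below is a heuristic only, with hypothetical function names: it assumes that after a delete/merge the surviving image should be at least as large as the image that was folded into it, which need not hold exactly for every image format.

```python
from pathlib import Path
from typing import Dict


def image_sizes(folder: Path) -> Dict[str, int]:
    """Map each file name in the Snapshots folder to its size in bytes."""
    return {p.name: p.stat().st_size for p in folder.iterdir() if p.is_file()}


def merge_looks_truncated(before: Dict[str, int], after: Dict[str, int],
                          deleted_image: str, surviving_image: str) -> bool:
    """Flag a merge whose surviving image is smaller than the image it absorbed."""
    return after[surviving_image] < before[deleted_image]
```

Calling `image_sizes` once before and once after 'Delete Snapshot' gives the two dictionaries to compare, much like diffing Snapshots.dir against Snapshots.after_delete.dir by hand.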

comment:28 Changed 4 years ago by aeichner

Thanks for the detailed instructions, I'll have a look. We created a testcase here which randomly creates and merges snapshots, and so far it hasn't shown any problems.

comment:29 Changed 4 years ago by aeichner

MarkCranness, thanks a lot! I can reproduce the problem, investigating.

comment:30 Changed 4 years ago by aeichner

The corruption issue should be fixed in the next maintenance release. It would be great if a few of you could give it a try before the release. If you want to test the fix send me a mail (Alexander (dot) Eichner (at) sun (dot) com) and I'll send you a test build. Thanks all for your help and patience!

comment:31 Changed 4 years ago by frank

  • Status changed from new to closed
  • Resolution set to fixed

Fixed in 3.1.6.

comment:32 Changed 4 years ago by henry74

  • Status changed from closed to reopened
  • Resolution fixed deleted

This is not fixed, as I just experienced the error with the latest version of VirtualBox, 3.2.2. I'm not sure of the steps to recreate it, but I deleted a few snapshots in between my most recent snapshot and the first snapshot. When I booted into Windows it started running chkdsk due to corruption. It ran for about an hour, fixing all the errors.

After the Windows XP guest came back up (on a Linux host), ALL my programs had problems with DLLs across the board. I'm basically going to have to reinstall. I would recommend not using snapshots until this problem is addressed.

comment:33 Changed 4 years ago by aeichner

Which image format do you use? Is the image stored on an ext4 filesystem, and do you use a SATA, SCSI or SAS controller in the VM? Is the "Use host I/O cache" setting unchecked?

comment:34 Changed 4 years ago by aeichner

  • Status changed from reopened to closed
  • Resolution set to fixed

No response, closing.

comment:35 Changed 3 years ago by matthew72

This is NOT fixed. Deleting a snapshot still trashes the VM in the latest 4.0.4 release, with both VHD and VDI hard disks and with both v3 (.xml) and v4 (.vbox) machines. There are, however, many other similar bug reports, so this is probably a duplicate.
