#15831 closed defect (fixed)
VM state corrupted because of failed snapshot deletion
Reported by: | tim43263246 | Owned by: | |
---|---|---|---|
Component: | other | Version: | VirtualBox 5.1.4 |
Keywords: | Cc: | ||
Guest type: | all | Host type: | Linux |
Description
Hi,
I tried to delete a snapshot while the VM was switched off, but then I couldn't start the VM anymore because it says that some file is missing. When I looked into dmesg I saw:
[362393.820323] DeleteSnap[29354]: segfault at 70 ip 000000000070a181 sp 00007fe4b6837940 error 4 in VBoxSVC[400000+489000]
So it looked like the process responsible for deleting a snapshot crashed. Could you please fix the problem that a crash of this process leads to a corrupt state where you can't start the VM anymore, and tell me how to fix my corrupt VM now?
thanks
Attachments (2)
Change History (20)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
I had the same problem. I posted to the forum thread "Discuss the 5.1.4 release". What I wrote follows. I am commenting on this ticket instead of opening a new one for the same problem.
I had a problem with snapshots in 5.1.4 that was not present in 5.1.2. I'm running VirtualBox 5.1.4 on Ubuntu 16.04 LTS. I have one VM that has 5 disks. If I try to remove a snapshot on that VM, the UI gets stuck saying it's deleting the snapshot, but it never finishes. If I close and restart VirtualBox, it claims a disk is missing. The snapshot was removed for the first disk, but not the other 4. The VM appears to still have the snapshot, but it can't be removed. I see the following in dmesg output on my system:
Aug 19 16:56:10 ubuntu-oryx kernel: [14500.247312] DeleteSnap[21019]: segfault at 31 ip 000000000053574c sp 00007f00ce01d950 error 4 in VBoxSVC[400000+49c000]
I completely removed and reinstalled VirtualBox 5.1.4, but that didn't solve the problem. I got the VM back by manually editing its configuration file, then adding the virtual disks back. The only way I could get snapshots working for this VM was to downgrade to VirtualBox 5.1.2. That version has no problem with snapshots on this VM. I can reproduce this in VirtualBox 5.1.4 by just taking a snapshot and then deleting that same snapshot (the only snapshot) immediately.
The 5.1.4 version didn't have this problem with another VM that only has one virtual disk however. VirtualBox 5.1.2 works all the time.
All I have to do to reproduce this on Ubuntu 16.04 LTS is to create a new VM with two virtual disks, snapshot them, then delete the snapshot. I attached gdb to the VBoxSVC process and saw this when it crashed (I know this probably isn't very helpful):
[rowland@ubuntu-nuc ~]$ sudo gdb /usr/lib/virtualbox/VBoxSVC 10043
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/virtualbox/VBoxSVC...(no debugging symbols found)...done.
Attaching to program: /usr/lib/virtualbox/VBoxSVC, process 10043
[New LWP 10045]
[New LWP 10046]
[New LWP 10047]
[New LWP 10048]
[New LWP 10049]
[New LWP 10068]
[New LWP 10069]
[New LWP 10096]
[New LWP 10097]
[New LWP 10100]
[New LWP 10103]
[New LWP 10107]
[New LWP 10181]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f511c426d13 in select () at ../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) cont
Continuing.
[New Thread 0x7f510b789700 (LWP 10313)]
Thread 15 "DeleteSnap" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f510b789700 (LWP 10313)]
0x000000000053574c in ?? ()
(gdb) bt
#0 0x000000000053574c in ?? ()
#1 0x0000000000535939 in ?? ()
#2 0x0000000000535f68 in ?? ()
#3 0x0000000000535fa1 in ?? ()
#4 0x0000000000513c43 in ?? ()
#5 0x00000000005593f2 in ?? ()
#6 0x000000000055eaf4 in ?? ()
#7 0x00000000004b057a in ?? ()
#8 0x00007f511d4bf5ec in ?? () from /usr/lib/virtualbox/VBoxRT.so
#9 0x00007f511d548e7b in ?? () from /usr/lib/virtualbox/VBoxRT.so
#10 0x00007f511d8296fa in start_thread (arg=0x7f510b789700) at pthread_create.c:333
#11 0x00007f511c430b5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb)
I repeated the experiment and collected the strace output on the same process (following child processes, etc.) in the attachment. Again, I have no idea how useful that is. This is a pretty big problem for my work (though that is definitely not a complaint: VirtualBox is both free and awesome). I just hope to help make it better somehow. Also, I did try the latest testing build as well. It has the same problem. Thanks!
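For anyone comparing crash reports, the kernel segfault lines quoted in this ticket can be decoded mechanically. The sketch below is only an illustration, not part of VirtualBox: it assumes the usual x86-64 Linux message layout seen in the two dmesg lines above, and the function name `decode_segfault` is made up for this example. It subtracts the mapping base in the brackets from the instruction pointer to get an offset inside VBoxSVC, and splits the page-fault error code into its three low bits (page present, write vs. read, user mode).

```python
import re

# Layout assumed from the dmesg lines quoted in this ticket (x86-64 Linux).
# All numeric fields in this message are printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"(?P<thread>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "thread": m["thread"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "module": m["module"],
        # Offset of the faulting instruction inside the mapped module.
        "module_offset": ip - base,
        # x86 page-fault error code: bit 0 = page present,
        # bit 1 = write access, bit 2 = fault happened in user mode.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    line = (
        "[14500.247312] DeleteSnap[21019]: segfault at 31 ip 000000000053574c "
        "sp 00007f00ce01d950 error 4 in VBoxSVC[400000+49c000]"
    )
    info = decode_segfault(line)
    print(hex(info["module_offset"]), info["access"], hex(info["fault_address"]))
```

Both reports in this ticket decode to a user-mode read of a not-present page at a very low address (0x70 and 0x31), which is the typical signature of a member access through a null or near-null pointer. The instruction pointer in the second report, 0x53574c, is the same address as frame #0 in the gdb backtrace.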
by , 8 years ago
Attachment: | strace-output added |
---|
strace output following all child processes when VBoxSVC segfaults deleting a snapshot
comment:3 by , 8 years ago
I see this bug isn't owned by anyone. Is there another similar bug number (I can't find it via search) that I can look at to see if anyone is working on it?
As far as other functionality goes, things seem to be going well in the latest 5.1.5 test builds, such as build 110598. The issue for me, as noted above, is that I can't reliably delete snapshots (well, not all of them) on any guest VM that has 2 or more .vdi disks. I'm running a Win 7 Pro 64 host. Guests are 1) Win 10 64 and 2) Ubuntu 16.04 64.
I tried a vboxmanage command for deleting the snapshots. It worked on one snapshot but failed on the next, so it is apparently not a reliable answer.
Thanks.
comment:4 by , 8 years ago
I have the same problem on Ubuntu 15.10 using both the .deb install as well as the "All Platforms" script.
comment:5 by , 8 years ago
Confirming same problem on 5.1.6. Is there any Win 7 64 strace utility that is recommended for helping with debug? Thanks.
comment:6 by , 8 years ago
This problem appears to be a serious regression introduced in 5.1.4 and persisting in 5.1.6. I can't reproduce it on 5.1.2 (and have reverted to that release to get around the problem).
It also occurs on Windows 8.1 Pro (up-to-date with all Microsoft updates as of this update) hosts. It is 100% repeatable on my PC but I don't have another machine I can test it on right now.
As noted above, it only occurs when there are two or more virtual HDDs.
I've only seen it when deleting the first snapshot on the machine - for example if Snap1, Snap2, and Snap3 are taken in that order, Snap2 and Snap3 can be deleted w/o problems. A workaround therefore MIGHT be to create a snapshot immediately after creating a VM and then never delete that snapshot.
The problem doesn't even require starting the VM (although it still happens in more realistic scenarios where the VM is started/stopped). The guest OS does not seem to matter (I've seen it with Ubuntu 16.04 "Live CD mode", Ubuntu 16.04 installed onto the disks, and Knoppix 7.6.1). In fact, it isn't even necessary to have a bootable system -- the simple reproducible case below exploits this.
When VBoxSVC.exe crashes while deleting the snapshot, the .vbox file is left unchanged and still reflects the snapshot. According to 'showmediuminfo', the base vdi and the snapshot vdi for the first disk have the same relationship after the crash as before -- but the snapshot vdi file's state is 'inaccessible' and the file is, in fact, missing on disk. According to 'showmediuminfo', the base vdi and the snapshot vdi for the second disk have the same relationship after the crash and both the base and snapshot vdi file still exist.
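To spot the affected media quickly after such a crash, the registered disks can be scanned for the 'inaccessible' state. The following is a rough sketch under stated assumptions: it expects text in the style printed by `vboxmanage list hdds`, that is, blocks of `Key: value` lines separated by blank lines with `State` and `Location` fields. The exact field names may differ between versions, and the UUIDs and paths in the sample are hypothetical placeholders, not values from this ticket.

```python
def parse_media(listing):
    """Split blank-line separated 'Key: value' blocks into a list of dicts."""
    media = []
    current = {}
    for raw in listing.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current medium block.
            if current:
                media.append(current)
                current = {}
            continue
        # Split on the first colon only, so Windows paths like J:\... survive.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        media.append(current)
    return media


def inaccessible_locations(media):
    """Return the Location of every medium whose State is 'inaccessible'."""
    return [
        m.get("Location", "?")
        for m in media
        if m.get("State", "").lower() == "inaccessible"
    ]


if __name__ == "__main__":
    # Hypothetical sample; the layout is an assumption, not captured output.
    sample = """\
UUID:           11111111-1111-1111-1111-111111111111
Parent UUID:    base
State:          created
Location:       J:\\VMs\\TestSnap\\disk-1.vdi

UUID:           22222222-2222-2222-2222-222222222222
Parent UUID:    11111111-1111-1111-1111-111111111111
State:          inaccessible
Location:       J:\\VMs\\TestSnap\\Snapshots\\{22222222-2222-2222-2222-222222222222}.vdi
"""
    for location in inaccessible_locations(parse_media(sample)):
        print("inaccessible:", location)
```

In practice the listing would come from the command's output on the affected host rather than from a hard-coded string.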
A representative EventLog event for the crash is:
Faulting application name: VBoxSVC.exe, version: 5.1.6.10634, time stamp: 0x57d6d545
Faulting module name: VBoxSVC.exe, version: 5.1.6.10634, time stamp: 0x57d6d545
Exception code: 0xc0000005
Fault offset: 0x00000000000d10b0
Faulting process id: 0x23a0
Faulting application start time: 0x01d211488790ac06
Faulting application path: c:\Program Files\Oracle\VirtualBox\VBoxSVC.exe
Faulting module path: c:\Program Files\Oracle\VirtualBox\VBoxSVC.exe
Report Id: c58778b0-7d3b-11e6-8306-74d4351740fe
Faulting package full name:
Faulting package-relative application ID:
The following distilled sequence of commands creates the problem every time on my PC on 5.1.4 & 5.1.6 but not on 5.1.2 -- the last command causes the VBoxSVC crash:
vboxmanage createvm --name TestSnap --basefolder J:\Virtual_Machines\VirtualBox --ostype Linux_64 --register
vboxmanage modifyvm TestSnap --memory 2048 --acpi on --ioapic on --mouse usbtablet
vboxmanage storagectl TestSnap --add sata --name SATA
vboxmanage createhd disk --size 1000 --filename J:\Virtual_Machines\VirtualBox\TestSnap\disk-1
vboxmanage storageattach TestSnap --storagectl SATA --type hdd --port 0 --device 0 --medium J:\Virtual_Machines\VirtualBox\TestSnap\disk-1.vdi
vboxmanage createhd disk --size 1000 --filename J:\Virtual_Machines\VirtualBox\TestSnap\disk-2
vboxmanage storageattach TestSnap --storagectl SATA --type hdd --port 1 --device 0 --medium J:\Virtual_Machines\VirtualBox\TestSnap\disk-2.vdi
vboxmanage snapshot TestSnap take Snap01
vboxmanage snapshot TestSnap delete Snap01
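Since the crash depends on the number of attached disks, it can help to generate the same command sequence for one, two, or more disks and compare results. The sketch below only builds and prints the commands; it does not run them. The function name `repro_commands` and its parameters are made up for this illustration, and the subcommands and options are copied from the sequence above rather than checked against other VirtualBox versions.

```python
from pathlib import PureWindowsPath


def repro_commands(base_folder, vm="TestSnap", disks=2, snapshot="Snap01"):
    """Build the reproduction command list for a VM with `disks` SATA disks."""
    base = PureWindowsPath(base_folder)
    vm_dir = base / vm
    cmds = [
        ["vboxmanage", "createvm", "--name", vm, "--basefolder", str(base),
         "--ostype", "Linux_64", "--register"],
        ["vboxmanage", "modifyvm", vm, "--memory", "2048", "--acpi", "on",
         "--ioapic", "on", "--mouse", "usbtablet"],
        ["vboxmanage", "storagectl", vm, "--add", "sata", "--name", "SATA"],
    ]
    for n in range(1, disks + 1):
        disk = str(vm_dir / f"disk-{n}")
        # Create the disk, then attach it to the next free SATA port.
        cmds.append(["vboxmanage", "createhd", "disk", "--size", "1000",
                     "--filename", disk])
        cmds.append(["vboxmanage", "storageattach", vm, "--storagectl", "SATA",
                     "--type", "hdd", "--port", str(n - 1), "--device", "0",
                     "--medium", disk + ".vdi"])
    # Taking and then deleting the only snapshot is the step that crashes.
    cmds.append(["vboxmanage", "snapshot", vm, "take", snapshot])
    cmds.append(["vboxmanage", "snapshot", vm, "delete", snapshot])
    return cmds


if __name__ == "__main__":
    for cmd in repro_commands(r"J:\Virtual_Machines\VirtualBox", disks=2):
        print(" ".join(cmd))
```

With `disks=2` this prints the nine commands listed above; with `disks=1` it gives the single-disk variant that, according to the reports in this ticket, does not crash.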
comment:7 by , 8 years ago
I was just reading your line about how you only see it when deleting the first snapshot. I actually have the problem each time on the first snapshot, and sometimes on others. Recently I was able to delete a couple of 2nd & 3rd snapshots using vboxmanage, but unless you can delete them all, there is no order-based-workaround for me.
I've also done this on non-setup systems just using a livecd along with any type of two virtual disks. I haven't seen it at all on a single vdi vm, but anytime I create two, whether or not I think I've used them, I get the hang + crash + corruption.
I'm a bit surprised that this doesn't seem to be on anyone's radar. I suppose I'll roll back to 5.0.12 (or whichever one I used to use that had the least problems) and use that.
comment:8 by , 8 years ago
I have this same problem, and it is a serious one. My main use case for VB is to try changes and revert them. All my VMs have multiple disks, so I cannot confirm if it works OK with a single disk system.
comment:9 by , 8 years ago
I also have the same problem. Documented in forum https://forums.virtualbox.org/viewtopic.php?f=6&t=79809.
VM VirtualBox Manager GUI v5.1.4 and v5.1.6 crash when deleting snapshot from a Win7 guest with 3 HDs on SATA Controller.
Deleting snapshots for the same VM was working OK under VBox 5.1.2, so I went back to v5.1.2.
Host: Win7Pro 64bit Guest: Win7Pro 64bit (3 HDs on SATA Controller)
comment:10 by , 8 years ago
Also experiencing this problem.
Host: macOS Sierra 10.12, VirtualBox 5.1.6
Guest: FreeBSD 64-bit
Two vDisks on SATA, one snapshot. Attempting to delete the snapshot results in some work, then the progress bar disappears and the first vDisk is listed as "Differencing, Inaccessible" in the VirtualBox Manager. The disk is also listed as "{UUID string}.vdi" instead of its actual name "vDisk1.vdi". It appears to be the snapshot name? In the "Snapshots" folder there is only one entry, which appears to belong to the second disk.
comment:11 by , 8 years ago
The same situation here with 2 disks: after snapshot deletion the whole VM was corrupted and had to be deleted. The VM's VDIs were in an inconsistent state and had to be deleted as well. Two days of VM settings are gone :-(
Host Win7 64 Pro Guest Win7 64 Pro SATA, 2 HD, 1 Optical
comment:12 by , 8 years ago
I tested myself and it's still broken in 5.1.7-111038 test build. Maybe you guys are too busy with other critical issues and did not have time to fix this small problem.
comment:13 by , 8 years ago
I'm hoping that Oracle Vbox workers aren't belittling "no snapshot deletion with > 1 VDI" as unimportant. I never hear anything about progress on this issue.
I'm very willing to be patient about problems with open software, but I don't want this to fall off the radar.
Thanks
comment:14 by , 8 years ago
I encounter this problem, too.
Host: Windows 10, VirtualBox 5.1.6 Guest: Lubuntu 16.04.1
My virtual machine also has two VDIs.
comment:15 by , 8 years ago
Don't worry, snapshot issues (like everything which puts the user's data at risk) will not fall off the radar.
Currently the testbuild upload is running... check https://www.virtualbox.org/wiki/Testbuilds - any 5.1 builds with revision 111231 or later should be working again.
This particular issue was a 5.1.4 regression caused by a tiny behavior change as part of a cleanup, converting to one code base for task management.
comment:16 by , 8 years ago
@#15: Klaus: Thanks for the heads-up on 111231. I'll give that a try and report back later today.
Cheers
Edit:: Build 111231 : Tests:
Host: Win 7 SP1 Pro 64: Guest1: 4 VDI version of Ubuntu 16.04 LTS up to date and 64 bit. Guest2: 3 VDI version of Win 10 (most recent fast-ring build) up to date, 64 bit:
Result: I deleted 7 snapshots on the 4-VDI Ubuntu guest and 3 on the Win 10 guest.
No problems. Thanks for the info and good work to the team! Many thanks.
Edit 2: Tested with Build 111271 this AM and all seems well with snapshots and other issues.
follow-up: 18 comment:17 by , 8 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Fix is part of VBox 5.1.8. Please open separate tickets for unrelated issues.
by , 6 years ago
Attachment: | ResourceMonitor.png added |
---|
comment:18 by , 6 years ago
Replying to frank:
Fix is part of VBox 5.1.8. Please open separate tickets for unrelated issues.
Workaround for V5.1.4, Windows hosts (not 100% sure, but it looks promising after a couple of tries): whatever you do, wait until VBoxSVC.exe stops accessing your .vmdk files. For example, if you delete a snapshot, wait for these accesses to disappear from the list in Resource Monitor before doing anything else that might affect, or be affected by, the snapshots. After you delete the snapshot and wait as described, close and reopen the VBox Manager. You will still see the state "deleting snapshot". Repeat the procedure and delete the snapshot again. It should be fine now.
OK, I was now able to fix the VM, but not without losing all snapshots ;( You should really fix that bug.