Opened 12 years ago
Closed 12 years ago
#11344 closed defect (fixed)
Extending VDI size destroys XFS filesystem => fixed in svn
Reported by: | Grimeton | Owned by: | |
---|---|---|---|
Component: | virtual disk | Version: | VirtualBox 4.2.6 |
Keywords: | xfs, broken | Cc: | |
Guest type: | other | Host type: | Linux |
Description
Hello,
After resizing a VDI image from 1TB to 1.3TB while the VM was offline, the XFS filesystem inside was destroyed. I talked to Eric Sandeen from the XFS project; we ran some tests, and in the end the filesystem was destroyed every time.
We're testing on 4.2.4 and 4.2.6 on Linux (amd64) and OS X (10.8.x) hosts with this procedure:
- Power off the VM
- Create a virtual harddisk with 1TB space as VDI
- Attach the virtual harddisk to a port on the sata controller
- Power up the VM
- Create an XFS filesystem on top of the drive (NO PARTITIONS)
- Mount the filesystem and create some directories/files on it like this:
cd /mntpoint && for i in {1..30}; do mkdir $i; cd $i; for j in {a..z}; do touch $j; done; cd ..; done
- Unmount the fs
- Power down the machine
- Change the VDI size like this: vboxmanage modifyhd /path/to/vdi --resize 1331000
- Power the VM on again
- Mount the fs again (no error yet!)
- do a "ls /mountpoint"
- see lots of errors
- run xfs_repair -n and see lots of errors.
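For reference, the file-creation step from the procedure above can be tried safely on its own; this stand-alone version uses a scratch directory (the path is a placeholder) instead of the real mount point:

```shell
# Stand-alone version of the file-creation step from the repro above,
# run in a scratch directory instead of the XFS mount point.
rm -rf /tmp/xfs-repro && mkdir -p /tmp/xfs-repro && cd /tmp/xfs-repro
for i in {1..30}; do mkdir $i; cd $i; for j in {a..z}; do touch $j; done; cd ..; done
# 30 directories x 26 files each = 780 files in total
find . -type f | wc -l
```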
So far we can't say whether it is a VirtualBox issue, an XFS issue, or a result of the combination of the two. We tested with different image sizes and always ended up with a destroyed filesystem.
Eric will look into this and keep me updated on it.
So far just the warning: DO NOT RESIZE the Virtualbox VDI image if you have XFS inside it.
Happy new year!
KR,
Grimeton
Change History (9)
comment:1 by , 12 years ago
comment:2 by , 12 years ago
Eric dug a bit further, and this is what he wrote me last night:
04:40 <sandeen> FWIW, I see it here too.
06:55 <sandeen> the first miscompare is at 0x4000000001
06:55 <sandeen> random? ;)
06:56 <sandeen> anyway, this has to be a vb problem I think, I'm not going to spend more time on it unless the vb guys need a hand
So it looks like the problem is VB related.
I also ran some tests with a partition on the drive. The results are even more catastrophic, as the kernel refuses to mount the filesystem at all. I also see a call trace from the xfs module (already communicated to Eric).
Eric explained to me how the filesystem works - this is what he told me:
The moment the filesystem is created, it creates so-called allocation groups (AGs). Those AGs are laid out on the drive at predefined intervals; the default count is four. When you create a directory or a file, XFS places the new object into the next AG in round-robin order, so it goes 0,1,2,3 -> back to 0,1,2,3 and so on.
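That round-robin placement can be sketched like this (AG count of four as Eric mentioned; the "object" numbers just stand for successive files or directories being created and are illustrative):

```shell
# Round-robin AG selection with the default of four allocation groups.
# Each new object lands in the next AG modulo the AG count.
AGS=4
for obj in 0 1 2 3 4 5 6 7; do
  echo "object $obj -> AG $((obj % AGS))"
done
```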
The superblock at the beginning of the disk stores the last block on the disk that the filesystem uses. So if the disk size is extended, XFS simply sees more space at the end that it could use.
Now the "funny" part:
Eric tells me that XFS sees empty space behind its last block on the device and that's it. There is nothing touched at all until one runs xfs_growfs.
You tell me - if one only changes the image size of the VDI, nothing but empty space should be added at the end ...
Seems like something else is going on, because after the resize (of the VDI image) XFS complains about missing information in the first blocks on the drive which you can see here: http://sprunge.us/SLdi
So it looks like something has been overwritten there.
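One way to confirm such an overwrite independently of any filesystem is to clone the image before resizing and compare the raw contents afterwards. A minimal illustration of the comparison with cmp, using small scratch files in place of the real images (file names, sizes, and the corrupted offset are made up):

```shell
# Scratch files standing in for the pre-resize clone and the resized
# image; names, sizes, and the corrupted offset are illustrative.
dd if=/dev/zero of=/tmp/pre.img bs=1024 count=64 2>/dev/null
cp /tmp/pre.img /tmp/post.img
# simulate one corrupted byte at offset 4096 in the "resized" image
printf '\xcd' | dd of=/tmp/post.img bs=1 seek=4096 conv=notrunc 2>/dev/null
# cmp reports the 1-based offset of the first differing byte
cmp /tmp/pre.img /tmp/post.img || true
```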
If you like I can attach the complete chat logs from last night. Maybe that sheds some more light on this.
KR,
Oliver
comment:3 by , 12 years ago
A screenshot from last night that shows that only every fourth AG is affected: http://i.imgur.com/elHV0.png
comment:4 by , 12 years ago
This is completely unrelated to XFS, please don't get off on the wrong track. :)
I did this:
Create a 1.00T .vdi disk, dynamically allocated, attach it, and boot the guest. In the guest, pattern the disk from 255G to 257G. I used this command, in the guest:
# xfs_io -c "pwrite 255g 2g" /dev/sdb
to do this, but any method to write a pattern to the disk would work. The above writes the pattern "0xcdcdcdcd" into that 2g range from 255g to 257g, as we can see:
# dd if=/dev/sdb bs=1G skip=255 count=2 | hexdump -C
00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd  |................|
*
80000000
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 149.839 s, 14.3 MB/s
I then powered off the guest, and resized the .vdi:
# VBoxManage modifyhd 1TDisk1.vdi --resize 1331200
Prior to this I also cloned that disk to have the original for comparison.
I rebooted the guest and then looked at the contents of the resized disk:
# dd if=/dev/sdb bs=1G skip=255 count=2 | hexdump -C
00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd  |................|
*
00100000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00200000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd  |................|
*
80000000
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 211.092 s, 10.2 MB/s
As you can see, a large swath of zeros has now appeared in that range as a result of the disk image resize. If I look at the pre-resize clone (attached to /dev/sdc now), I still see what I expect:
# dd if=/dev/sdc bs=1G skip=255 count=2 | hexdump -C
00000000  cd cd cd cd cd cd cd cd  cd cd cd cd cd cd cd cd  |................|
*
80000000
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 149.839 s, 14.3 MB/s
This seems to clearly be a problem with data on the disk image getting corrupted during a resize, regardless of what the format of that data is...
Thanks, -Eric
comment:5 by , 12 years ago
We should change the title to "destroys filesystem inside".
I guess other filesystems will show the same problem.
KR,
Oliver
comment:6 by , 12 years ago
Summary: Extending VDI size destroys XFS filesystem → Extending VDI size destroys XFS filesystem => fixed in svn
Reproduced it with the instructions above. The bug appears if more than one block needs to be relocated to make enough space for the block allocation table. This will be fixed in the next maintenance release. A workaround for now is to increase the size of the disk to the desired amount in smaller steps. Thanks for the report!
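A sketch of that workaround, assuming the sizes from this ticket (1TB = 1048576 MB grown to 1331200 MB) and an arbitrary step of 50000 MB; the VBoxManage commands are echoed rather than executed so they can be reviewed first, and the disk path is a placeholder:

```shell
# Grow the VDI in smaller increments instead of one big jump.
# Sizes are in MB, matching VBoxManage modifyhd --resize.
# The 50000 MB step and the disk path are illustrative choices.
START=1048576    # current size: 1 TB
TARGET=1331200   # desired size from this ticket
STEP=50000
size=$START
while [ "$size" -lt "$TARGET" ]; do
  size=$((size + STEP))
  if [ "$size" -gt "$TARGET" ]; then size=$TARGET; fi
  echo VBoxManage modifyhd /path/to/vdi --resize "$size"
done
```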
comment:7 by , 12 years ago
aeichner, thanks. Asking on behalf of the original reporter, is there any way to un-mangle an image impacted by the original bug?
comment:8 by , 12 years ago
Thanks for asking Eric, but I already deleted the image and created one twice the size of the available disk space on the host, so I'll never have that problem again ;)
KR,
Oliver
What happens if you create a partition before creating the XFS filesystem? I don't know what the XFS on-disk format looks like, but is it possible that the filesystem relies on the size of the block device to find some metadata? If you increase the size of the disk, VBox will not touch the data on it but just increase the stored size of the disk, creating more space for the block allocation table.