Ticket #3156 (closed defect: fixed)
Linux (Debian) host and conflicting UUIDs (VBoxSVC sync issue)
|Reported by:||corvus||Owned by:|
|Version:||VirtualBox 2.1.2||Keywords:||semaphore Linux VBoxSVC sync syncronization poweroff|
Description (last modified by frank)
There seems to be a bug in the synchronization objects on Linux hosts. We have seen it in VBox 1.6, 2.0, 2.1.0 and 2.1.2. Reproduced on Debian Lenny 2.6.25-amd64, 2.6.26-amd64 and 2.6.25-i686.
I did not report the bug for a few months because I had found a workaround with UUID patching (see below) and it was hard to describe where the bug was. Now I have more information, so I am posting full details.
- Create at least 10 machines by cloning the same VDI file and giving each machine the same settings.
- Start the machines in ascending order (in order of creation).
- Shut the machines down one by one in descending order. After each shutdown, check whether the state of any other (still running) machine has changed.
For instance, powering off machine number 8 may change the state of machine number 5 to "Aborted". The pairs of conflicting machines stay the same every time you start and stop the machines. Once you have found a conflicting pair, you can reproduce the bug by starting and shutting down just those 2 machines (but keep the order).
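The per-shutdown check described above can be scripted. The following is only a rough sketch, not something taken from my setup: the helper names `vm_state` and `power_off_and_check` are made up for illustration, and the `--machinereadable` spelling assumes a recent VBoxManage (older releases such as 2.1.x may spell the option differently).

```python
import subprocess


def vm_state(name, run=subprocess.check_output):
    """Return the VMState value reported by VBoxManage for one machine."""
    # Assumption: showvminfo prints a line like VMState="running".
    out = run(["VBoxManage", "showvminfo", name, "--machinereadable"])
    text = out.decode() if isinstance(out, bytes) else out
    for line in text.splitlines():
        if line.startswith("VMState="):
            return line.split("=", 1)[1].strip().strip('"')
    return "unknown"


def power_off_and_check(victim, others, run=subprocess.check_output):
    """Power off `victim` and report every other machine whose state changed.

    Returns {name: (state_before, state_after)} for the affected machines.
    A real run may need a short sleep after the poweroff so that the
    state change has time to show up.
    """
    before = {name: vm_state(name, run) for name in others}
    run(["VBoxManage", "controlvm", victim, "poweroff"])
    after = {name: vm_state(name, run) for name in others}
    return {n: (before[n], after[n]) for n in others if before[n] != after[n]}
```

Calling `power_off_and_check("vm8", ["vm5"])` (machine names are placeholders) after both machines have been started should return an empty dict on a healthy host; an entry for the other machine would indicate the conflict described here.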
Playing around with the bug showed that the problem is connected to the machine UUIDs and the semaphores used to synchronize the VirtualBox process with VBoxSVC (see details below). Changing the UUID of one of the conflicting machines makes the problem seem to disappear for that pair. However, the new UUID may then conflict with another machine in the set.
It looks like there is some semaphore whose id is generated from the machine's UUID. The hashing function that creates the semaphore id seems to be the key problem. I believe it is inside the VBoxSVC module, but I haven't found it yet.
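To illustrate the hypothesis above, here is a toy sketch of how reducing a 128-bit UUID to a narrow key can make two different machines share one id. This is not the actual VirtualBox code: `toy_sem_key`, the 16-bit width and the XOR folding are all assumptions chosen only to show the effect, and the UUIDs are made up.

```python
import uuid
from collections import defaultdict


def toy_sem_key(machine_id, bits=16):
    """Hypothetical key function: XOR-fold a 128-bit UUID into `bits` bits."""
    value = machine_id.int
    mask = (1 << bits) - 1
    key = 0
    while value:
        key ^= value & mask   # mix in the lowest chunk
        value >>= bits        # move on to the next chunk
    return key


def find_conflicts(machines, bits=16):
    """Group machine names by toy key and return the groups that collide."""
    by_key = defaultdict(list)
    for name, machine_id in machines.items():
        by_key[toy_sem_key(machine_id, bits)].append(name)
    return [sorted(names) for names in by_key.values() if len(names) > 1]


# Made-up UUIDs: N5 and N8 differ, yet fold to the same 16-bit key.
machines = {
    "N1": uuid.UUID("00000000-0000-0000-0000-000000000001"),
    "N5": uuid.UUID("00000000-0000-0000-0000-00000000abcd"),
    "N8": uuid.UUID("abcd0000-0000-0000-0000-000000000000"),
}
conflicts = find_conflicts(machines)  # [['N5', 'N8']]
```

With a key function like this, the colliding pairs depend only on the UUIDs, which would match two observations in this report: the same pairs conflict on every run, and changing one UUID removes that conflict but can create a new one with a different machine.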
I started machine N5, then started machine N8. I powered off machine N8, and machine N5 went into the 'Aborted' state at the same moment. The VirtualBox window for machine N8 disappeared, but the process was still running in the background.
I attached gdb to the VirtualBox process for machine N8 and checked the stack backtrace. You can see it in the attachment. It references the source file src/VBox/Main/SessionImpl.cpp (line 860). It seems that machine N8 got stuck at this point:
I hope this helps!