VirtualBox

Ticket #8231 (closed defect: fixed)

Opened 12 years ago

Last modified 12 years ago

X Segmentation fault on startx -> fixed on trunk as of 2011-02-04

Reported by: rbhkamal
Owned by:
Component: guest additions
Version: VirtualBox 3.2.12
Keywords:
Cc:
Guest type: Linux
Host type: Windows

Description

Occasionally, vboxvideo_drv.so crashes X with a segmentation fault at address 0xc. Unfortunately, I haven't been able to reproduce the problem on a system that allows switching to the console (so that I can copy the logs). However, I was able to take a screenshot of the error.

The workaround: when X fails to start, restart the guest services, reload the vboxvideo_drv module, and then try to start X again.

The guest is a *live*, stripped-down Ubuntu 10.04.1 running linux-kernel-x86-2.6.32-28-generic and X server version 1.7.6.

The problem seems to get worse when the guest takes longer to boot; however, the workaround above always works.

See attached for crash trace.

FYI, I'm currently requesting permission to provide you with the ISO or perhaps an export of the virtual machine.

Attachments

vboxdrv_crash.png (31.8 KB) - added by rbhkamal 12 years ago.
Screenshot of the segmentation fault
Xorg.0.log.old (27.4 KB) - added by rbhkamal 12 years ago.
The first failed attempt to start X
Xorg.0.log (26.1 KB) - added by rbhkamal 12 years ago.
Second attempt to start X works
logs.7z (13.9 KB) - added by rbhkamal 12 years ago.
Some more logs (udev, casper.log) and output of lsmod and ps -ef after the second attempt to start X
startgui.sh (223 bytes) - added by rbhkamal 12 years ago.
This shell script is started by another script, which is started by rc.local

Change History

Changed 12 years ago by rbhkamal

Screenshot of the segmentation fault

comment:1 Changed 12 years ago by rbhkamal

Forgot to mention that this happens with 3.2.12, 3.2.10, and 3.2.8. Please let me know if you need the ISO or if there is anything you'd like me to try.

Thanks, RK

comment:2 Changed 12 years ago by michael

If you are able to continue using the machine without rebooting then the log should still be there.

comment:3 Changed 12 years ago by rbhkamal

Alright, I'm in the process of getting some logs. It's very tricky since the guest OS is locked down and doesn't allow switching to the console, and it has neither a file manager nor an X terminal.

Changed 12 years ago by rbhkamal

The first failed attempt to start X

Changed 12 years ago by rbhkamal

Second attempt to start X works

Changed 12 years ago by rbhkamal

Some more logs (udev, casper.log) and output of lsmod and ps -ef after the second attempt to start X

comment:4 Changed 12 years ago by rbhkamal

Please note that the date on the guest machine was not set correctly, but I made these logs today.

comment:5 Changed 12 years ago by michael

  • Keywords vboxdrv removed

This looks like the same issue as ticket #5788. That one is closed as fixed, but the "solution" may also have been that the updated X server hid the bug.

comment:6 Changed 12 years ago by rbhkamal

Hummm... this might explain why trying to start X again works. Seems like a race condition?

comment:7 Changed 12 years ago by michael

What seems strange to me is that the log looks like the server actually started successfully and stopped again. The segfault is in code executed during startup, which the log suggests has already been executed. I know that the server has or had a "generation" mechanism which implied it starting and stopping several times during the lifetime of the server process - perhaps that is involved here.

comment:8 Changed 12 years ago by rbhkamal

I'm not sure I understand what you mean by "generation mechanism", but here is the lifecycle of the guest OS:

Start VM
  \> execute startx
    ---- If startx returns, run startx again.
    ---- If X fails three times in a row, halt/power off the VM.

Please see startgui.sh below
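
For readers without access to the attachment, the restart policy described above can be sketched roughly as follows. This is only an illustration in C, not the contents of startgui.sh: `supervise_session`, `launch_fn`, and the `scripted` test double are hypothetical names, and treating a nonzero exit status as "X failed" is an assumption.

```c
#include <stddef.h>

/* Hypothetical launcher: starts the X session, blocks until it ends,
 * and returns its exit status (0 is assumed to mean a clean exit). */
typedef int (*launch_fn)(void *ctx);

/* Relaunches the session every time it returns. Gives up after
 * max_consecutive_failures failed launches in a row and returns the
 * total number of launches performed. */
int supervise_session(launch_fn launch, void *ctx, int max_consecutive_failures)
{
    int launches = 0;
    int failures_in_a_row = 0;

    while (failures_in_a_row < max_consecutive_failures) {
        launches++;
        if (launch(ctx) == 0)
            failures_in_a_row = 0;   /* a clean run resets the counter */
        else
            failures_in_a_row++;
    }
    return launches;                 /* the caller would halt/power off here */
}

/* Test double: replays a fixed list of exit statuses, then keeps failing. */
struct scripted {
    const int *statuses;
    size_t count;
    size_t next;
};

int scripted_launch(void *ctx)
{
    struct scripted *s = ctx;
    return s->next < s->count ? s->statuses[s->next++] : 1;
}
```

With `max_consecutive_failures` set to 3, a clean run in between failures resets the count, so only three failures in a row end the loop.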

Changed 12 years ago by rbhkamal

This shell script is started by another script, which is started by rc.local

comment:9 Changed 12 years ago by michael

Is there any way you can install debugging symbols for the server and get a backtrace in gdb? The automatic X server backtrace is nice, but not quite as good as a real one.

comment:10 Changed 12 years ago by michael

And the generation mechanism is something internal to the X server. It is a way of starting and stopping the server without ending the server process or reprobing all hardware, but I don't know anything more about it myself, and I am not even sure if it ever worked.

comment:12 Changed 12 years ago by michael

So based on that link this probably happens when the server terminates and automatically restarts without ending the server process.

comment:13 Changed 12 years ago by michael

Reproduced by starting the X server as plain

$ X

then starting an xterm on it (from a virtual terminal) and exiting it.

comment:14 Changed 12 years ago by michael

The faulting address looks to me like the line

            VGAHWPTR(pScrn)->IOBase = pScrn->domainIOBase;

in vboxvideo.c and VGAHWPTR(pScrn) is NULL.

comment:15 Changed 12 years ago by michael

We call vgaHWFreeHWRec in VBOXCloseScreen, which is called at the end of each server generation, but we call vgaHWGetHWRec to allocate the record in VBOXPreInit, which is called at the start of the first generation only. I will change this tomorrow and see if it fixes the issue.
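
The mismatch described here can be modelled in a few lines of standalone C. This is a simplified sketch, not the driver code: `struct screen`, `struct hw_rec`, `get_hw_rec`, `free_hw_rec`, and `count_null_generations` are hypothetical stand-ins for the roles played by pScrn, the VGA record, vgaHWGetHWRec, and vgaHWFreeHWRec.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the per-screen state; not the real X server types. */
struct hw_rec { int io_base; };
struct screen { struct hw_rec *hw; };

/* Plays the role of vgaHWGetHWRec: allocate the record if it is missing. */
bool get_hw_rec(struct screen *s)
{
    if (s->hw == NULL)
        s->hw = calloc(1, sizeof *s->hw);
    return s->hw != NULL;
}

/* Plays the role of vgaHWFreeHWRec: release the record and clear the pointer. */
void free_hw_rec(struct screen *s)
{
    free(s->hw);
    s->hw = NULL;
}

/* Runs the given number of init/close cycles and returns how many of them
 * reached screen init with a NULL record (where the driver would crash).
 * Returns -1 if allocation fails. */
int count_null_generations(int generations, bool alloc_in_screen_init)
{
    struct screen s = { NULL };
    int null_seen = 0;

    /* "PreInit": runs once, before the first generation only. */
    if (!alloc_in_screen_init && !get_hw_rec(&s))
        return -1;

    for (int gen = 0; gen < generations; gen++) {
        /* "ScreenInit": runs at the start of every generation. */
        if (alloc_in_screen_init && !get_hw_rec(&s))
            return -1;

        if (s.hw == NULL)
            null_seen++;            /* the real driver dereferences here */
        else
            s.hw->io_base = 0x3d0;  /* arbitrary placeholder value */

        /* "CloseScreen": runs at the end of every generation. */
        free_hw_rec(&s);
    }
    return null_seen;
}
```

In this model, allocating once up front leaves every generation after the first with a NULL record, while allocating at the start of each generation never does.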

comment:16 Changed 12 years ago by rbhkamal

I can test it as well. However, I can't find any instructions on how to manually install self-compiled open-source guest additions; right now I just run the installer from the additions ISO.

I'm also still trying to set things up with gdb, but I'm having a hard time starting X under gdb at startup.

comment:17 Changed 12 years ago by michael

Could you try this build, which is a test build from the 4.0 stable branch (please see the Testbuilds wiki page)?

comment:18 Changed 12 years ago by rbhkamal

Just to be sure I'm doing everything correctly:
1. Install the build
2. Get the guestAdditions.ISO and upgrade the additions for the guest
3. Test

comment:19 Changed 12 years ago by rbhkamal

Alright, I've launched the machine about 6 times with no crashes. However, I was only able to test it with the test build of VirtualBox installed on the host OS. If I try to test using VirtualBox 3.2.12 with the test build guest additions, the virtual machine crashes immediately when X is trying to start.
It seems like the problem is fixed. Is it possible to give me a patch so I can try to patch the 3.2.12 guest additions? That way I can test it with minimal changes to the test bed.

Thanks

comment:20 Changed 12 years ago by michael

Here is an untested backport of the change to 3.2:

Index: src/VBox/Additions/x11/vboxvideo/vboxvideo_13.c
===================================================================
--- src/VBox/Additions/x11/vboxvideo/vboxvideo_13.c	(revision 69858)
+++ src/VBox/Additions/x11/vboxvideo/vboxvideo_13.c	(working copy)
@@ -802,10 +802,6 @@
     /* Framebuffer-related setup */
     pScrn->bitmapBitOrder = BITMAP_BIT_ORDER;
 
-    /* VGA hardware initialisation */
-    if (!vgaHWGetHWRec(pScrn))
-        return FALSE;
-
 #ifdef VBOX_DRI
     /* Load the dri module. */
     if (!xf86LoadSubModule(pScrn, "dri"))
@@ -857,6 +853,10 @@
     VisualPtr visual;
     unsigned flags;
 
+    /* VGA hardware initialisation */
+    if (!vgaHWGetHWRec(pScrn))
+        return FALSE;
+
     if (pVBox->mapPhys == 0) {
 #ifdef PCIACCESS
         pVBox->mapPhys = pVBox->pciInfo->regions[0].base_addr;
Index: src/VBox/Additions/x11/vboxvideo/vboxvideo_70.c
===================================================================
--- src/VBox/Additions/x11/vboxvideo/vboxvideo_70.c	(revision 69858)
+++ src/VBox/Additions/x11/vboxvideo/vboxvideo_70.c	(working copy)
@@ -637,10 +637,6 @@
     /* Framebuffer-related setup */
     pScrn->bitmapBitOrder = BITMAP_BIT_ORDER;
 
-    /* VGA hardware initialisation */
-    if (!vgaHWGetHWRec(pScrn))
-        return FALSE;
-
     TRACE_EXIT();
     return (TRUE);
 }
@@ -668,6 +664,11 @@
     unsigned flags;
 
     TRACE_ENTRY();
+
+    /* VGA hardware initialisation */
+    if (!vgaHWGetHWRec(pScrn))
+        return FALSE;
+
     /* We make use of the X11 VBE code to save and restore text mode, in
        order to keep our code simple. */
     if ((pVBox->pVbe = VBEExtendedInit(NULL, pVBox->pEnt->index,

comment:21 Changed 12 years ago by rbhkamal

Perfect! It works! Thank you so much! But if I may ask, how were you able to tell vgaHWGetHWRec(pScrn) was null?

comment:22 Changed 12 years ago by michael

  • Summary changed from X Segmentation fault on startx to X Segmentation fault on startx -> fixed on trunk as of 2011-02-04

Actually it was VGAHWPTR(pScrn) which was NULL. I was able to match the object code in vboxvideo_drv.so with the source, and VGAHWPTR(pScrn)->IOBase became VGAHWPTR(pScrn) + 0xc, and the invalid access was at address 0xc. Then I realised that we were initialising that pointer (with vgaHWGetHWRec(pScrn)) at the start of the first server generation but uninitialising it (with vgaHWFreeHWRec(pScrn)) at the end of every generation.

I will commit the backport, so the fix will be present in any future 3.2 releases. Thanks for verifying it.
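
The reasoning about the fault address can be illustrated with a small sketch: a member access through a pointer touches the pointer's value plus the member's offset, so a NULL record faults at exactly the member's offset. The layout below is an assumption chosen for illustration (a 32-bit guest with three 4-byte fields ahead of the I/O base); `struct hw_rec32` and `io_base_access_address` are hypothetical names, and the real vgaHWRec definition should be checked in the X server's vgaHW.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed 32-bit layout for illustration only; not copied from vgaHW.h. */
struct hw_rec32 {
    uint32_t base;      /* stands in for a 4-byte pointer */
    int32_t  map_size;
    uint32_t map_phys;
    int32_t  io_base;   /* the member being written when the crash happens */
};

/* Address the CPU would touch for rec->io_base, given the record's address. */
uintptr_t io_base_access_address(uintptr_t rec_address)
{
    return rec_address + offsetof(struct hw_rec32, io_base);
}
```

Under this assumed layout, `io_base_access_address(0)` is 0xc, the same address as the segmentation fault reported in the description. A small, nonzero fault address of this kind is usually a hint that a structure member was accessed through a NULL pointer.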

comment:23 Changed 12 years ago by frank

  • Status changed from new to closed
  • Resolution set to fixed