[vbox-dev] VM crash, NS_ERROR_FAILURE

Klaus Espenlaub klaus.espenlaub at oracle.com
Fri Apr 1 15:50:48 GMT 2016


Hi Ric,

On 30.03.2016 02:02, Vilbig, Ric wrote:
>
> Hi,
>
> I obviously carried on with my investigation after sending the 
> original email, and have figured out what is triggering this abort 
> (not really fair to call it a crash).
>
VBox.log actually shows that the VM was never fully powered up, so the 
crash happens before the CPU starts executing instructions. I know this 
doesn't make much sense to you yet; see below.
>
> When the BIOS starts initializing the PCI Configuration space for my 
> PCIe switch, it reads the secondary bus register (PCI CFG 0x19) before 
> it’s been initialized, so the device model is returning 0.  This puts 
> the BIOS into a loop, repeating the following over 5000 times before 
> aborting the VM session.
>
> PCI CFG Root  Rd 0x0a L 2 = 0x0604     // Class
> PCI CFG Root  Rd 0x00 L 2 = 0x14ab     // VendID
> PCI CFG Root  Rd 0x02 L 2 = 0x1000     // DevID
> PCI CFG Root  Wr 0x1c L 1 = 0xd0       // IOBase
> PCI CFG Root  Wr 0x20 L 2 = 0xf000     // MemBase
> PCI CFG Root  Rd 0x19 L 1 = *0x00*     // SecBus
>
> If I intercept the secondary bus register read, and return a 3 instead 
> of reading 0 from RTL, then it carries on with root configuration and 
> my VM boots and runs correctly.  It’s not detecting the downstream 
> endpoint, but that is a separate issue.
>
> Meanwhile, does it make sense for the BIOS to read the secondary bus 
> register before it’s been initialized?  It seems like that register 
> should be set up as the BIOS proceeds through the enumeration.  That 
> is what the VM with PIIX3 chipset does.
>
It does, but for a non-obvious reason. VirtualBox pre-configures its PCI 
devices before it starts the BIOS, especially the bus numbers. It looks 
like for some reason this isn't done properly (or the result isn't 
making it to your PCIe switch correctly). This confuses the code, most 
likely causing endless recursion and thus a stack overflow. You should 
be able to use a debugger on the VM process to find out the details, 
because this is all normal userland code on the host; that wouldn't 
work if it were BIOS code running inside the VM.
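
To illustrate why a secondary bus number of 0 can send a recursive bus 
walk into endless recursion, here is a minimal sketch in Python. It is a 
toy model, not VirtualBox or BIOS source: the names scan_bus, 
make_topology and ScanLoopError, the dict layout, and the endpoint class 
code are all made up for illustration. Only the bridge class 0x0604 and 
the values 0 and 3 for the secondary bus come from the trace above.

```python
# Toy model of a depth-first PCI bus walk. Hypothetical names throughout;
# this is not how VirtualBox or its BIOS is actually implemented.
PCI_CLASS_BRIDGE = 0x0604  # PCI-to-PCI bridge class, as read at offset 0x0a


class ScanLoopError(RuntimeError):
    """A bridge routed the walk back onto a bus that is still being scanned."""


def scan_bus(buses, bus, active=None, found=None):
    """Walk `buses` (bus number -> list of device dicts) starting at `bus`."""
    active = set() if active is None else active
    found = [] if found is None else found
    if bus in active:
        # Without this guard the recursive call below would never return
        # and the real-world equivalent would overflow the stack.
        raise ScanLoopError(f"bus {bus} reached again while being scanned")
    active.add(bus)
    for dev in buses.get(bus, []):
        found.append((bus, dev["name"]))
        if dev["class"] == PCI_CLASS_BRIDGE:
            # Descend behind the bridge using its secondary bus number
            # (config register 0x19).
            scan_bus(buses, dev["secondary_bus"], active, found)
    active.discard(bus)
    return found


def make_topology(secondary_bus):
    """Root bus 0 holds a switch; one endpoint sits behind the switch."""
    switch = {"name": "switch", "class": PCI_CLASS_BRIDGE,
              "secondary_bus": secondary_bus}
    endpoint = {"name": "endpoint", "class": 0x0200}  # arbitrary non-bridge
    buses = {0: [switch]}
    buses.setdefault(secondary_bus, []).append(endpoint)
    return buses


if __name__ == "__main__":
    print(scan_bus(make_topology(3), 0))   # walk terminates normally
    try:
        scan_bus(make_topology(0), 0)      # bridge points back at bus 0
    except ScanLoopError as exc:
        print("loop detected:", exc)
```

With the secondary bus set to 3 the walk visits the switch and then the 
endpoint and stops. With it left at 0 the bridge leads straight back to 
the bus it lives on, which is the kind of situation that would recurse 
until the stack runs out if nothing guards against it.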

The motivation for moving the PCI bus configuration out of the BIOS is 
to some extent historic (in the old days we always fought with the BIOS 
size restriction, due to the extremely poor code generated by the BCC 
compiler), and to some extent an optimization (it's far easier and more 
efficient to do the hairy stuff in 32-bit code on the host than in the 
actual BIOS, which is annoying 16-bit real mode code).

Klaus
>
> _____________________________________________
>
> *Ric Vilbig*
>
> Mentor Graphics, Emulation Division
>
> 46871 Bayside Parkway, Fremont CA, 94538
>
> Phone:  510-354-7360
>
> Mobile: 408-529-2365
>
> email: ric_vilbig at mentor.com
>
> *From:* Vilbig, Ric
> *Sent:* Tuesday, March 29, 2016 11:40
> *To:* vbox-dev at virtualbox.org
> *Cc:* Vilbig, Ric
> *Subject:* VM crash, NS_ERROR_FAILURE
>
> Hi experts,
>
> I would like to ask for some help to figure out why a certain VM 
> crashes on start-up.  Although the problem is evidently induced by my 
> PDM plug-in, the crash does not appear to be happening therein.  I 
> need some help to root cause where VBox is aborting the VM session.
>
> >  VBoxManage startvm "U14_ICH9_2"
> Waiting for VM "U14_ICH9_2" to power on...
> VBoxManage: error: The VM session was aborted
> VBoxManage: error: Details: code NS_ERROR_FAILURE (0x80004005), 
> component SessionMachine, interface ISession
>
> I created this VM from the VirtualBox GUI, v5.0.16, which I built from 
> the tarball at https://www.virtualbox.org/wiki/Downloads and am 
> running on an Ubuntu 14 host.  Then I switched the chipset to ICH9, 
> then I installed Ubuntu 14 as the guest. The VM runs well, until I 
> plug my virtual device model into PDM (it’s a PCIe switch with 
> downstream endpoint). After plugging in my virtual device, the VM 
> crashes as shown above.
>
> I tracked down everywhere NS_ERROR_FAILURE is mentioned in the 
> sources.  I found that DirectoryServiceProvider::GetFile() returns 
> that error twice, right away, but that is also true in the working 
> case when my device is unplugged.  In no other place is that specific 
> error ever returned or asserted.  However, I found that E_FAIL is 
> #defined to NS_ERROR_FAILURE, and there are hundreds of references to 
> E_FAIL, so I gave up trying to instrument them all.
>
> I need some help to root cause this problem.  Log files show that it 
> is getting as far as BIOS starting to initialize the switch, 
> apparently stuck in a loop doing that, but then lights out with no 
> trail that I can follow.
>
> Log files are attached.  Lines bearing the “RicV” prefix were 
> instrumented by me to investigate this problem.  Lines bearing the 
> “RemDev” prefix are coming from my PDM plug-in.
>
> Thanks,
>