VirtualBox

Ticket #8928 (new defect)

Opened 3 years ago

Last modified 7 months ago

Guru Meditation when accessing RAM (through mapping) @ 0xB8000

Reported by: mduft Owned by:
Priority: major Component: VMM
Version: VirtualBox 4.0.6 Keywords:
Cc: Guest type: other
Host type: Linux

Description

Hi!

I'm writing a little hobby kernel (sources @ [1]), and am now trying to run it on VirtualBox. It works fine on qemu, bochs and real hardware. The kernel is an x86_64 kernel. It runs perfectly on VirtualBox, with one little exception: if i try to do something on the screen, a guru meditation results (log attached).

The code causing the crash maps the VGA memory 0xB8000 to a virtual memory location (which succeeds), and then tries to write to the buffer (which crashes).
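For illustration, the kind of access described above might look roughly like the following. This is a minimal sketch, not the code from cga.c: `vga_cell` and `vga_put` are hypothetical names, and it assumes the standard 80x25 colour text mode in which each cell is 16 bits wide (character in the low byte, attribute in the high byte). The buffer pointer is a parameter, so the same code can be exercised against an ordinary array; in a kernel it would be the virtual address that physical 0xB8000 was mapped to.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_COLS 80
#define VGA_ROWS 25

/* Pack a character and an attribute byte into one 16-bit text-mode cell. */
static inline uint16_t vga_cell(char ch, uint8_t attr)
{
    return (uint16_t)((uint16_t)attr << 8 | (uint8_t)ch);
}

/*
 * Write one character at (row, col). `buf` is wherever the text buffer
 * is mapped (assumption: a mapping of physical 0xB8000 in a real kernel).
 * Returns 0 on success, -1 if the position is off screen.
 */
static inline int vga_put(volatile uint16_t *buf, size_t row, size_t col,
                          char ch, uint8_t attr)
{
    if (row >= VGA_ROWS || col >= VGA_COLS)
        return -1;
    buf[row * VGA_COLS + col] = vga_cell(ch, attr);
    return 0;
}
```

The `volatile` qualifier keeps the compiler from dropping or merging stores that it cannot see being read back.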

as i said, this works on qemu, bochs and real hardware, only vbox crashes. is this a limitation i should be aware of?

i'll attach:

  • the log for the meditation.
  • a cd image containing the kernel booted by grub2.

[1]  https://github.com/mduft/tachyon3/blob/master/src/x86/cga.c

Attachments

VBox.log (163.5 KB) - added by mduft 3 years ago.
Log of guru meditation
0001-various-bug-fixes-regarding-paging-this-fixes-boot-i.patch.bz2 (1.9 KB) - added by mduft 3 years ago.
fixes for the kernel to make it work with virtualbox
VBox.log.bz2 (34.6 KB) - added by mduft 7 months ago.
New VBox.log for Guru Meditation

Change History

Changed 3 years ago by mduft

Log of guru meditation

comment:1 Changed 3 years ago by mduft

owh - the iso is a little too large (3MB) to be attached. here is a link:  http://dev.gentoo.org/~mduft/x86_64-tachyon-grub2.iso

and i forgot to mention: i configured the machine to be linux (64 bit), as i can't seem to get virtualbox to virtualize a 64-bit environment otherwise (is an "other (64 bit)" option missing...?).

comment:2 Changed 3 years ago by michaln

You're setting bit 8 in the entries of all levels of the page tables, even where the bit is ignored, and I don't think you have CR4.PGE set anyway. Is that intentional? (Probably not the reason for the guru meditations, but suspicious.)

And yes, "other (64 bit)" is kind of missing, but a Linux guest type works just fine.

comment:3 Changed 3 years ago by mduft

bit 8 is the global bit... should that only be set on the actual page, not on all the tables in between? hmm.. maybe i need to change this, however i don't think that it causes problems (ATM), as the rest of the kernel runs fine.

CR4.PGE of course should be enabled:

94 mov %cr4, %eax # again, load CR4 95 bts $CR4_GLOBAL_PAGES, %eax # must be done after setting CR0_PAGING! 96 mov %eax, %cr4 # and do it.

hah - thanks for pointing this out. the bts should have been an or, as CR4_GLOBAL_PAGES=(1<<7) :) that's fixed now, and verified in qemu and virtualbox (however the guru meditation still persists...).

i won't attach a new vbox.log, as only CR4 changed - just read the value as 0xa0 now, please ;)
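Why `bts` with a mask goes wrong can be sketched in C. `bts` expects a bit index, and with a register destination the index is taken modulo the operand size, so passing the mask value 128 ends up setting bit 0 rather than bit 7. The helper `bts32` below is a hypothetical model of that behaviour, and the starting value of CR4 (only PAE set) is an assumption chosen to match the 0xa0 mentioned above.

```c
#include <stdint.h>

#define CR4_PAE          (1u << 5)
#define CR4_GLOBAL_PAGES (1u << 7)   /* a mask, as defined in the ticket */

/*
 * Model of `bts $imm, %eax`: with a register destination the bit
 * offset is reduced modulo the operand size (32 bits here).
 */
static inline uint32_t bts32(uint32_t reg, uint32_t bit_offset)
{
    return reg | (1u << (bit_offset % 32));
}
```

Under these assumptions `bts32(CR4_PAE, CR4_GLOBAL_PAGES)` yields 0x21 (bit 0 set, PGE still clear), while `CR4_PAE | CR4_GLOBAL_PAGES` and `bts32(CR4_PAE, 7)` both yield 0xa0.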

comment:4 Changed 3 years ago by mduft

grr... should have been (formatted):

     mov %cr4, %eax              # again, load CR4
     bts $CR4_GLOBAL_PAGES, %eax  # must be done after setting CR0_PAGING!
     mov %eax, %cr4              # and do it.

comment:5 Changed 3 years ago by michaln

Yes, the global bit should only be set on actual pages (whatever the size), although that doesn't seem to be causing real trouble.

You probably did hit a bug in VirtualBox, but it's not yet clear what's going on. You are most likely transitioning between the various paging modes in some rather unusual way.

FYI, I'm not getting guru meditations; instead the guest hangs (probably should be triple faulting) with EIP=0x00101080. Same thing on two quite different systems. I assume this is right after the guest goes into long mode.

It *may* be a problem that you're executing code between enabling paging in CR0 and doing the far call which loads the 64-bit CS.

comment:6 Changed 3 years ago by michaln

More about the G bit - the AMD manuals indicate that setting the G bit in higher level page tables is invalid; AMD says the bit must be zero, while Intel says it's ignored. Take your pick as to what that really means.
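One way to stay clear of both readings is to strip the G bit whenever an entry points at another table instead of a page. The following is a minimal sketch under that assumption; `make_entry` and the `PTE_*` names are hypothetical and not taken from the kernel in question.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT (1ull << 0)
#define PTE_WRITE   (1ull << 1)
#define PTE_PS      (1ull << 7)   /* entry maps a large page */
#define PTE_GLOBAL  (1ull << 8)

/*
 * Build a paging-structure entry. `maps_page` is true only for entries
 * that map an actual page (a PTE, or a PDE/PDPTE with PS set); for
 * entries that reference a lower-level table the G bit is cleared.
 */
static inline uint64_t make_entry(uint64_t phys, uint64_t flags, bool maps_page)
{
    if (!maps_page)
        flags &= ~PTE_GLOBAL;
    /* Bits 12..51 hold the physical address. */
    return (phys & 0x000ffffffffff000ull) | flags;
}
```

A caller can then pass the same flag set at every level and rely on the helper to keep G out of the intermediate entries.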

Also, AMD states in their system programming manual, section 14.6.1, that the instruction which enables paging and thus activates long mode must be immediately followed by a branch instruction. You are violating that requirement. The comment in your code (boot.s) says that setting the LME bit in EFER turns on 64-bit compatibility mode, but that is not true - it's the write to CR0 which enables paging that activates 64-bit compatibility mode.
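The relationship between the two bits can be modelled as a small predicate: long mode becomes active (EFER.LMA) only once EFER.LME and CR0.PG are both set, so setting LME alone changes nothing yet. This is a simplified sketch with a hypothetical helper name; it leaves out other prerequisites such as CR4.PAE.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_LME (1ull << 8)   /* long mode enable */
#define CR0_PG   (1u << 31)    /* paging enable */

/* Simplified model of EFER.LMA: active only when LME and PG are both set. */
static inline bool long_mode_active(uint64_t efer, uint32_t cr0)
{
    return (efer & EFER_LME) && (cr0 & CR0_PG);
}
```

In this model the transition happens on the CR0 write, which is why the instruction right after that write is the one the AMD manual wants to be a branch.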

The behavior I'm seeing is different depending on whether nested paging is enabled or not on an Intel CPU. With nested paging off, the guest effectively triple faults immediately after enabling long mode - it's that instruction which should be a branch but isn't. The IDT is unusable at this point, so it's not surprising the guest would crash and burn. It looks like the problem is really the same whether nested paging is enabled or not, but without nested paging it hits earlier.

I still don't understand how you managed to confuse VirtualBox with regard to paging (the guru meditation you saw). You're definitely doing something differently compared to 64-bit Windows, Linux, Solaris, BSD, OS X, etc.

And finally, your GDT is not doubleword aligned. That's not a bug, but definitely not recommended.

comment:7 Changed 3 years ago by mduft

hah - thank you very much for pointing out those issues. i wasn't aware of them. fixing those two then also fixed the guru meditation. everything seems to work fine now. thanks!

comment:8 Changed 3 years ago by mduft

is it still worth analyzing what caused the guru meditation (instead of a triple fault ... :))? otherwise this can be closed.

comment:9 Changed 3 years ago by michaln

Yes, we'd still like to understand what really happened.

In the meantime, I tried a host with AMD-V and nested paging. In that case, the behavior was different - AMD says the G bits in higher-level page tables must be zero, and a page fault is caused when they're not. So in that case, there is a deadly page fault immediately after turning on paging, since the next instruction cannot be executed. The guest may die in different ways, but it certainly isn't expected to run.

But for Intel it's different, because Intel defines the bits as ignored...

Anyway, which changes exactly made it work on VirtualBox? And do you have an updated ISO we could get?

Changed 3 years ago by mduft

fixes for the kernel to make it work with virtualbox

comment:10 Changed 3 years ago by mduft

you can find an updated iso with the kernel that boots correctly (and behaves correctly, although it doesn't live very long :)) here:  http://dev.gentoo.org/~mduft/x86_64-tachyon-grub2-new.iso

(the old one is still there, if you need it)

comment:11 Changed 7 months ago by mduft

Hey!

Now, after 2 years, i tried again with virtualbox. And again a Guru Meditation, but a different one. This time i'm seeing a "Guru Meditation -1153 (VERR_EM_INTERNAL_DISAS_ERROR)". I have uploaded a new ISO with my test kernel. It runs fine on QEMU, Bochs, VMware and real hardware. Any hint on what is wrong would be appreciated.

 http://dev.gentoo.org/~mduft/x86_64-tachyon-grub2.iso

Changed 7 months ago by mduft

New VBox.log for Guru Meditation
