VirtualBox

Ticket #3720 (new defect)

Opened 5 years ago

Last modified 5 years ago

Recompiler doesn't check selector limits

Reported by: adrianmay
Owned by:
Priority: major
Component: VMM
Version: VirtualBox 2.0.4
Keywords: GP fault, GDT, segment overrun
Cc:
Guest type: other
Host type: Linux

Description

Demo code at github.com/adrianmay/digilife master branch tag "vboxbug". Make it with 'make' and use zed.img as a boot floppy. On a real (single processor) PC this program reports a GP fault when main.c attempts to write to *0xffffffff although the data segment (GDT in bootsect.asm) is only 0xC0000 long. On VBox, the program sails on regardless. (There's no TSS and only a ring 0 task in this program.) I got similar effects with other people's OS tutorials. Code segments crash when expected.
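For readers following along, here is a minimal C sketch of the check that real hardware applies to the write described above. It is not taken from the demo repository or from VirtualBox. The descriptor value EXAMPLE_DATA_DESCRIPTOR is an assumption: a byte-granular, expand-up, writable ring 0 data segment with base 0 and length 0xC0000 (limit field 0xBFFFF); the actual GDT in bootsect.asm may be encoded differently. The names segment_t, decode_descriptor and access_within_limit are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a decoded expand-up data segment. */
typedef struct {
    uint32_t base;   /* linear base address */
    uint32_t limit;  /* last valid byte offset (segment length - 1) */
} segment_t;

/* Assumed example: base 0, limit 0xBFFFF, access byte 0x92, G = 0, D/B = 1. */
static const uint64_t EXAMPLE_DATA_DESCRIPTOR = 0x004B92000000FFFFULL;

/* Decode base and effective limit from a raw 8-byte x86 segment descriptor. */
static segment_t decode_descriptor(uint64_t raw)
{
    segment_t seg;
    /* Limit bits 0..15 live in descriptor bits 0..15, bits 16..19 in bits 48..51. */
    uint32_t limit20 = (uint32_t)(raw & 0xFFFF) | (uint32_t)((raw >> 32) & 0xF0000);
    /* The granularity flag is descriptor bit 55: set means 4 KiB units. */
    bool page_granular = ((raw >> 55) & 1) != 0;

    seg.base = (uint32_t)((raw >> 16) & 0xFFFFFF) | (uint32_t)((raw >> 32) & 0xFF000000);
    seg.limit = page_granular ? ((limit20 << 12) | 0xFFF) : limit20;
    return seg;
}

/* True when every byte of [offset, offset + size) is inside the segment.
 * Covers expand-up segments only; expand-down segments invert the test. */
static bool access_within_limit(const segment_t *seg, uint32_t offset, uint32_t size)
{
    assert(seg != NULL);
    if (size == 0) {
        return true;
    }
    /* 64-bit arithmetic so that offset + size cannot wrap around. */
    uint64_t last = (uint64_t)offset + size - 1;
    return last <= seg->limit;
}
```

Under these assumptions, access_within_limit(&seg, 0xFFFFFFFF, 4) is false for the decoded example segment, which is the case where a real CPU raises the GP fault.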

Change History

comment:1 Changed 5 years ago by sandervl73

I'm pretty sure your code is being executed by the recompiler and that one doesn't check selector limits. Known problem and not so easy to fix.

Any code executed in software virtualization mode or with VT-x/AMD-V will fault for selector limit violations.

comment:2 Changed 5 years ago by sandervl73

And to clear up any confusion resulting from my previous statement: the recompiler (QEmu) is only used for real mode, protected mode without paging and early booting with paging on. In other words only during early booting of any normal OS using software virtualization.

VT-x and AMD-V pretty much don't need the recompiler at all.

comment:3 Changed 5 years ago by adrianmay

Thanks for the very fast reply!

I need segments protected properly for my project, so how do I achieve that? I don't know what you mean by "software virtualization mode or with VT-x/AMD-V". Do I have to choose a menu option or something? Your diagnosis is probably right because I'm using protected mode without paging. Does this mean the problem will go away if I enable paging? What do you mean by "early" booting? How early? When does limit checking become active?

Adrian.

comment:4 Changed 5 years ago by sandervl73

  • Summary changed from "Allows overrun of data segment" to "Recompiler doesn't check selector limits"

If your CPU supports VT-x or AMD-V, then you can enable this option in the settings dialog of the VM. (As of 2.2.0, all new VMs have this feature enabled if support is available.)

Software virtualization (aka raw mode) is used when your CPU doesn't support VT-x/AMD-V. There are a bunch of conditions we check before going into raw mode. (see src/VBox/VMM/EM.cpp, emR3Reschedule)

comment:5 Changed 5 years ago by adrianmay

Indeed, enabling VT-x/AMD-V fixed it. But this is still a bug. It cost me about a week figuring out it was VBox's fault. Either the better mode should be switched on by default (you can detect if the HW supports it) or the QEmu mode should do limit checking, or both. Limit checking is a pretty basic feature of the architecture you're trying to emulate.

comment:6 Changed 5 years ago by frank

There is no need to blame or teach us. I can assure you that you will certainly spend some more weeks with other incompatibilities between a virtualized guest and real hardware.

It was already said that starting with VBox 2.2.0, the HW virtualization is enabled by default for all new VMs. Under some conditions it is still useful to use the non-HW virtualization mode (e.g. sometimes better performance).

comment:7 Changed 5 years ago by adrianmay

Hey, there's no need to get shirty just because I'm expressing an opinion that would have saved me a lot of time. Correct me if what little I know about the 286 is wrong, but I believe limit checking is pretty fundamental to the Intel concept, isn't it? I was also interested in the P6 performance monitoring stuff but didn't assume you emulated something that exotic, but limit checking? C'mon.

Furthermore, upgrading to the latest VBox (2.2.0 r45846) is the first thing I did when I suspected it might be a VBox problem. And it didn't fix it. The checkbox was still off. I guess that's my fault because I reopened the old VM, but out of all your customers, I'm sure I'm the one with the least reason to have done that. I've only got a 25 sector toy OS - they've got productive machines with all their drivers, photo collections, networking, etc. But I guess totally new customers will get the benefit of the upgrade.

But don't let me lecture you, I'm just a user. I'm sure you've got everything figured out.

comment:8 Changed 5 years ago by sandervl73

It is fundamental, but like I said above, it only fails to work in certain edge cases for raw mode. You just happened to hit one.

comment:9 Changed 5 years ago by adrianmay

OK. But I'm still not quite clear about one thing: if I'm using old (pre-VT-x/AMD-V) hardware to run Windows or Linux and it's up and running and word processing or whatever, then is limit checking happening among the ring 3 apps or not?

If so, then I appreciate that mine is a pretty offbeat case. I still wonder how much of a debate ensued that day when some QEmu designer said "Who gives a damn about limit checking?" and how he could possibly have won it.

But my case is solved in record time thanks to the phenomenal turnaround time on this support line.

comment:10 Changed 5 years ago by sandervl73

Limit checking is always done for ring 3 apps. In raw/software virtualization mode we run the guest OS on bare metal, so the CPU performs the checks.

I think the problem with limit checking isn't so much a matter of 'who gives a damn', but rather that it's kind of expensive to check the selector limit for each memory access in software. I agree that it's a questionable decision.
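To make the cost argument concrete, here is a toy C sketch of a software store path. It is not QEmu or VirtualBox code; toy_cpu_t, emu_store8 and the check_limits switch are hypothetical names invented for illustration. The point is only that the checked path adds a compare and a branch to every emulated memory access, while the unchecked path lets an out-of-limit write go through silently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { GUEST_MEM_SIZE = 16 };  /* toy guest memory, far smaller than a real one */

typedef enum {
    EMU_OK = 0,
    EMU_FAULT_GP = 13          /* general-protection fault is vector 13 on x86 */
} emu_status_t;

/* Hypothetical emulator state: one flat data segment with a limit. */
typedef struct {
    uint8_t  mem[GUEST_MEM_SIZE];
    uint32_t ds_limit;         /* last valid offset in the data segment */
    bool     check_limits;     /* false models an emulator that skips the check */
} toy_cpu_t;

/* Emulated one-byte store through the data segment. */
static emu_status_t emu_store8(toy_cpu_t *cpu, uint32_t offset, uint8_t value)
{
    assert(cpu != NULL);
    /* This compare-and-branch is the per-access overhead being discussed. */
    if (cpu->check_limits && offset > cpu->ds_limit) {
        return EMU_FAULT_GP;
    }
    /* The modulo only keeps the toy array access in bounds. */
    cpu->mem[offset % GUEST_MEM_SIZE] = value;
    return EMU_OK;
}
```

With ds_limit set to 7, a store to offset 12 returns EMU_FAULT_GP when check_limits is true and silently modifies memory when it is false, which mirrors the behaviour reported in this ticket.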

comment:11 Changed 5 years ago by adrianmay

Hi again,

Seeing as you recognised the other case pretty fast, I thought I'd try and see if you can go straight to the heart of this one too, or maybe it needs a new ticket.

This time, I'm trying to have the keyboard interrupt handled by a task, so I've made TSSes for the interrupted (ring 3) task and the keyboard handler, and made a task gate pointing to the TSS for the handler, and put an identical task gate into the IDT. It works on a real PC (only for the first keypress but I know why.) In VirtualBox, I can jump to my task either straight to the TSS descriptor, or via the task gate in the GDT. Also, I can handle the same interrupt as a procedure, either in the same privilege level as the interrupted task or at ring 0 on the alternate stack. (Obviously, I need to add and remove INTRs and loops to switch between those options.) But if I press a key while the IDT entry contains the task gate, it just freezes. No guru meditation, GP fault or anything. The code is the latest of the main branch of github.org/adrianmay/digilife.
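For anyone trying to reproduce this, here is a small C sketch of how a 32-bit task gate descriptor is laid out, as a way to sanity-check the IDT entry. It does not diagnose the freeze and is not taken from the digilife code. The helper make_task_gate is a hypothetical name and the selector 0x28 in the usage note is an arbitrary assumed value.

```c
#include <assert.h>
#include <stdint.h>

/* Build a raw 8-byte task gate descriptor (usable in the GDT, LDT or IDT).
 * Layout: bits 16..31 hold the TSS selector, bits 40..44 the type 00101b,
 * bits 45..46 the DPL and bit 47 the present flag. Other bits are reserved. */
static uint64_t make_task_gate(uint16_t tss_selector, unsigned dpl)
{
    assert(dpl <= 3);
    /* Access byte: present (0x80) | DPL << 5 | task gate type (0x05). */
    uint64_t access = 0x80u | ((uint64_t)dpl << 5) | 0x05u;
    return ((uint64_t)tss_selector << 16) | (access << 40);
}
```

For example, make_task_gate(0x28, 0) yields 0x0000850000280000, a present ring 0 task gate referring to the TSS descriptor at selector 0x28.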

Thanks in advance, Adrian.
