VirtualBox

Changeset 5017

Timestamp:
09/24/07 23:38:20
Author:
vboxsync
Message:

GVM kick-off.

Files:

  • trunk/src/VBox/VMM/PGM.cpp

    r4738 r5017  
    108108 
    109109/** @page pg_pgmPhys PGMPhys - Physical Guest Memory Management. 
    110  *  
    111  *  
     110 * 
     111 * 
    112112 * Objectives: 
    113  *      - Guest RAM over-commitment using memory ballooning,  
     113 *      - Guest RAM over-commitment using memory ballooning, 
    114114 *        zero pages and general page sharing. 
    115115 *      - Moving or mirroring a VM onto a different physical machine. 
    116116 * 
    117  *  
     117 * 
    118118 * @subsection subsec_pgmPhys_Definitions       Definitions 
    119  *  
    120  * Allocation chunk - A RTR0MemObjAllocPhysNC object and the tracking  
    121  * machinery associated with it.  
    122  *  
    123  *  
    124  *  
    125  *  
     119 * 
     120 * Allocation chunk - A RTR0MemObjAllocPhysNC object and the tracking 
     121 * machinery associated with it. 
     122 * 
     123 * 
     124 * 
     125 * 
    126126 * @subsection subsec_pgmPhys_AllocPage         Allocating a page. 
    127  *  
    128  * Initially we map *all* guest memory to the (per VM) zero page, which  
     127 * 
     128 * Initially we map *all* guest memory to the (per VM) zero page, which 
    129129 * means that none of the read functions will cause pages to be allocated. 
    130  *  
     130 * 
    131131 * Exception, access bit in page tables that have been shared. This must 
    132  * be handled, but we must also make sure PGMGst*Modify doesn't make  
     132 * be handled, but we must also make sure PGMGst*Modify doesn't make 
    133133 * unnecessary modifications. 
    134  *  
     134 * 
    135135 * Allocation points: 
    136136 *      - PGMPhysWriteGCPhys and PGMPhysWrite. 
     
    139139 *      - ROM registration (currently MMR3RomRegister). 
    140140 *      - VM restore (pgmR3Load). 
    141  *  
    142  * For the first three it would make sense to keep a few pages handy  
    143  * until we've reached the max memory commitment for the VM.  
    144  *  
    145  * For the ROM registration, we know exactly how many pages we need  
    146  * and will request these from ring-0. For restore, we will save  
     141 * 
     142 * For the first three it would make sense to keep a few pages handy 
     143 * until we've reached the max memory commitment for the VM. 
     144 * 
     145 * For the ROM registration, we know exactly how many pages we need 
     146 * and will request these from ring-0. For restore, we will save 
    147147 * the number of non-zero pages in the saved state and allocate 
    148148 * them up front. This would allow the ring-0 component to refuse 
    149149 * the request if there isn't sufficient memory available for VM use. 
    150  *  
    151  * Btw. for both ROM and restore allocations we won't be requiring  
     150 * 
     151 * Btw. for both ROM and restore allocations we won't be requiring 
    152152 * zeroed pages as they are going to be filled instantly. 
    153153 * 
    154154 * 
    155155 * @subsection subsec_pgmPhys_FreePage          Freeing a page 
    156  *  
     156 * 
    157157 * There are a few points where a page can be freed: 
    158158 *      - After being replaced by the zero page. 
     
    161161 *      - At reset. 
    162162 *      - At restore. 
    163  *  
     163 * 
    164164 * When freeing one or more pages they will be returned to the ring-0 
    165165 * component and replaced by the zero page. 
     
    167167 * The reasoning for clearing out all the pages on reset is that it will 
    168168 * return us to the exact same state as on power on, and may thereby help 
    169  * us reduce the memory load on the system. Further it might have a  
     169 * us reduce the memory load on the system. Further it might have a 
    170170 * (temporary) positive influence on memory fragmentation (@see subsec_pgmPhys_Fragmentation). 
    171  *  
    172  * On restore, as mentioned under the allocation topic, pages should be  
     171 * 
     172 * On restore, as mentioned under the allocation topic, pages should be 
    173173 * freed / allocated depending on how many are actually required by the 
    174174 * new VM state. The simplest approach is to do like on reset, and free 
    175175 * all non-ROM pages and then allocate what we need. 
    176  *  
     176 * 
    177177 * A measure to prevent some fragmentation, would be to let each allocation 
    178178 * chunk have some affinity towards the VM having allocated the most pages 
    179179 * from it. Also, try make sure to allocate from allocation chunks that 
    180180 * are almost full. Admittedly, both these measures might work counter to 
    181  * our intentions and it's probably not worth putting a lot of effort,  
     181 * our intentions and it's probably not worth putting a lot of effort, 
    182182 * cpu time or memory into this. 
    183  *  
    184  *  
     183 * 
     184 * 
    185185 * @subsection subsec_pgmPhys_SharePage         Sharing a page 
    186  *  
    187  * The basic idea is that there will be an idle priority kernel  
    188  * thread walking the non-shared VM pages hashing them and looking for  
    189  * pages with the same checksum. If such pages are found, it will compare  
     186 * 
     187 * The basic idea is that there will be an idle priority kernel 
     188 * thread walking the non-shared VM pages hashing them and looking for 
     189 * pages with the same checksum. If such pages are found, it will compare 
    190190 * them byte-by-byte to see if they actually are identical. If found to be 
    191  * identical it will allocate a shared page, copy the content, check that  
     191 * identical it will allocate a shared page, copy the content, check that 
    192192 * the page didn't change while doing this, and finally request both the 
    193  * VMs to use the shared page instead. If the page is all zeros (special  
    194  * checksum and byte-by-byte check) it will request the VM that owns it  
     193 * VMs to use the shared page instead. If the page is all zeros (special 
     194 * checksum and byte-by-byte check) it will request the VM that owns it 
    195195 * to replace it with the zero page. 
    196  *  
     196 * 
    197197 * To make this efficient, we will have to make sure not to try share a page 
    198198 * that will change its contents soon. This part requires the most work. 
    199  * A simple idea would be to request the VM to write monitor the page for  
     199 * A simple idea would be to request the VM to write monitor the page for 
    200200 * a while to make sure it isn't modified any time soon. Also, it may 
    201201 * make sense to skip pages that are being write monitored since this 
    202  * information is readily available to the thread if it works on the  
     202 * information is readily available to the thread if it works on the 
    203203 * per-VM guest memory structures (presently called PGMRAMRANGE). 
    204  *  
    205  *  
     204 * 
     205 * 
    206206 * @subsection subsec_pgmPhys_Fragmentation     Fragmentation Concerns and Counter Measures 
    207  *  
     207 * 
    208208 * The pages are organized in allocation chunks in ring-0, this is a necessity 
    209  * if we wish to have an OS agnostic approach to this whole thing. (On Linux we  
     209 * if we wish to have an OS agnostic approach to this whole thing. (On Linux we 
    210210 * could easily work on a page-by-page basis if we liked. Whether this is possible 
    211  * or efficient on NT I don't quite know.) Fragmentation within these chunks may  
     211 * or efficient on NT I don't quite know.) Fragmentation within these chunks may 
    212212 * become a problem as part of the idea here is that we wish to return memory to 
    213  * the host system.  
    214  *  
     213 * the host system. 
     214 * 
    215215 * For instance, starting two VMs at the same time, they will both allocate the 
    216  * guest memory on-demand and if permitted their page allocations will be  
    217  * intermixed. Shut down one of the two VMs and it will be difficult to return  
    218  * any memory to the host system because the page allocation for the two VMs are  
     216 * guest memory on-demand and if permitted their page allocations will be 
     217 * intermixed. Shut down one of the two VMs and it will be difficult to return 
     218 * any memory to the host system because the page allocation for the two VMs are 
    219219 * mixed up in the same allocation chunks. 
    220  *  
    221  * To further complicate matters, when pages are freed because they have been  
     220 * 
     221 * To further complicate matters, when pages are freed because they have been 
    222222 * ballooned or become shared/zero the whole idea is that the page is supposed 
    223223 * to be reused by another VM or returned to the host system. This will cause 
    224224 * allocation chunks to contain pages belonging to different VMs and prevent 
    225225 * returning memory to the host when one of those VM shuts down. 
    226  *  
    227  * The only way to really deal with this problem is to move pages. This can  
    228  * either be done at VM shutdown and or by the idle priority worker thread  
     226 * 
     227 * The only way to really deal with this problem is to move pages. This can 
     228 * either be done at VM shutdown and or by the idle priority worker thread 
    229229 * that will be responsible for finding sharable/zero pages. The mechanisms 
    230  * involved for coercing a VM to move a page (or to do it for it) will be  
     230 * involved for coercing a VM to move a page (or to do it for it) will be 
    231231 * the same as when telling it to share/zero a page. 
    232232 * 
    233  *  
     233 * 
    234234 * @subsection subsec_pgmPhys_Tracking      Tracking Structures And Their Cost 
    235  *  
    236  * There's a difficult balance between keeping the per-page tracking structures  
    237  * (global and guest page) easy to use and keeping them from eating too much  
     235 * 
     236 * There's a difficult balance between keeping the per-page tracking structures 
     237 * (global and guest page) easy to use and keeping them from eating too much 
    238238 * memory. We have limited virtual memory resources available when operating in 
    239  * 32-bit kernel space (on 64-bit it's quite a different story). The  
     239 * 32-bit kernel space (on 64-bit it's quite a different story). The 
    240240 * tracking structures will be designed with the aim that we can deal with up 
    241241 * to 32GB of memory on a 32-bit system and essentially unlimited on 64-bit ones. 
    242  *  
    243  *  
     242 * 
     243 * 
    244244 * @subsubsection subsubsec_pgmPhys_Tracking_Kernel     Kernel Space 
    245  *  
    246  * The allocation chunks are of fixed sized, the size defined at build time.  
    247  * Each chunk is given a unique ID. Each page can be addressed by 
    248  * (idChunk << CHUNK_SHIFT) | iPage, where CHUNK_SHIFT is log2(cbChunk / PAGE_SIZE).  
    249  * Meaning that each page has a unique ID, a sort of virtual page frame number 
    250  * if you like, so that a page can be referenced to in an efficient manner.  
    251  * No surprise, the allocation chunks are organized in an AVL tree with  
    252  * their IDs being the key. 
    253  *  
    254  * The physical address of each page in an allocation chunk is maintained by  
    255  * the RTR0MEMOBJ and obtained using RTR0MemObjGetPagePhysAddr. There is no  
    256  * need to duplicate this information unnecessarily.  
    257  *  
    258  * We wish to maintain a reference to the VM owning the page. For the purposes 
    259  * of defragmenting allocation chunks, it would make sense to keep track of  
    260  * which page within the VM that it's being used as, although this will  
    261  * obviously make the handy pages a wee more work to realize. For shared  
    262  * pages we need a reference count so we know when to free the page. But tracking 
    263  * which VMs are using shared pages will be too complicated and expensive, so we'll 
    264  * just forget about it. And finally, free pages need to be chained somehow, 
    265  * so we can do allocations in an efficient manner. 
    266  *  
    267  * Putting shared pages in dedicated allocation chunks will simplify matters 
    268  * quite a bit. It will more or less eliminate the problem with defragmenting 
    269  * shared pages, by arranging it so that we will never encounter shared pages 
    270  * and normal pages in the same allocation chunks. And it will I think permit 
    271  * us to get away with a 32-bit field for each page. 
    272  *  
    273  * We'll chain the free pages using this field to indicate the index of the  
    274  * next page. (I'm undecided whether this chain should be on a per-chunk  
    275  * level or not, it depends a bit on whether it's desirable to keep chunks 
    276  * with free pages in a priority list by free page count (ascending) in order  
    277  * to maximize the number of full chunks.) In any case, there'll be two free  
    278  * lists, one for shared pages and one for normal pages. 
    279  *  
    280  * Shared pages that have been allocated will use the 32-bit field for keeping 
    281  * the reference counter. 
    282  *  
    283  * Normal pages that have been allocated will use the first 24 bits for guest 
    284  * page frame number (i.e. shift by PAGE_SHIFT and you'll have the physical  
    285  * address, all 24-bit set means unknown or out of range). The top 8 bits will 
    286  * be used as VM handle index - we assign each VM a unique handle [0..255] for 
    287  * this purpose. This implies a max of 256 VMs and 64GB of base RAM per VM.  
    288  * Neither limits should cause any trouble for the time being. 
    289  *  
    290  * The per page cost in kernel space is 32-bit plus whatever RTR0MEMOBJ  
    291  * entails. In addition there is the chunk cost of approximately 
    292  * (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^CHUNK_SHIFT bytes per page. 
    293  *  
    294  * On Windows the per page RTR0MEMOBJ cost is 32-bit on 32-bit windows  
    295  * and 64-bit on 64-bit windows (a PFN_NUMBER in the MDL). So, 64-bit per page. 
    296  * The cost on Linux is identical, but here it's because of sizeof(struct page *). 
    297  *  
     245 * 
     246 * @see pg_GMM 
    298247 * 
    299248 * @subsubsection subsubsec_pgmPhys_Tracking_PerVM      Per-VM 
    300  *  
    301  * Fixed info is the physical address of the page (HCPhys) and the page id  
     249 * 
     250 * Fixed info is the physical address of the page (HCPhys) and the page id 
    302251 * (described above). Theoretically we'll need 48(-12) bits for the HCPhys part. 
    303252 * Today we're restricting ourselves to 40(-12) bits because this is the current 
    304  * restrictions of all AMD64 implementations (I think Barcelona will up this  
    305  * to 48(-12) bits, not that it really matters) and I needed the bits for  
     253 * restrictions of all AMD64 implementations (I think Barcelona will up this 
     254 * to 48(-12) bits, not that it really matters) and I needed the bits for 
    306255 * tracking mappings of a page. 48-12 = 36. That leaves 28 bits, which means a 
    307256 * decent range for the page id: 2^(28+12) = 1024TB. 
    308  *  
    309  * In addition to these, we'll have to keep maintaining the page flags as we  
    310  * currently do. Although it wouldn't harm to optimize these quite a bit, like  
     257 * 
     258 * In addition to these, we'll have to keep maintaining the page flags as we 
     259 * currently do. Although it wouldn't harm to optimize these quite a bit, like 
    311260 * for instance the ROM shouldn't depend on having a write handler installed 
    312261 * in order for it to become read-only. A RO/RW bit should be considered so 
    313  * that the page syncing code doesn't have to mess about checking multiple  
     262 * that the page syncing code doesn't have to mess about checking multiple 
    314263 * flag combinations (ROM || RW handler || write monitored) in order to 
    315  * figure out how to setup a shadow PTE. But this of course, is second  
     264 * figure out how to setup a shadow PTE. But this of course, is second 
    316265 * priority at present. Currently this requires 12 bits, but could probably 
    317266 * be optimized to ~8. 
    318  *  
    319  * Then there's the 24 bits used to track which shadow page tables are  
    320  * currently mapping a page for the purpose of speeding up physical  
    321  * access handlers, and thereby the page pool cache. More bits for this  
     267 * 
     268 * Then there's the 24 bits used to track which shadow page tables are 
     269 * currently mapping a page for the purpose of speeding up physical 
     270 * access handlers, and thereby the page pool cache. More bits for this 
    322271 * purpose wouldn't hurt IIRC. 
    323  *  
     272 * 
    324273 * Then there is a new bit in which we need to record what kind of page 
    325  * this is, shared, zero, normal or write-monitored-normal. This'll  
    326  * require 2 bits. One bit might be needed for indicating whether a  
     274 * this is, shared, zero, normal or write-monitored-normal. This'll 
     275 * require 2 bits. One bit might be needed for indicating whether a 
    327276 * write monitored page has been written to. And yet another one or 
    328277 * two for tracking migration status. 3-4 bits total then. 
    329  *  
     278 * 
    330279 * Whatever is left can be used to record the shareability of a 
    331280 * page. The page checksum will not be stored in the per-VM table as 
    332  * the idle thread will not be permitted to do modifications to it.  
     281 * the idle thread will not be permitted to do modifications to it. 
    333282 * It will instead have to keep its own working set of potentially 
    334283 * shareable pages and their check sums and stuff. 
    335  *  
    336  * For the present we'll keep the current packing of the  
     284 * 
     285 * For the present we'll keep the current packing of the 
    337286 * PGMRAMRANGE::aHCPhys to keep the changes simple, only of course, 
    338  * we'll have to change it to a struct with a total of 128-bits at  
     287 * we'll have to change it to a struct with a total of 128-bits at 
    339288 * our disposal. 
    340  *  
     289 * 
    341290 * The initial layout will be like this: 
    342291 * @verbatim 
    343     RTHCPHYS HCPhys;            The current stuff.        
     292    RTHCPHYS HCPhys;            The current stuff. 
    344293        63:40                   Current shadow PT tracking stuff. 
    345294        39:12                   The physical page frame number. 
     
    351300    uint32_t u32Reserved;       Reserved for later, mostly sharing stats. 
    352301 @endverbatim 
    353  *  
     302 * 
    354303 * The final layout will be something like this: 
    355304 * @verbatim 
    356     RTHCPHYS HCPhys;            The current stuff.        
     305    RTHCPHYS HCPhys;            The current stuff. 
    357306        63:48                   High page id (12+). 
    358307        47:12                   The physical page frame number. 
     
    367316    uint32_t u32Tracking;       The shadow PT tracking stuff, roughly. 
    368317 @endverbatim 
    369  *  
    370  * Cost wise, this means we'll double the cost for guest memory. There isn't any way  
     318 * 
     319 * Cost wise, this means we'll double the cost for guest memory. There isn't any way 
    371320 * around that I'm afraid. It means that the cost of dealing out 32GB of memory 
    372  * to one or more VMs is: (32GB >> PAGE_SHIFT) * 16 bytes, or 128MBs. Or another  
     321 * to one or more VMs is: (32GB >> PAGE_SHIFT) * 16 bytes, or 128MBs. Or another 
    373322 * example, the VM heap cost when assigning 1GB to a VM will be: 4MB. 
    374  *  
    375  * A couple of cost examples for the total cost per-VM + kernel.  
     323 * 
     324 * A couple of cost examples for the total cost per-VM + kernel. 
    376325 * 32-bit Windows and 32-bit linux: 
    377326 *      1GB guest ram, 256K pages:  4MB +  2MB(+) =   6MB 
     
    382331 *      4GB guest ram, 1M pages:   16MB + 12MB(+) =  28MB 
    383332 *     32GB guest ram, 8M pages:  128MB + 96MB(+) = 224MB 
    384  *  
    385  *  
     333 * 
     334 * 
    386335 * @subsection subsec_pgmPhys_Serializing       Serializing Access 
    387  *  
     336 * 
    388337 * Initially, we'll try a simple scheme: 
    389  *  
    390  *      - The per-VM RAM tracking structures (PGMRAMRANGE) are only modified  
     338 * 
     339 *      - The per-VM RAM tracking structures (PGMRAMRANGE) are only modified 
    391340 *        by the EMT thread of that VM while in the pgm critsect. 
    392341 *      - Other threads in the VM process that need to make reliable use of 
     
    397346 *        data when performing its tasks as the EMT thread will be the one to 
    398347 *        do the actual changes later anyway. So, as long as it only accesses 
    399  *        the main ram range, it can do so by somehow preventing the VM from  
    400  *        being destroyed while it works on it...  
    401  *  
     348 *        the main ram range, it can do so by somehow preventing the VM from 
     349 *        being destroyed while it works on it... 
     350 * 
    402351 *      - The over-commitment management, including the allocating/freeing 
    403352 *        chunks, is serialized by a ring-0 mutex lock (a fast one since the 
    404353 *        more mundane mutex implementation is broken on Linux). 
    405  *      - A separate mutex is protecting the set of allocation chunks so  
    406  *        that pages can be shared or/and freed up while some other VM is  
    407  *        allocating more chunks. This mutex can be taken from under the other  
     354 *      - A separate mutex is protecting the set of allocation chunks so 
     355 *        that pages can be shared or/and freed up while some other VM is 
     356 *        allocating more chunks. This mutex can be taken from under the other 
    408357 *        one, but not the other way around. 
    409  *  
    410  *  
     358 * 
     359 * 
    411360 * @subsection subsec_pgmPhys_Request           VM Request interface 
    412  *  
     361 * 
    413362 * When in ring-0 it will become necessary to send requests to a VM so it can 
    414363 * for instance move a page while defragmenting during VM destroy. The idle 
    415  * thread will make use of this interface to request VMs to setup shared  
     364 * thread will make use of this interface to request VMs to setup shared 
    416365 * pages and to perform write monitoring of pages. 
    417  *  
    418  * I would propose an interface similar to the current VMReq interface, similar  
    419  * in that it doesn't require locking and that the one sending the request may  
    420  * wait for completion if it wishes to. This shouldn't be very difficult to  
     366 * 
     367 * I would propose an interface similar to the current VMReq interface, similar 
     368 * in that it doesn't require locking and that the one sending the request may 
     369 * wait for completion if it wishes to. This shouldn't be very difficult to 
    421370 * realize. 
    422371 * 
     
    426375 *      -# Update all shadow page tables involved with the page. 
    427376 * 
    428  * The 3rd step is identical to what we're already doing when updating a  
     377 * The 3rd step is identical to what we're already doing when updating a 
    429378 * physical handler, see pgmHandlerPhysicalSetRamFlagsAndFlushShadowPTs. 
    430  *  
    431  *  
    432  *  
     379 * 
     380 * 
     381 * 
    433382 * @section sec_pgmPhys_MappingCaches   Mapping Caches 
    434  *  
    435  * In order to be able to map in and out memory and to be able to support  
     383 * 
     384 * In order to be able to map in and out memory and to be able to support 
    436385 * guests with more RAM than we've got virtual address space, we'll be employing 
    437386 * a mapping cache. There is already a tiny one for GC (see PGMGCDynMapGCPageEx) 
    438387 * and we'll create a similar one for ring-0 unless we decide to set up a dedicated 
    439388 * memory context for the HWACCM execution. 
    440  *  
    441  *  
     389 * 
     390 * 
    442391 * @subsection subsec_pgmPhys_MappingCaches_R3  Ring-3 
    443  *  
     392 * 
    444393 * We've considered implementing the ring-3 mapping cache page based but found 
    445  * that this was bothersome when one had to take into account TLBs+SMP and  
    446  * portability (missing the necessary APIs on several platforms). There were  
    447  * also some performance concerns with this approach which hadn't quite been  
     394 * that this was bothersome when one had to take into account TLBs+SMP and 
     395 * portability (missing the necessary APIs on several platforms). There were 
     396 * also some performance concerns with this approach which hadn't quite been 
    448397 * worked out. 
    449  *  
     398 * 
    450399 * Instead, we'll be mapping allocation chunks into the VM process. This simplifies 
    451  * matters quite a bit since we don't need to invent any new ring-0 stuff,  
     400 * matters quite a bit since we don't need to invent any new ring-0 stuff, 
    452401 * only some minor RTR0MEMOBJ mapping stuff. The main concern here, compared 
    453402 * to the previous idea, is that mapping or unmapping a 1MB chunk is more 
    454  * costly than a single page, although how much more costly is uncertain. We'll  
     403 * costly than a single page, although how much more costly is uncertain. We'll 
    455404 * try address this by using a very big cache, preferably bigger than the actual 
    456  * VM RAM size if possible. The current VM RAM sizes should give some idea for  
    457  * 32-bit boxes, while on 64-bit we can probably get away with employing an  
     405 * VM RAM size if possible. The current VM RAM sizes should give some idea for 
     406 * 32-bit boxes, while on 64-bit we can probably get away with employing an 
    458407 * unlimited cache. 
    459408 * 
    460409 * The cache has two parts, as already indicated, the ring-3 side and the 
    461  * ring-0 side.  
    462  *  
    463  * The ring-0 will be tied to the page allocator since it will operate on the  
    464  * memory objects it contains. It will therefore require the first ring-0 mutex  
     410 * ring-0 side. 
     411 * 
     412 * The ring-0 will be tied to the page allocator since it will operate on the 
     413 * memory objects it contains. It will therefore require the first ring-0 mutex 
    465414 * discussed in @ref subsec_pgmPhys_Serializing. We get 
    466  * some double housekeeping wrt who has mapped what I think, since both  
     415 * some double housekeeping wrt who has mapped what I think, since both 
    467416 * VMMR0.r0 and RTR0MemObj will keep track of mapping relations. 
    468  *  
    469  * The ring-3 part will be protected by the pgm critsect. For simplicity, we'll  
    470  * require anyone that desires to do changes to the mapping cache to do that  
    471  * from within this critsect. Alternatively, we could employ a separate critsect  
     417 * 
     418 * The ring-3 part will be protected by the pgm critsect. For simplicity, we'll 
     419 * require anyone that desires to do changes to the mapping cache to do that 
     420 * from within this critsect. Alternatively, we could employ a separate critsect 
    472421 * for serializing changes to the mapping cache as this would reduce potential 
    473422 * contention with other threads accessing mappings unrelated to the changes 
     
    475424 * up in the statistics anyway, so it'll be simple to tell. 
    476425 * 
    477  * The organization of the ring-3 part will be very much like how the allocation  
     426 * The organization of the ring-3 part will be very much like how the allocation 
    478427 * chunks are organized in ring-0, that is in an AVL tree by chunk id. To avoid 
    479428 * having to walk the tree all the time, we'll have a couple of lookaside entries 
     
    486435 *      -# Check the lookaside entries and then the AVL tree for the Chunk ID. 
    487436 *         If not found in cache: 
    488  *              -# Call ring-0 and request it to be mapped and supply  
     437 *              -# Call ring-0 and request it to be mapped and supply 
    489438 *                 a chunk to be unmapped if the cache is maxed out already. 
    490  *              -# Insert the new mapping into the AVL tree (id + R3 address).  
     439 *              -# Insert the new mapping into the AVL tree (id + R3 address). 
    491440 *      -# Update the relevant lookaside entry and return the mapping address. 
    492441 *      -# Do the read/write according to monitoring flags and everything. 
    493442 *      -# Leave the critsect. 
    494443 * 
    495  *  
     444 * 
    496445 * @section sec_pgmPhys_Fallback            Fallback 
    497  *  
     446 * 
    498447 * Currently all the "second tier" hosts will not support the RTR0MemObjAllocPhysNC 
    499448 * API and thus require a fallback. 
    500  *  
     449 * 
    501450 * So, when RTR0MemObjAllocPhysNC returns VERR_NOT_SUPPORTED the page allocator 
    502451 * will return to the ring-3 caller (and later ring-0) and ask it to seed 
    503452 * the page allocator with some fresh pages (VERR_GMM_SEED_ME). Ring-3 will 
    504  * then perform an SUPPageAlloc(cbChunk >> PAGE_SHIFT) call and make a  
     453 * then perform an SUPPageAlloc(cbChunk >> PAGE_SHIFT) call and make a 
    505454 * "SeededAllocPages" call to ring-0. 
    506  *  
     455 * 
    507456 * The first time ring-0 sees the VERR_NOT_SUPPORTED failure it will disable 
    508457 * all page sharing (zero page detection will continue). It will also force 
    509  * all allocations to come from the VM which seeded the page. Both these  
     458 * all allocations to come from the VM which seeded the page. Both these 
    510459 * measures are taken to make sure that there will never be any need for 
    511460 * mapping anything into ring-3 - everything will be mapped already. 
    512461 * 
    513  * Whether we'll continue to use the current MM locked memory management  
     462 * Whether we'll continue to use the current MM locked memory management 
    514463 * for this I don't quite know (I'd prefer not to and just ditch that 
    515464 * altogether), we'll see what's simplest to do. 
    516  *  
    517  *  
    518  *  
     465 * 
     466 * 
     467 * 
    519468 * @section sec_pgmPhys_Changes             Changes 
    520  *  
     469 * 
    521470 * Breakdown of the changes involved? 
    522471 */ 
     
    925874        return rc; 
    926875 
    927     /*  
     876    /* 
    928877     * Initialize the PGM critical section and flush the phys TLBs 
    929878     */ 
     
    20461995                    pRam->GCPhys, pRam->GCPhysLast, pRam->cb, pRam->pvHC ? "bits" : "nobits", 
    20471996                    GCPhys, GCPhysLast, cb, fHaveBits ? "bits" : "nobits")); 
    2048             /*  
    2049              * If we're loading a state for debugging purpose, don't make a fuss if  
     1997            /* 
     1998             * If we're loading a state for debugging purpose, don't make a fuss if 
    20501999             * the MMIO[2] and ROM stuff isn't 100% right, just skip the mismatches. 
    20512000             */ 
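
Note on the per-VM cost figures in the pg_pgmPhys comment above: the arithmetic "(32GB >> PAGE_SHIFT) * 16 bytes, or 128MB" can be checked with a small sketch. This is only an illustration, not code from the changeset. It assumes 4 KiB pages (PAGE_SHIFT = 12), takes the 16 bytes from the 128-bit per-page struct the comment proposes, and all names prefixed with "sketch" or "SKETCH_" are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86/AMD64 hosts. */
#define SKETCH_PAGE_SHIFT       12
/* The 128-bit (16 byte) per-page struct proposed in the comment. */
#define SKETCH_PER_PAGE_BYTES   16

/* Number of guest pages needed to back cbRam bytes of guest RAM. */
static uint64_t sketchPageCount(uint64_t cbRam)
{
    /* The sketch only handles page-aligned RAM sizes. */
    assert((cbRam & ((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1)) == 0);
    return cbRam >> SKETCH_PAGE_SHIFT;
}

/* Per-VM tracking cost in bytes for cbRam bytes of guest RAM. */
static uint64_t sketchPerVmTrackingCost(uint64_t cbRam)
{
    return sketchPageCount(cbRam) * SKETCH_PER_PAGE_BYTES;
}
```

Under these assumptions, 1GB of guest RAM gives 256K pages and 4MB of per-VM tracking data, and 32GB gives 8M pages and 128MB, matching the per-VM column of the cost table in the comment. The kernel-side column is not modelled here.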
  • trunk/src/VBox/VMM/PGMInternal.h

    r4978 r5017  
    3131#include <VBox/dbgf.h> 
    3232#include <VBox/log.h> 
     33#include <VBox/gmm.h> 
    3334#include <iprt/avl.h> 
    3435#include <iprt/assert.h> 
     
    550551                                        do { (pPage)->HCPhys = (((pPage)->HCPhys) & UINT64_C(0xffff000000000fff)) \ 
    551552                                                             | ((_HCPhys) & UINT64_C(0x0000fffffffff000)); } while (0) 
    552  
    553 /** The chunk shift. (2^20 = 1 MB) */ 
    554 #define GMM_CHUNK_SHIFT                 20 
    555 /** The allocation chunk size. */ 
    556 #define GMM_CHUNK_SIZE                  (1U << GMM_CHUNK_SHIFT) 
    557 /** The shift factor for converting a page id into a chunk id. */ 
    558 #define GMM_CHUNKID_SHIFT               (GMM_CHUNK_SHIFT - PAGE_SHIFT) 
    559 /** The NIL Chunk ID value. */ 
    560 #define NIL_GMM_CHUNKID                 0 
    561 /** The NIL Page ID value. */ 
    562 #define NIL_GMM_PAGEID                  0 
    563553 
    564554/** 
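
Note on the GMM_* macros removed from PGMInternal.h above: together with the "(idChunk << CHUNK_SHIFT) | iPage" description removed from PGM.cpp, they suggest how a page ID is composed from a chunk ID and a page index. The following is a minimal sketch of that composition, not the actual GMM implementation. It assumes PAGE_SHIFT is 12, reuses the macro values shown in the removed lines, and the sketch* helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages (the real PAGE_SHIFT comes from the VBox headers). */
#define SKETCH_PAGE_SHIFT       12
/* Values as shown in the removed lines: 2^20 = 1 MB chunks. */
#define GMM_CHUNK_SHIFT         20
#define GMM_CHUNK_SIZE          (1U << GMM_CHUNK_SHIFT)
#define GMM_CHUNKID_SHIFT       (GMM_CHUNK_SHIFT - SKETCH_PAGE_SHIFT)
#define NIL_GMM_CHUNKID         0
#define NIL_GMM_PAGEID          0
/* Hypothetical helper constant: pages per chunk (256 with the values above). */
#define SKETCH_PAGES_PER_CHUNK  (1U << GMM_CHUNKID_SHIFT)

/* Compose a page ID from a chunk ID and a page index within the chunk. */
static uint32_t sketchMakePageId(uint32_t idChunk, uint32_t iPage)
{
    assert(iPage < SKETCH_PAGES_PER_CHUNK);
    /* The chunk ID must fit in the bits left above the page index. */
    assert(idChunk < (UINT32_C(1) << (32 - GMM_CHUNKID_SHIFT)));
    return (idChunk << GMM_CHUNKID_SHIFT) | iPage;
}

/* Recover the chunk ID from a page ID. */
static uint32_t sketchPageIdToChunkId(uint32_t idPage)
{
    return idPage >> GMM_CHUNKID_SHIFT;
}

/* Recover the page index within the chunk from a page ID. */
static uint32_t sketchPageIdToPageIndex(uint32_t idPage)
{
    return idPage & (SKETCH_PAGES_PER_CHUNK - 1);
}
```

With both NIL values defined as 0, a scheme like this only keeps every valid page ID distinct from NIL_GMM_PAGEID if real chunk IDs start at 1. Whether GMM does so is not shown in this changeset.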

© 2008 Sun Microsystems, Inc.