VirtualBox

Ticket #20163 (new defect)

Opened 2 months ago

Last modified 8 weeks ago

kernel 5.10.11 panic at VM start, when CONFIG_DEBUG_SPINLOCK=y in kernel .config

Reported by: ozugzug
Owned by:
Component: other
Version: VirtualBox 6.1.18
Keywords:
Cc:
Guest type: X11
Host type: Linux

Description

This is with app-emulation/virtualbox-6.1.18::gentoo on kernel 5.10.11.

There is no panic when CONFIG_DEBUG_SPINLOCK=n (which means CONFIG_DEBUG_LOCK_ALLOC=n as well).

It also panics with CONFIG_DEBUG_LOCK_ALLOC=y (which forces CONFIG_DEBUG_SPINLOCK=y).

It still panics with CONFIG_DEBUG_SPINLOCK=y and CONFIG_DEBUG_LOCK_ALLOC=n.

This has been mentioned in a comment (thanks!) at https://www.virtualbox.org/ticket/20055#comment:16

I'm happy to test any patches. If you think the issue is on the kernel side instead, please let me know.

Here are my notes. I'm not sure how useful all of this is, but I'm happy to do more if you tell me what to try:

Using /usr/src/.config.prev18_DEBUGlocks

KERNEL: /var/crash/vmlinux-5.10.11-gentoo-x86_64-2021-01-29-01_35_09

DUMPFILE: /var/crash/crashdump-2021-01-29-01_42_27 [PARTIAL DUMP]

Below, the word "crash" refers to https://github.com/crash-utility/crash

/ dmesg seen by crash's 'log' command:

...
[   27.789537] zram1: detected capacity change from 0 to 206158430208
[   27.792783] zram2: detected capacity change from 0 to 206158430208
[   27.807234] vboxdrv: loading out-of-tree module taints kernel.
[   27.807944] vboxdrv: Found 12 processor cores
[   27.808318] ------------[ cut here ]------------
[   27.808318] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
[   27.808325] WARNING: CPU: 4 PID: 12692 at kernel/locking/lockdep.c:5281 check_flags.part.0+0x157/0x160
[   27.808325] Modules linked in: vboxdrv(O+) pcspkr
[   27.808328] CPU: 4 PID: 12692 Comm: modprobe Tainted: G     U     O      5.10.11-gentoo-x86_64 #1
[   27.808329] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2401 07/12/2019
[   27.808330] RIP: 0010:check_flags.part.0+0x157/0x160
[   27.808331] Code: c0 0f 84 3f 9c 04 01 44 8b 0d 0d cf 7c 02 45 85 c9 0f 85 2f 9c 04 01 48 c7 c6 2f 22 d3 9a 48 c7 c7 cb e5 d1 9a e8 82 52 04 01 <0f> 0b e9 15 9c 04 01 66 90 41 57 41 56 49 89 fe 41 55 41 89 d5 41
[   27.808332] RSP: 0018:ffffb32948e37a10 EFLAGS: 00010082
[   27.808333] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[   27.808334] RDX: 0000000000000000 RSI: ffffffff991424fc RDI: ffffffff9b065d40
[   27.808334] RBP: ffffffff9b768300 R08: 0000000000000001 R09: 0000000000000000
[   27.808335] R10: 0000000000000000 R11: fffffffffe02e9f8 R12: 0000000000000001
[   27.808335] R13: 0000000000000000 R14: ffffffff9b76c688 R15: 0000000000000046
[   27.808336] FS:  00007bcdb219fb80(0000) GS:ffff996609800000(0000) knlGS:0000000000000000
[   27.808337] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   27.808338] CR2: 00007bcdb232f445 CR3: 00000002d416e005 CR4: 00000000001706e0
[   27.808338] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   27.808339] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   27.808339] Call Trace:
[   27.808342]  ? lock_is_held_type+0x65/0x110
[   27.808344]  ? rcu_read_lock_sched_held+0x3a/0x70
[   27.808345]  ? lock_acquire+0x39a/0x480
[   27.808347]  ? ktime_get_ts64+0x46/0x280
[   27.808355]  ? VBoxHost_RTTimeSystemNanoTS+0x30/0x60 [vboxdrv]
[   27.808360]  ? VBoxHost_RTTimeSystemNanoTS+0x30/0x60 [vboxdrv]
[   27.808365]  ? supdrvLdrGetExportedSymbol+0x3342/0x3640 [vboxdrv]
[   27.808369]  ? supdrvGipCreate+0x820/0xc50 [vboxdrv]
[   27.808370]  ? rcu_read_lock_sched_held+0x3a/0x70
[   27.808371]  ? lockdep_init_map_waits+0x42/0x200
[   27.808375]  ? supdrvInitDevExt+0x14a/0x310 [vboxdrv]
[   27.808380]  ? init_module+0x91/0x1000 [vboxdrv]
[   27.808380]  ? 0xffffffffc0297000
[   27.808382]  ? do_one_initcall+0x7a/0x2d0
[   27.808383]  ? do_init_module+0x57/0x230
[   27.808384]  ? load_module+0x2445/0x26e0
[   27.808386]  ? __do_sys_finit_module+0xc0/0x100
[   27.808387]  ? do_syscall_64+0x33/0x40
[   27.808388]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   27.808388] irq event stamp: 5085
[   27.808390] hardirqs last  enabled at (5085): [<ffffffff99167cd5>] ktime_get_ts64+0x205/0x280
[   27.808390] hardirqs last disabled at (5084): [<ffffffff99167c9d>] ktime_get_ts64+0x1cd/0x280
[   27.808391] softirqs last  enabled at (5000): [<ffffffff9a400eff>] asm_call_irq_on_stack+0xf/0x20
[   27.808392] softirqs last disabled at (4995): [<ffffffff9a400eff>] asm_call_irq_on_stack+0xf/0x20
[   27.808392] ---[ end trace 3f8fdc7480d9e05c ]---
[   27.808393] possible reason: unannotated irqs-off.
[   27.808393] irq event stamp: 5085
[   27.808394] hardirqs last  enabled at (5085): [<ffffffff99167cd5>] ktime_get_ts64+0x205/0x280
[   27.808395] hardirqs last disabled at (5084): [<ffffffff99167c9d>] ktime_get_ts64+0x1cd/0x280
[   27.808395] softirqs last  enabled at (5000): [<ffffffff9a400eff>] asm_call_irq_on_stack+0xf/0x20
[   27.808396] softirqs last disabled at (4995): [<ffffffff9a400eff>] asm_call_irq_on_stack+0xf/0x20
[   27.823374] zram3: detected capacity change from 0 to 206158430208
[   27.824878] vboxdrv: TSC mode is Invariant, tentative frequency 3714564013 Hz
[   27.830217] xhci_hcd 0000:00:14.0: Cancel URB 00000000493c99df, dev 14, ep 0x81, starting at offset 0xfff92010
[   27.831594] vboxdrv: Successfully loaded version 6.1.18 (interface 0x00300000)
[   27.833271] xhci_hcd 0000:00:14.0: // Ding dong!
...
[   57.854519] elogind-daemon[13194]: New session 2 of user user.
[   62.295045] i2c i2c-1: NAK from device addr 0x50 msg #0
[   62.298587] i2c i2c-0: NAK from device addr 0x50 msg #0
[   62.302129] i2c i2c-2: NAK from device addr 0x50 msg #0
[   62.331645] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D0
[   62.344256] snd_hda_intel 0000:00:1f.3: PME# disabled
[   71.473189] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x0 (reading 0xa2f08086)
[   71.473206] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x4 (reading 0x100406)
[   71.473215] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x8 (reading 0x4030000)
[   71.473223] snd_hda_intel 0000:00:1f.3: saving config space at offset 0xc (reading 0x2010)
[   71.473230] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x10 (reading 0xfff20004)
[   71.473238] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x14 (reading 0x2f)
[   71.473258] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x18 (reading 0x0)
[   71.473260] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x1c (reading 0x0)
[   71.473263] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x20 (reading 0xfff00004)
[   71.473265] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x24 (reading 0x2f)
[   71.473267] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x28 (reading 0x0)
[   71.473270] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x2c (reading 0x87241043)
[   71.473272] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x30 (reading 0x0)
[   71.473274] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x34 (reading 0x50)
[   71.473277] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x38 (reading 0x0)
[   71.473279] snd_hda_intel 0000:00:1f.3: saving config space at offset 0x3c (reading 0x10b)
[   71.473297] snd_hda_intel 0000:00:1f.3: PME# enabled
[   71.485469] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D3hot
[   72.024850] SUPR0GipMap: fGetGipCpu=0x1b
[   73.474107] BUG: kernel NULL pointer dereference, address: 0000000000000004
[   73.474120] #PF: supervisor read access in kernel mode
[   73.474126] #PF: error_code(0x0000) - not-present page
[   73.474131] PGD 0 P4D 0 
[   73.474143] Oops: 0000 [#1] SMP NOPTI
[   73.474151] CPU: 2 PID: 14833 Comm: EMT-0 Kdump: loaded Tainted: G     U  W  O      5.10.11-gentoo-x86_64 #1
[   73.474157] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2401 07/12/2019
[   73.474174] RIP: 0010:do_raw_spin_lock+0x4/0x90
[   73.474181] Code: 48 8d 88 b8 06 00 00 48 c7 c7 58 48 d3 9a e8 79 3a 04 01 e9 5b 3b 06 01 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 53 48 89 fb <8b> 47 04 3d ad 4e ad de 75 46 48 8b 53 10 65 48 8b 04 25 00 6f 01
[   73.474187] RSP: 0018:ffffb3294a0ebd30 EFLAGS: 00010296
[   73.474194] RAX: ffffffff992838a1 RBX: 0000000000000000 RCX: 0000000000000000
[   73.474199] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   73.474204] RBP: ffffb3294a0ebdf8 R08: 0000000000000001 R09: 0000000000000000
[   73.474209] R10: ffffffffc026f760 R11: 000000000007b438 R12: ffffffffc0240080
[   73.474213] R13: ffffb3294bea1000 R14: 0000000000000001 R15: ffffb3294a0ebde0
[   73.474220] FS:  0000754d3c1f4640(0000) GS:ffff996609600000(0000) knlGS:0000000000000000
[   73.474225] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   73.474230] CR2: 0000000000000004 CR3: 00000002d7102004 CR4: 00000000001706e0
[   73.474234] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   73.474239] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   73.474243] Call Trace:
[   73.474255]  ? __apply_to_page_range+0x2e1/0x6a0
[   73.474286]  ? rtR0TermNative+0xd0/0x220 [vboxdrv]
[   73.474313]  ? rtR0MemObjNativeProtect+0x74/0xa0 [vboxdrv]
[   73.474338]  ? VBoxHost_RTR0MemObjProtect+0x81/0xc0 [vboxdrv]
[   73.474360]  ? supdrvIOCtl+0x3265/0x3800 [vboxdrv]
[   73.474379]  ? SUPR0Printf+0x22f/0x330 [vboxdrv]
[   73.474388]  ? __x64_sys_ioctl+0x7e/0xb0
[   73.474395]  ? do_syscall_64+0x33/0x40
[   73.474402]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   73.474407] Modules linked in: vboxnetadp(O) vboxnetflt(O) vboxdrv(O) pcspkr
[   73.474425] CR2: 0000000000000004

(the log ended here)
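
A note on the fault address: with CONFIG_DEBUG_SPINLOCK=y, raw_spinlock_t carries a magic field right after the arch lock word. Assuming a 4-byte arch lock (as on x86-64), that field sits at offset 4, which would match CR2 = 0x4 if the lock pointer itself is NULL (RDI is 0 above). A minimal userspace sketch of that layout follows; `struct mock_raw_spinlock` and `magic_field_address` are mock names for illustration, not kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace mock of raw_spinlock_t with CONFIG_DEBUG_SPINLOCK=y.
 * Assumptions: the arch lock word is 4 bytes and the ABI is LP64 (x86-64).
 * This is NOT the kernel's definition, only a stand-in for its field order.
 */
struct mock_raw_spinlock {
    uint32_t raw_lock;   /* stands in for arch_spinlock_t */
    uint32_t magic;      /* debug-only field checked before locking */
    uint32_t owner_cpu;  /* debug-only */
    void    *owner;      /* debug-only, pointer-aligned */
};

/* Address the CPU touches when it reads lock->magic for a given lock address. */
static uintptr_t magic_field_address(uintptr_t lock_address)
{
    return lock_address + offsetof(struct mock_raw_spinlock, magic);
}
```

Calling `magic_field_address(0)` models the NULL lock pointer case and yields the address of the magic field relative to address zero.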

/ Partial output of crash's "bt -sFlxg":

PID: 14833  TASK: ffff9959886acf80  CPU: 2   COMMAND: "EMT-0"
 #0 [ffffb3294a0eba68] machine_kexec+0x191 at ffffffff9904d3b1
    /usr/src/linux-5.10.11-gentoo/arch/x86/include/asm/mem_encrypt.h: 77
    ffffb3294a0eba70: 0000000000000018 0000ffff99574000 
    ffffb3294a0eba80: ffff995740000000 0000000f0a001000 
    ffffb3294a0eba90: ffff99664a001000 0000000f0a000000 
    ffffb3294a0ebaa0: 0000000000000000 cc4567c5b1908400 
    ffffb3294a0ebab0: ffffb3294a0ebc88 ffffb3294a0ebc88 
    ffffb3294a0ebac0: 0000000000000009 __crash_kexec+225 
 #1 [ffffb3294a0ebac8] __crash_kexec+0xe1 at ffffffff99188a21
    /usr/src/linux-5.10.11-gentoo/kernel/kexec_core.c: 963
    ffffb3294a0ebad0: ffffb3294a0ebde0 0000000000000001 
    ffffb3294a0ebae0: ffffb3294bea1000 rtR0TermNative+208 
    ffffb3294a0ebaf0: ffffb3294a0ebdf8 0000000000000000 
    ffffb3294a0ebb00: 000000000007b438 __this_module+992 
    ffffb3294a0ebb10: 0000000000000000 0000000000000001 
    ffffb3294a0ebb20: __apply_to_page_range+737 0000000000000000 
    ffffb3294a0ebb30: 0000000000000000 0000000000000000 
    ffffb3294a0ebb40: 0000000000000000 ffffffffffffffff 
    ffffb3294a0ebb50: do_raw_spin_lock+4 0000000000000010 
    ffffb3294a0ebb60: 0000000000010296 ffffb3294a0ebd30 
    ffffb3294a0ebb70: 0000000000000018 cc4567c5b1908400 
    ffffb3294a0ebb80: 0000000000000046 crash_kexec+52   
 #2 [ffffb3294a0ebb88] crash_kexec+0x34 at ffffffff99189954
    /usr/src/linux-5.10.11-gentoo/arch/x86/include/asm/atomic.h: 41
    ffffb3294a0ebb90: ffffb3294a0ebc88 oops_end+132     
 #3 [ffffb3294a0ebb98] oops_end+0x84 at ffffffff99022034
    /usr/src/linux-5.10.11-gentoo/arch/x86/kernel/dumpstack.c: 359
    ffffb3294a0ebba0: [task_struct]    ffffb3294a0ebc88 
    ffffb3294a0ebbb0: 0000000000000004 no_context+572   
    ffffb3294a0ebbc0: 0000000000000000 contig_page_data+3600 
    ffffb3294a0ebbd0: 0000000000000004 0000000000000000 
    ffffb3294a0ebbe0: 0000000000000000 0000000000000000 
    ffffb3294a0ebbf0: cc4567c5b1908400 0000000000000000 
    ffffb3294a0ebc00: ffffb3294a0ebc88 0000000000000000 
    ffffb3294a0ebc10: 0000000000000004 0000000000000000 
    ffffb3294a0ebc20: [mm_struct]      exc_page_fault+732 
 #4 [ffffb3294a0ebc28] exc_page_fault+0x2dc at ffffffff9a22a69c
    /usr/src/linux-5.10.11-gentoo/arch/x86/mm/fault.c: 1320
    ffffb3294a0ebc30: 0000000000000000 0000000000000000 
    ffffb3294a0ebc40: 0000000000000000 00042cc000000000 
    ffffb3294a0ebc50: 0000000000000000 0000000000000000 
    ffffb3294a0ebc60: 0000000000000000 0000000000000000 
    ffffb3294a0ebc70: 0000000000000000 0000000000000000 
    ffffb3294a0ebc80: asm_exc_page_fault+27 
 #5 [ffffb3294a0ebc80] asm_exc_page_fault+0x1b at ffffffff9a400acb
    /usr/src/linux-5.10.11-gentoo/arch/x86/include/asm/idtentry.h: 583
    ffffb3294a0ebc88: ffffb3294a0ebde0 0000000000000001 
    ffffb3294a0ebc98: ffffb3294bea1000 rtR0TermNative+208 
 #6 [ffffb3294a0ebca0] rtR0TermNative+0xd0 at ffffffffc0240080 [vboxdrv]
    ffffb3294a0ebca8: ffffb3294a0ebdf8 0000000000000000 
    ffffb3294a0ebcb8: 000000000007b438 __this_module+992 
    ffffb3294a0ebcc8: 0000000000000000 0000000000000001 
    ffffb3294a0ebcd8: __apply_to_page_range+737 
 #7 [ffffb3294a0ebcd8] __apply_to_page_range+0x2e1 at ffffffff992838a1
    /usr/src/linux-5.10.11-gentoo/include/linux/spinlock.h: 354
    ffffb3294a0ebce0: 0000000000000000 0000000000000000 
    ffffb3294a0ebcf0: 0000000000000000 0000000000000000 
    ffffb3294a0ebd00: ffffffffffffffff do_raw_spin_lock+4 
    ffffb3294a0ebd10: 0000000000000010 0000000000010296 
    ffffb3294a0ebd20: ffffb3294a0ebd30 0000000000000018 
    ffffb3294a0ebd30: ffffb3294beed000 __apply_to_page_range+737 
    ffffb3294a0ebd40: ffff9958401012f8 ffffb3294beed000 
    ffffb3294a0ebd50: [mm_struct]      ffff995858f00508 
    ffffb3294a0ebd60: 0000000000000000 ffffb3294beecfff 
    ffffb3294a0ebd70: ffffb3294beed000 ffff995840000528 
    ffffb3294a0ebd80: ffffb3294beed000 ffff995a17102b30 
    ffffb3294a0ebd90: ffffb3294beecfff rtR0TermNative+208 
 #8 [ffffb3294a0ebd98] rtR0TermNative+0xd0 at ffffffffc0240080 [vboxdrv]
    ffffb3294a0ebda0: ffffb3294beecfff [kmalloc-8k]     
    ffffb3294a0ebdb0: ffffb3294a0ebdf8 0000000000000000 
    ffffb3294a0ebdc0: 0000000000000000 ffffb3294c609010 
    ffffb3294a0ebdd0: [kmalloc-192]    rtR0MemObjNativeProtect+116 
 #9 [ffffb3294a0ebdd8] rtR0MemObjNativeProtect+0x74 at ffffffffc0241774 [vboxdrv]
    ffffb3294a0ebde0: [kmalloc-8k]     8000000000000161 
    ffffb3294a0ebdf0: cc4567c5b1908400 ffffb3294a0ebe10 
    ffffb3294a0ebe00: VBoxHost_RTR0MemObjProtect+129 
#10 [ffffb3294a0ebe00] VBoxHost_RTR0MemObjProtect+0x81 at ffffffffc023ecd1 [vboxdrv]
    ffffb3294a0ebe08: __this_module+992 ffffb3294a0ebea8 
    ffffb3294a0ebe18: supdrvIOCtl+12901 
#11 [ffffb3294a0ebe18] supdrvIOCtl+0x3265 at ffffffffc02339a5 [vboxdrv]
    ffffb3294a0ebe20: 0000000000000004 [kmalloc-2k]     
    ffffb3294a0ebe30: __this_module+992 00000004001fc060 
    ffffb3294a0ebe40: 00000000000004ff 0000000000205900 
    ffffb3294a0ebe50: __this_module+992 0000000000000000 
    ffffb3294a0ebe60: 0000000000000000 0000000000000000 
    ffffb3294a0ebe70: ffffb3294a0ebea8 cc4567c5b1908400 
    ffffb3294a0ebe80: 0000000000205978 0000000000005684 
    ffffb3294a0ebe90: 0000754d24911010 [kmalloc-2k]     
    ffffb3294a0ebea0: ffffb3294c609010 ffffb3294a0ebf08 
    ffffb3294a0ebeb0: SUPR0Printf+559  
#12 [ffffb3294a0ebeb0] SUPR0Printf+0x22f at ffffffffc022b67f [vboxdrv]
    RIP: 0000754d82eb9e37  RSP: 0000754d3c1f2b58  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000754d24911010  RCX: 0000754d82eb9e37
    RDX: 0000754d24911010  RSI: 0000000000005684  RDI: 000000000000000b
    RBP: 0000754d3c1f2b70   R8: 0000000000000000   R9: 00000000fffffffc
    R10: 0000000000000000  R11: 0000000000000246  R12: 0000754d6ef3afcf
    R13: 0000000000000000  R14: 0000754d3c1f2e20  R15: 0000000000000004
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

/

crash> mod
     MODULE       NAME              BASE          SIZE  OBJECT FILE
ffffffffc0225100  pcspkr      ffffffffc0223000   16384  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc026f380  vboxdrv     ffffffffc022b000  438272  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc02b0040  vboxnetflt  ffffffffc02ab000   28672  (not loaded)  [CONFIG_KALLSYMS]
ffffffffc02b9180  vboxnetadp  ffffffffc02b7000   28672  (not loaded)  [CONFIG_KALLSYMS]

/

crash> add-symbol-file /usr/lib/debug/lib/modules/5.10.11-gentoo-x86_64/misc/vboxdrv.ko.debug 0xffffffffc022b000
add symbol table from file "/usr/lib/debug/lib/modules/5.10.11-gentoo-x86_64/misc/vboxdrv.ko.debug" at
        .text_addr = 0xffffffffc022b000
Reading symbols from /usr/lib/debug/lib/modules/5.10.11-gentoo-x86_64/misc/vboxdrv.ko.debug...done.

/

crash> info line *rtR0TermNative+0xd0
Line 540 of "/usr/src/debug/app-emulation/virtualbox-modules-6.1.18/vboxdrv/r0drv/linux/memobj-r0drv-linux.c" starts at address 0xffffffffc0240080 <rtR0MemObjLinuxApplyPageRange> and ends at 0xffffffffc0240084 <rtR0MemObjLinuxApplyPageRange+4>.
crash> info line *rtR0MemObjLinuxApplyPageRange+4
Line 542 of "/usr/src/debug/app-emulation/virtualbox-modules-6.1.18/vboxdrv/r0drv/linux/memobj-r0drv-linux.c" starts at address 0xffffffffc0240084 <rtR0MemObjLinuxApplyPageRange+4> and ends at 0xffffffffc0240087 <rtR0MemObjLinuxApplyPageRange+7>.
  /**                                                                                                                                    
   * Callback called in apply_to_page_range().                                                                                           
   *                                                                                                                                     
   * @returns Linux status code.                                                                                                         
   * @param   pPte                Pointer to the page table entry for the given address.                                                 
   * @param   uAddr               The address to apply the new protection to.                                                            
   * @param   pvUser              The opaque user data.                                                                                  
   */                                                                                                                                    
  static DECLCALLBACK(int) rtR0MemObjLinuxApplyPageRange(pte_t *pPte, unsigned long uAddr, void *pvUser)                                 
  { //this is line 540                                                                                                                   
      PCLNXAPPLYPGRANGE pArgs = (PCLNXAPPLYPGRANGE)pvUser;                                                                               
      PRTR0MEMOBJLNX pMemLnx = pArgs->pMemLnx;  //this is line 542                                                                                         
      size_t idxPg = (uAddr - (unsigned long)pMemLnx->Core.pv) >> PAGE_SHIFT;                                                            
                                                                                                                                         
      set_pte(pPte, mk_pte(pMemLnx->apPages[idxPg], pArgs->fPg));                                                                        
      return 0;                                                                                                                          
  }                                                                                                                                      
  #endif

/

crash> info line *rtR0TermNative+208
Line 540 of "/usr/src/debug/app-emulation/virtualbox-modules-6.1.18/vboxdrv/r0drv/linux/memobj-r0drv-linux.c" starts at address 0xffffffffc0240080 <rtR0MemObjLinuxApplyPageRange> and ends at 0xffffffffc0240084 <rtR0MemObjLinuxApplyPageRange+4>.
(OK, this doesn't seem to be useful: 208 is 0xd0, so it is the same address as rtR0TermNative+0xd0 above.)

/

crash> info line *rtR0TermNative
Line 117 of "/usr/src/debug/app-emulation/virtualbox-modules-6.1.18/vboxdrv/r0drv/linux/initterm-r0drv-linux.c" starts at address 0xffffffffc023ffb0 <rtR0TermNative> and ends at 0xffffffffc023ffb4 <rtR0TermNative+4>.

  DECLHIDDEN(void) rtR0TermNative(void)                                                                                                  
  { //this is line 117                                                                                                                   
      IPRT_LINUX_SAVE_EFL_AC();                                                                                                          
                                                                                                                                         
      rtR0LnxWorkqueueFlush();                                                                                                           
  #if RTLNX_VER_MIN(2,5,41)                                                                                                              
      destroy_workqueue(g_prtR0LnxWorkQueue);                                                                                            
      g_prtR0LnxWorkQueue = NULL;                                                                                                        
  #endif                                                                                                                                 
                                                                                                                                         
      rtR0MemExecCleanup();                                                                                                              
                                                                                                                                         
      IPRT_LINUX_RESTORE_EFL_AC();                                                                                                       
  }
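
The `info line` results above suggest why the backtrace keeps printing rtR0TermNative+0xd0: rtR0TermNative starts at 0xffffffffc023ffb0, and adding 0xd0 gives 0xffffffffc0240080, which the debug info attributes to rtR0MemObjLinuxApplyPageRange. A plausible explanation (an assumption, not confirmed in this ticket) is that the symbol table used for the backtrace has no entry for the static callback, so the address is attributed to the nearest preceding symbol. The sketch below illustrates that lookup; `struct sym`, `nearest_symbol` and both tables are made up for illustration and are not crash or kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* One symbol table entry: start address and name. */
struct sym {
    uint64_t    addr;
    const char *name;
};

/* Return the entry with the highest start address <= addr, or NULL if none. */
static const struct sym *nearest_symbol(const struct sym *tab, size_t n,
                                        uint64_t addr)
{
    const struct sym *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tab[i].addr <= addr && (best == NULL || tab[i].addr > best->addr))
            best = &tab[i];
    }
    return best;
}

/* Hypothetical table that lacks the static callback. */
static const struct sym table_without_static[] = {
    { 0xffffffffc023ffb0ULL, "rtR0TermNative" },
};

/* Hypothetical table after loading debug info (addresses from the notes above). */
static const struct sym table_with_debuginfo[] = {
    { 0xffffffffc023ffb0ULL, "rtR0TermNative" },
    { 0xffffffffc0240080ULL, "rtR0MemObjLinuxApplyPageRange" },
};
```

Looking up 0xffffffffc0240080 in the first table resolves to rtR0TermNative plus an offset, while the second table resolves it to the callback itself.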

/

crash> info line *__apply_to_page_range+0x2e1
No line number information available for address 0xffffffff992838a1
(I don't know how to make this work, though it does work in the output of 'bt -l'.)
crash> sym 0xffffffff992838a1
ffffffff992838a1 (t) __apply_to_page_range+737 /usr/src/linux-5.10.11-gentoo/include/linux/spinlock.h: 354
crash> info line *__apply_to_page_range+737
No line number information available for address 0xffffffff992838a1
// OK, so 0x2e1 is 737 in decimal

 #7 [ffffb3294a0ebcd8] __apply_to_page_range+0x2e1 at ffffffff992838a1
    /usr/src/linux-5.10.11-gentoo/include/linux/spinlock.h: 354

  static __always_inline void spin_lock(spinlock_t *lock)                                                                                
  {                                                                                                                                      
    raw_spin_lock(&lock->rlock); // this is line 354                                                                                     
  }
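
Since crash prints offsets in hex in some places (+0x2e1) and in decimal in others (+737), here is a small sketch of the arithmetic involved. `parse_offset` and `symbol_start` are illustrative helper names, not part of crash; the address is the one reported for frame #7 above.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse an offset written either as hex ("0x2e1") or decimal ("737").
 * Base 0 lets strtoull choose the radix from the prefix.
 */
static uint64_t parse_offset(const char *text)
{
    return (uint64_t)strtoull(text, NULL, 0);
}

/* Start address of a function, given an address inside it and its offset. */
static uint64_t symbol_start(uint64_t address, uint64_t offset)
{
    return address - offset;
}
```

For example, `symbol_start(0xffffffff992838a1ULL, parse_offset("0x2e1"))` gives the address where __apply_to_page_range would begin, and both spellings of the offset parse to the same value.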

/


Another crash, now with the following .config diff (compared to /usr/src/.config.prev18_nondebug):

-CONFIG_DEBUG_LOCK_ALLOC=y
-CONFIG_LOCKDEP=y
-# CONFIG_DEBUG_LOCKDEP is not set
+# CONFIG_DEBUG_LOCK_ALLOC is not set

KERNEL: /var/crash/vmlinux-5.10.11-gentoo-x86_64-2021-01-29-19_54_43

DUMPFILE: /var/crash/crashdump-2021-01-29-20_22_50 [PARTIAL DUMP]

'log' partial output:

...
[   14.857357] vboxdrv: loading out-of-tree module taints kernel.
[   14.857929] vboxdrv: Found 12 processor cores
[   14.873771] vboxdrv: TSC mode is Invariant, tentative frequency 3695998457 Hz
[   14.873771] vboxdrv: Successfully loaded version 6.1.18 (interface 0x00300000)
...
[   56.083785] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D3hot
[   61.037910] SUPR0GipMap: fGetGipCpu=0x1b
[   62.739845] BUG: kernel NULL pointer dereference, address: 0000000000000004
[   62.739856] #PF: supervisor read access in kernel mode
[   62.739862] #PF: error_code(0x0000) - not-present page
[   62.739866] PGD 0 P4D 0 
[   62.739876] Oops: 0000 [#1] SMP NOPTI
[   62.739883] CPU: 9 PID: 14846 Comm: EMT-0 Kdump: loaded Tainted: G     U     O      5.10.11-gentoo-x86_64 #1
[   62.739887] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2401 07/12/2019
[   62.739897] RIP: 0010:do_raw_spin_lock+0x4/0x90
[   62.739904] Code: 48 8d 88 90 06 00 00 48 c7 c7 f0 c7 aa 94 e8 f6 be e4 00 e9 33 b4 e6 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 53 48 89 fb <8b> 47 04 3d ad 4e ad de 75 46 48 8b 53 10 65 48 8b 04 25 40 6d 01
[   62.739909] RSP: 0018:ffffb532498a7d30 EFLAGS: 00010286
[   62.739915] RAX: ffff997800000588 RBX: 0000000000000000 RCX: 0000000000000000
[   62.739920] RDX: 0000000101207067 RSI: 00000000040481c0 RDI: 0000000000000000
[   62.739923] RBP: ffffb532498a7df8 R08: ffff997901207588 R09: 0000000000000001
[   62.739927] R10: ffffffffc0273560 R11: 0000000000061918 R12: ffffffffc0244000
[   62.739931] R13: ffffb532498b1000 R14: 0000000000000001 R15: ffffb532498a7de0
[   62.739937] FS:  0000751ec8366640(0000) GS:ffff99880a480000(0000) knlGS:0000000000000000
[   62.739941] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.739945] CR2: 0000000000000004 CR3: 000000032b99c004 CR4: 00000000001706e0
[   62.739949] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   62.739954] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   62.739957] Call Trace:
[   62.739967]  ? __apply_to_page_range+0x2e1/0x6a0
[   62.739999]  ? rtR0TermNative+0xd0/0x220 [vboxdrv]
[   62.740026]  ? rtR0MemObjNativeProtect+0x74/0xa0 [vboxdrv]
[   62.740051]  ? VBoxHost_RTR0MemObjProtect+0x81/0xc0 [vboxdrv]
[   62.740074]  ? supdrvIOCtl+0x3265/0x3800 [vboxdrv]
[   62.740094]  ? SUPR0Printf+0x22f/0x330 [vboxdrv]
[   62.740101]  ? __x64_sys_ioctl+0x7e/0xb0
[   62.740109]  ? do_syscall_64+0x33/0x40
[   62.740117]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   62.740121] Modules linked in: vboxnetadp(O) vboxnetflt(O) vboxdrv(O) pcspkr
[   62.740136] CR2: 0000000000000004
crash> bt -l
PID: 14846  TASK: ffff99792fa78000  CPU: 9   COMMAND: "EMT-0"
 #0 [ffffb532498a7a68] machine_kexec at ffffffff93042956
    /usr/src/linux-5.10.11-gentoo/arch/x86/include/asm/mem_encrypt.h: 77
 #1 [ffffb532498a7ac8] __crash_kexec at ffffffff9314bee1
    /usr/src/linux-5.10.11-gentoo/kernel/kexec_core.c: 963
 #2 [ffffb532498a7b88] crash_kexec at ffffffff9314cdf4
    /usr/src/linux-5.10.11-gentoo/arch/x86/include/asm/atomic.h: 41
 #3 [ffffb532498a7b98] oops_end at ffffffff9301cd84
    /usr/src/linux-5.10.11-gentoo/arch/x86/kernel/dumpstack.c: 359
 #4 [ffffb532498a7c28] exc_page_fault at ffffffff94002016
    /usr/src/linux-5.10.11-gentoo/arch/x86/mm/fault.c: 1346
 #5 [ffffb532498a7c80] asm_exc_page_fault at ffffffff94200acb
    /usr/src/linux-5.10.11-gentoo/arch/x86/include/asm/idtentry.h: 583
 #6 [ffffb532498a7ca0] rtR0TermNative at ffffffffc0244000 [vboxdrv]
 #7 [ffffb532498a7d08] do_raw_spin_lock at ffffffff9310dc54
    /usr/src/linux-5.10.11-gentoo/kernel/locking/spinlock_debug.c: 112
 #8 [ffffb532498a7d38] __apply_to_page_range at ffffffff931feb71
    /usr/src/linux-5.10.11-gentoo/include/linux/spinlock.h: 354
 #9 [ffffb532498a7dd8] rtR0MemObjNativeProtect at ffffffffc02456f4 [vboxdrv]
#10 [ffffb532498a7e00] VBoxHost_RTR0MemObjProtect at ffffffffc0242c81 [vboxdrv]
#11 [ffffb532498a7e18] supdrvIOCtl at ffffffffc0237955 [vboxdrv]
#12 [ffffb532498a7eb0] SUPR0Printf at ffffffffc022f62f [vboxdrv]
    RIP: 0000751f0b85ae37  RSP: 0000751ec8364b58  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000751ead314010  RCX: 0000751f0b85ae37
    RDX: 0000751ead314010  RSI: 0000000000005684  RDI: 000000000000000b
    RBP: 0000751ec8364b70   R8: 0000000000000000   R9: 00000000fffffffc
    R10: 0000000000000000  R11: 0000000000000246  R12: 0000751efc3aefcf
    R13: 0000000000000000  R14: 0000751ec8364e20  R15: 0000000000000004
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

/

 /*                                                                                                                                     
   * We are now relying on the NMI watchdog to detect lockup instead of doing                                                            
   * the detection here with an unfair lock which can cause problem of its own.                                                          
   */                                                                                                                                    
  void do_raw_spin_lock(raw_spinlock_t *lock)                                                                                            
  {                                                                                                                                      
    debug_spin_lock_before(lock); // this is line 112                                                                                    
    arch_spin_lock(&lock->raw_lock);                                                                                                     
    mmiowb_spin_lock();                                                                                                                  
    debug_spin_lock_after(lock);                                                                                                         
  }
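
For what it's worth, the `Code:` line of the oops seems consistent with this source. The faulting bytes `<8b> 47 04` decode as a 32-bit load from offset 4 of RDI (which is 0 in the register dump), and the following `3d ad 4e ad de` is a compare of EAX against a 32-bit little-endian immediate. The sketch below decodes that immediate; `le32` and `cmp_immediate` are illustrative names. The decoded value is what the kernel defines as SPINLOCK_MAGIC, so the fault looks like the magic check in debug_spin_lock_before() reading through a NULL lock pointer.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored least-significant first. */
static uint32_t le32(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Immediate operand bytes that follow the 0x3d opcode in the Code: line. */
static const uint8_t cmp_immediate[4] = { 0xad, 0x4e, 0xad, 0xde };
```

`le32(cmp_immediate)` evaluates to 0xdead4ead.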

/

PID: 14846  TASK: ffff99792fa78000  CPU: 9   COMMAND: "EMT-0"
 #0 [ffffb532498a7a68] machine_kexec+0x186 at ffffffff93042956
    /usr/src/linux-5.10.11-gentoo/arch/x86/include/asm/mem_encrypt.h: 77
    ffffb532498a7a70: 00000000498a7d30 0000ffff99780000 
    ffffb532498a7a80: ffff997800000000 000000003d001000 
    ffffb532498a7a90: ffff99783d001000 000000003d000000 
    ffffb532498a7aa0: 0000000000000000 c2c563b4d1c59e00 
    ffffb532498a7ab0: ffffb532498a7c88 ffffb532498a7c88 
    ffffb532498a7ac0: 0000000000000009 __crash_kexec+225 
 #1 [ffffb532498a7ac8] __crash_kexec+0xe1 at ffffffff9314bee1
    /usr/src/linux-5.10.11-gentoo/kernel/kexec_core.c: 963
    ffffb532498a7ad0: ffffb532498a7de0 0000000000000001 
    ffffb532498a7ae0: ffffb532498b1000 rtR0TermNative+208 
    ffffb532498a7af0: ffffb532498a7df8 0000000000000000 
    ffffb532498a7b00: 0000000000061918 __this_module+864 
    ffffb532498a7b10: 0000000000000001 ffff997901207588 
    ffffb532498a7b20: ffff997800000588 0000000000000000 
    ffffb532498a7b30: 0000000101207067 00000000040481c0 
    ffffb532498a7b40: 0000000000000000 ffffffffffffffff 
    ffffb532498a7b50: do_raw_spin_lock+4 0000000000000010 
    ffffb532498a7b60: 0000000000010286 ffffb532498a7d30 
    ffffb532498a7b70: 0000000000000018 c2c563b4d1c59e00 
    ffffb532498a7b80: 0000000000000246 crash_kexec+52   
 #2 [ffffb532498a7b88] crash_kexec+0x34 at ffffffff9314cdf4
    /usr/src/linux-5.10.11-gentoo/arch/x86/include/asm/atomic.h: 41
    ffffb532498a7b90: ffffb532498a7c88 oops_end+132     
 #3 [ffffb532498a7b98] oops_end+0x84 at ffffffff9301cd84
    /usr/src/linux-5.10.11-gentoo/arch/x86/kernel/dumpstack.c: 359
    ffffb532498a7ba0: [task_struct]    ffffb532498a7c88 
    ffffb532498a7bb0: 0000000000000004 no_context+572   
    ffffb532498a7bc0: 0000000000000000 0000000000000000 
    ffffb532498a7bd0: 0000000000000000 0000000000000000 
    ffffb532498a7be0: 0000000000000000 0000000000000000 
    ffffb532498a7bf0: c2c563b4d1c59e00 0000000000000214 
    ffffb532498a7c00: 0000000000000000 ffffb532498a7c88 
    ffffb532498a7c10: 0000000000000004 0000000000000000 
    ffffb532498a7c20: [mm_struct]      exc_page_fault+790 
 #4 [ffffb532498a7c28] exc_page_fault+0x316 at ffffffff94002016
    /usr/src/linux-5.10.11-gentoo/arch/x86/mm/fault.c: 1346
    ffffb532498a7c30: [mm_struct]      0000000000000000 
    ffffb532498a7c40: 0000000000000000 00000000000001d0 
    ffffb532498a7c50: 0000000000000000 0000000000000000 
    ffffb532498a7c60: 0000000000000000 0000000000000000 
    ffffb532498a7c70: 0000000000000000 0000000000000000 
    ffffb532498a7c80: asm_exc_page_fault+27 
 #5 [ffffb532498a7c80] asm_exc_page_fault+0x1b at ffffffff94200acb
    /usr/src/linux-5.10.11-gentoo/arch/x86/include/asm/idtentry.h: 583
    ffffb532498a7c88: ffffb532498a7de0 0000000000000001 
    ffffb532498a7c98: ffffb532498b1000 rtR0TermNative+208 
 #6 [ffffb532498a7ca0] rtR0TermNative+0xd0 at ffffffffc0244000 [vboxdrv]
    ffffb532498a7ca8: ffffb532498a7df8 0000000000000000 
    ffffb532498a7cb8: 0000000000061918 __this_module+864 
    ffffb532498a7cc8: 0000000000000001 ffff997901207588 
    ffffb532498a7cd8: ffff997800000588 0000000000000000 
    ffffb532498a7ce8: 0000000101207067 00000000040481c0 
    ffffb532498a7cf8: 0000000000000000 ffffffffffffffff 
    ffffb532498a7d08: do_raw_spin_lock+4 
 #7 [ffffb532498a7d08] do_raw_spin_lock+0x4 at ffffffff9310dc54
    /usr/src/linux-5.10.11-gentoo/kernel/locking/spinlock_debug.c: 112
    ffffb532498a7d10: 0000000000000010 0000000000010286 
    ffffb532498a7d20: ffffb532498a7d30 0000000000000018 
    ffffb532498a7d30: ffffb532498fd000 __apply_to_page_range+737 
 #8 [ffffb532498a7d38] __apply_to_page_range+0x2e1 at ffffffff931feb71
    /usr/src/linux-5.10.11-gentoo/include/linux/spinlock.h: 354
    ffffb532498a7d40: ffff997900067260 ffffb532498fd000 
    ffffb532498a7d50: [mm_struct]      ffff997901207588 
    ffffb532498a7d60: 0000000000000000 ffffb532498fcfff 
    ffffb532498a7d70: ffffb532498fd000 ffff997900000648 
    ffffb532498a7d80: ffffb532498fd000 ffff997b2b99cb50 
    ffffb532498a7d90: ffffb532498fcfff rtR0TermNative+208 
    ffffb532498a7da0: ffffb532498fcfff [kmalloc-8k]     
    ffffb532498a7db0: ffffb532498a7df8 0000000000000000 
    ffffb532498a7dc0: 0000000000000000 ffffb5324a109010 
    ffffb532498a7dd0: [kmalloc-192]    rtR0MemObjNativeProtect+116 
 #9 [ffffb532498a7dd8] rtR0MemObjNativeProtect+0x74 at ffffffffc02456f4 [vboxdrv]
    ffffb532498a7de0: [kmalloc-8k]     8000000000000161 
    ffffb532498a7df0: c2c563b4d1c59e00 ffffb532498a7e10 
    ffffb532498a7e00: VBoxHost_RTR0MemObjProtect+129 
#10 [ffffb532498a7e00] VBoxHost_RTR0MemObjProtect+0x81 at ffffffffc0242c81 [vboxdrv]
    ffffb532498a7e08: __this_module+864 ffffb532498a7ea8 
    ffffb532498a7e18: supdrvIOCtl+12901 
#11 [ffffb532498a7e18] supdrvIOCtl+0x3265 at ffffffffc0237955 [vboxdrv]
    ffffb532498a7e20: 0000000000000004 [kmalloc-2k]     
    ffffb532498a7e30: __this_module+864 00000004001fc060 
    ffffb532498a7e40: 00000000000004ff 0000000000205900 
    ffffb532498a7e50: __this_module+864 0000000000000000 
    ffffb532498a7e60: 0000000000000000 0000000000000000 
    ffffb532498a7e70: 0000000000205978 c2c563b4d1c59e00 
    ffffb532498a7e80: 0000000000205978 0000000000005684 
    ffffb532498a7e90: 0000751ead314010 [kmalloc-2k]     
    ffffb532498a7ea0: ffffb5324a109010 ffffb532498a7f08 
    ffffb532498a7eb0: SUPR0Printf+559  
#12 [ffffb532498a7eb0] SUPR0Printf+0x22f at ffffffffc022f62f [vboxdrv]
    RIP: 0000751f0b85ae37  RSP: 0000751ec8364b58  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000751ead314010  RCX: 0000751f0b85ae37
    RDX: 0000751ead314010  RSI: 0000000000005684  RDI: 000000000000000b
    RBP: 0000751ec8364b70   R8: 0000000000000000   R9: 00000000fffffffc
    R10: 0000000000000000  R11: 0000000000000246  R12: 0000751efc3aefcf
    R13: 0000000000000000  R14: 0000751ec8364e20  R15: 0000000000000004
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

/

 #9 [ffffb532498a7dd8] rtR0MemObjNativeProtect+0x74 at ffffffffc02456f4 [vboxdrv]
crash> info line *0xffffffffc02456f4
Line 1919 of "/usr/src/debug/app-emulation/virtualbox-modules-6.1.18/vboxdrv/r0drv/linux/memobj-r0drv-linux.c" starts at address 0xffffffffc02456f4 <rtR0MemObjNativeProtect+116> and ends at 0xffffffffc0245720 <rtR0MemObjNativeGetPagePhysAddr>.
  # elif defined(IPRT_USE_APPLY_TO_PAGE_RANGE_FOR_EXEC)                                                                                                    
      PRTR0MEMOBJLNX pMemLnx = (PRTR0MEMOBJLNX)pMem;                                                                                                       
      if (   pMemLnx->fExecutable                                                                                                                          
          && pMemLnx->fMappedToRing0)                                                                                                                      
      {                                                                                                                                                    
          LNXAPPLYPGRANGE Args;                                                                                                                            
          Args.pMemLnx = pMemLnx;                                                                                                                          
          Args.fPg = rtR0MemObjLinuxConvertProt(fProt, true /*fKernel*/);                                                                                  
          int rcLnx = apply_to_page_range(current->active_mm, (unsigned long)pMemLnx->Core.pv + offSub, cbSub,                                             
                                          rtR0MemObjLinuxApplyPageRange, (void *)&Args);                                                                   
          if (rcLnx)  // crash reports this as line 1919: the saved return address points just past the apply_to_page_range() call above, so the apparent off-by-one is expected
              return VERR_NOT_SUPPORTED;                                                                                                                   
                                                                                                                                                           
          return VINF_SUCCESS;                                                                                                                             
      }                                                                                                                                                    
  # endif                       

No crash now with the following .config diff (compared to the previous config):

-CONFIG_UNINLINE_SPIN_UNLOCK=y
+CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
+CONFIG_INLINE_READ_UNLOCK=y
+CONFIG_INLINE_READ_UNLOCK_IRQ=y
+CONFIG_INLINE_WRITE_UNLOCK=y
+CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
-CONFIG_DEBUG_SPINLOCK=y
+# CONFIG_DEBUG_SPINLOCK is not set

So to avoid the crash, you need (at least):

CONFIG_DEBUG_SPINLOCK=n (which forces CONFIG_DEBUG_LOCK_ALLOC=n)

It doesn't matter whether the following are set to =y:

CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y

TODO (never mind, see below): try with:

+# CONFIG_DEBUG_SPINLOCK is not set

but with

CONFIG_DEBUG_LOCK_ALLOC=y

Can't, because CONFIG_DEBUG_LOCK_ALLOC=y forces CONFIG_DEBUG_SPINLOCK=y.
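
For anyone who wants to check whether their own kernel .config falls into the combination described above, here is a minimal sketch in C. It only encodes what was observed in this ticket (panic with CONFIG_DEBUG_SPINLOCK=y on 5.10.11 with vboxdrv 6.1.18), not a confirmed root cause. The function names config_option_enabled and config_matches_reported_panic are hypothetical helpers made up for illustration; they are not part of the kernel or VirtualBox.

```c
#include <string.h>

/* Return 1 if `option` appears as "<option>=y" on a line of its own in the
 * given .config text, else 0. A line such as
 * "# CONFIG_DEBUG_SPINLOCK is not set" does not match, because the whole
 * line must be exactly "<option>=y". */
int config_option_enabled(const char *config_text, const char *option)
{
    size_t optlen = strlen(option);
    const char *line = config_text;

    while (line && *line) {
        const char *eol = strchr(line, '\n');
        size_t linelen = eol ? (size_t)(eol - line) : strlen(line);

        if (linelen == optlen + 2 &&
            strncmp(line, option, optlen) == 0 &&
            line[optlen] == '=' && line[optlen + 1] == 'y')
            return 1;

        line = eol ? eol + 1 : NULL;
    }
    return 0;
}

/* Hypothetical helper: returns 1 if the config matches the combination
 * reported to panic in this ticket. Assumption taken from the notes above:
 * CONFIG_DEBUG_SPINLOCK=y is the deciding option, and
 * CONFIG_DEBUG_LOCK_ALLOC=y cannot be set without it. */
int config_matches_reported_panic(const char *config_text)
{
    return config_option_enabled(config_text, "CONFIG_DEBUG_SPINLOCK");
}
```

To use it, read the contents of the .config into a string and pass it to config_matches_reported_panic(); a result of 1 means the config has the option that was set in every panicking configuration tested here.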


Change History

comment:1 Changed 8 weeks ago by zardoz

Also seeing a kernel panic with Fedora 32 and 5.10.x kernel. No issues with 5.9.x kernels.

Pretty much a show stopper.
