VirtualBox

Ticket #1670 (closed defect: fixed)

Opened 6 years ago

Last modified 5 years ago

VBoxDrv broken on opensolaris snv_91, any attempt to start a virtual machine core dumps

Reported by: jkeil
Owned by:
Priority: major
Component: other
Version: VirtualBox 1.6.0
Keywords:
Cc:
Guest type: other
Host type: other

Description

I have an amd64 box running Solaris Express Community Edition snv_85 X86. The box has been bfu'ed to newer kernel bits multiple times. VirtualBox 1.6.0 worked fine until a week or two ago.

Since about May 28th I have been unable to start any virtual machine: a window opens, but the VirtualBox process immediately dumps core. The following gets logged to /var/adm/messages:

Jun  5 16:27:33 tiger2 vboxdrv: [ID 147122 kern.notice] VBoxDrv: VBoxDrvSolarisOpen: Dev=0x0 pSession=ffffff031d195810 pid=1408 r0proc=ffffff031c3f7070 thread=ffffff02e5268d80
Jun  5 16:27:34 tiger2 vboxdrv: [ID 641266 kern.notice] NOTICE: rtR0MemObjNativeLockUser: as_pagelock failed to get shadow pages
Jun  5 16:27:48 tiger2 genunix: [ID 603404 kern.notice] NOTICE: core_log: VirtualBox[1408] core dumped: /cores/VirtualBox-1408

The core dump always looks the same: the crash is in mmR3PagePoolTerm(), on lwp 6 (the one with cursig = SIGSEGV):

# pflags /cores/VirtualBox-1408
core '/cores/VirtualBox-1408' of 1408:	/opt/VirtualBox/VirtualBox -comment Solaris 11 -startvm 3ba70c22-b8fb-
	data model = _LP64  flags = ORPHAN|MSACCT|MSFORK
 /1:	flags = STOPPED  pollsys(0xfffffd7fffdf9c00,0x4,0xfffffd7fffdf9dc0,0x0)
	why = PR_SUSPENDED
	sigmask = 0x00002000,0x00000000
 /2:	flags = STOPPED  pollsys(0xfffffd7ffcb1fd10,0x2,0x0,0x0)
	why = PR_SUSPENDED
	sigmask = 0x00002000,0x00000000
 /3:	flags = STOPPED  lwp_park(0x0,0x0,0x0)
	why = PR_SUSPENDED
	sigmask = 0x00002000,0x00000000
 /4:	flags = DETACH|STOPPED  lwp_park(0x0,0x0,0x0)
	why = PR_SUSPENDED
	sigmask = 0x00002000,0x00000000
 /5:	flags = DETACH|STOPPED  lwp_park(0x0,0x0,0x0)
	why = PR_SUSPENDED
	sigmask = 0x00002000,0x00000000
 /6:	flags = DETACH
	sigmask = 0xffffbefc,0x0000ffff  cursig = SIGSEGV
 /7:	flags = DAEMON|STOPPED  lwp_park(0x0,0xfffffd7ffc70cf20,0x0)
	why = PR_SUSPENDED
	sigmask = 0xffbffeff,0x0000fff7
 /8:	flags = DETACH|STOPPED  lwp_park(0x0,0x0,0x0)
	why = PR_SUSPENDED
	sigmask = 0x00002000,0x00000000
 /9:	flags = DETACH|STOPPED  pollsys(0xfffffd7ffc438a60,0x1,0xfffffd7ffc438c40,0x0)
	why = PR_SUSPENDED
	sigmask = 0x00002000,0x00000000
# pstack /cores/VirtualBox-1408
core '/cores/VirtualBox-1408' of 1408:	/opt/VirtualBox/VirtualBox -comment Solaris 11 -startvm 3ba70c22-b8fb-
-----------------  lwp# 1 / thread# 1  --------------------
 fffffd7ffdf659ca __pollsys () + a
 fffffd7ffdf162c4 pselect () + 1d4
 fffffd7ffdf16611 select () + 71
 fffffd7ffebed939 _ZN10QEventLoop13processEventsEj () + 289
 fffffd7ffec52738 _ZN10QEventLoop9enterLoopEv () + 48
 000000000055e928 _ZN18VBoxProgressDialog3runEi () + 68
 000000000055ec62 _ZN19VBoxProblemReporter23showModalProgressDialogER9CProgressRK7QStringP7QWidgeti () + 312
 0000000000583fa9 _ZN14VBoxConsoleWnd16finalizeOpenViewEv () + 1f9
 00000000004b4d2f _ZN14VBoxConsoleWnd9qt_invokeEiP8QUObject () + 44f
 fffffd7ffec97006 _ZN7QObject15activate_signalEP15QConnectionListP8QUObject () + 136
 fffffd7ffeff6999 _ZN7QSignal6signalERK8QVariant () + 99
 fffffd7ffecaf4b8 _ZN7QSignal8activateEv () + 78
 fffffd7ffecb75aa _ZN16QSingleShotTimer5eventEP6QEvent () + 2a
 fffffd7ffec3d93d _ZN12QApplication14internalNotifyEP7QObjectP6QEvent () + 9d
 fffffd7ffec3daef _ZN12QApplication6notifyEP7QObjectP6QEvent () + 7f
 fffffd7ffec3183c _ZN10QEventLoop14activateTimersEv () + 2ac
 fffffd7ffebedf2d _ZN10QEventLoop13processEventsEj () + 87d
 fffffd7ffec52738 _ZN10QEventLoop9enterLoopEv () + 48
 fffffd7ffec5268a _ZN10QEventLoop4execEv () + 2a
 0000000000539e13 main () + 7a3
 00000000004afc4c _start () + 6c
-----------------  lwp# 2 / thread# 2  --------------------
 fffffd7ffdf659ca __pollsys () + a
 fffffd7ffdf11a43 poll () + 63
 fffffd7ffe3764be _pr_poll_with_poll () + 3b4
 fffffd7ffe376649 PR_Poll () + 9
 fffffd7ffcb980c1 _Z10ConnThreadPv () + 41
 fffffd7ffe37831a _pt_root () + 90
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 3 / thread# 3  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf586af __cond_wait () + 5f
 fffffd7ffdf586f3 cond_wait () + 23
 fffffd7ffdf58719 pthread_cond_wait () + 9
 fffffd7ffe3779cd PR_WaitCondVar () + 6b
 fffffd7ffe377cd9 PR_Wait () + 46
 fffffd7ffcb94bba _ZN14DConnectWorker3RunEv () + 124
 fffffd7ffe347ffe _ZN8nsThread4MainEPv () + 2e
 fffffd7ffe37831a _pt_root () + 90
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 4 / thread# 4  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf586af __cond_wait () + 5f
 fffffd7ffdf586f3 cond_wait () + 23
 fffffd7ffdf58719 pthread_cond_wait () + 9
 fffffd7fff2c3930 _Z19rtSemEventMultiWaitP23RTSEMEVENTMULTIINTERNALjb () + 1a0
 fffffd7ffca48b98 _ZN10HGCMThread6MsgGetEPP11HGCMMsgCore () + 38
 fffffd7ffca4908e _Z10hgcmMsgGetjPP11HGCMMsgCore () + 4e
 fffffd7ffca4aa63 _Z10hgcmThreadjPv () + 23
 fffffd7ffca48509 _Z20hgcmWorkerThreadFuncP11RTTHREADINTPv () + 39
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 5 / thread# 5  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf586af __cond_wait () + 5f
 fffffd7ffdf586f3 cond_wait () + 23
 fffffd7ffdf58719 pthread_cond_wait () + 9
 fffffd7fff2c3394 _Z14rtSemEventWaitP18RTSEMEVENTINTERNALjb () + 1a4
 fffffd7ffe757e75 VMR3ReqWait () + a5
 fffffd7ffe758372 VMR3ReqQueue () + 132
 fffffd7ffe75852c VMR3ReqCallVU () + 18c
 fffffd7ffe758615 VMR3ReqCallU () + 85
 fffffd7ffe755886 VMR3Create () + 1a6
 fffffd7ffc9daed9 _ZN7Console13powerUpThreadEP11RTTHREADINTPv () + 1f9
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 6 / thread# 6  --------------------
 fffffd7ffe73f5b6 mmR3PagePoolTerm () + 16
 fffffd7ffe73d34f MMR3Term () + f
 fffffd7ffe73d488 MMR3Init () + a8
 fffffd7ffe755b09 _Z11vmR3CreateUP3UVMPFiP2VMPvES3_ () + 139
 fffffd7ffe75801b _Z18vmR3ReqProcessOneUP3UVMP5VMREQ () + 16b
 fffffd7ffe75820b VMR3ReqProcessU () + 6b
 fffffd7ffe75654d _Z19vmR3EmulationThreadP11RTTHREADINTPv () + bd
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 7 / thread# 7  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf58576 cond_wait_common () + 1d6
 fffffd7ffdf587cc __cond_timedwait () + 9c
 fffffd7ffdf58807 cond_timedwait () + 27
 fffffd7fff333893 umem_update_thread () + 193
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 8 / thread# 8  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf586af __cond_wait () + 5f
 fffffd7ffdf586f3 cond_wait () + 23
 fffffd7ffdf58719 pthread_cond_wait () + 9
 fffffd7fff2c3930 _Z19rtSemEventMultiWaitP23RTSEMEVENTMULTIINTERNALjb () + 1a0
 fffffd7ffca48b98 _ZN10HGCMThread6MsgGetEPP11HGCMMsgCore () + 38
 fffffd7ffca4908e _Z10hgcmMsgGetjPP11HGCMMsgCore () + 4e
 fffffd7ffca496c0 _Z17hgcmServiceThreadjPv () + 30
 fffffd7ffca48509 _Z20hgcmWorkerThreadFuncP11RTTHREADINTPv () + 39
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 9 / thread# 9  --------------------
 fffffd7ffdf659ca __pollsys () + a
 fffffd7ffdf162c4 pselect () + 1d4
 fffffd7ffdf16611 select () + 71
 fffffd7ffd8c8369 IoWait () + 29
 fffffd7ffd8c79a0 _XtWaitForSomething () + 180
 fffffd7ffd8c7529 XtAppNextEvent () + 139
 fffffd7ffd8c73bb XtAppMainLoop () + 3b
 fffffd7ffc456521 _Z19vboxClipboardThreadP11RTTHREADINTPv () + 311
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()

I suspect that this is caused by onnv-gate changeset 6695:

changeset 6695:  	12d7dd4459fd
parent:	7066e93e6b89
author: 	aguzovsk
date: 	Thu May 22 22:23:49 2008 -0700 (13 days ago)
permissions: 	-rw-r--r--
description:
6423097 segvn_pagelock() may perform very poorly
6526804 DR delete_memory_thread, AIO, and segvn deadlock
6557794 segspt_dismpagelock() and segspt_shmadvise(MADV_FREE) may deadlock
6557813 seg_ppurge_seg() shouldn't flush all unrelated ISM/DISM segments
6557891 softlocks/pagelocks of anon pages should not decrement availrmem for memory swapped pages
6559612 multiple softlocks on a DISM segment should decrement availrmem just once
6562291 page_mem_avail() is stuck due to availrmem overaccounting and lack of seg_preap() calls
6596555 locked anonymous pages should not have assigned disk swap slots
6639424 hat_sfmmu.c:hat_pagesync() doesn't handle well HAT_SYNC_STOPON_REF and HAT_SYNC_STOPON_MOD flags
6639425 optimize checkpage() optimizations
6662927 page_llock contention during I/O

The VBoxDrv kernel module calls as_pagelock() and expects it to return a list of shadow pages, but I suspect that after the above putback the returned list is now NULL. This seems to confuse the VBoxDrv module, and it matches the rtR0MemObjNativeLockUser notice ("as_pagelock failed to get shadow pages") in the log above.
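To make the suspected failure mode concrete, here is a minimal user-space sketch in C. It is not VBoxDrv code and it does not call the real as_pagelock(): mock_pagelock(), struct mock_page, lock_user_strict() and lock_user_tolerant() are all hypothetical names invented for illustration. The only assumption carried over from the ticket is that the locking call can succeed while leaving the shadow page list NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel page descriptor. */
struct mock_page { int unused; };

/*
 * Hypothetical mock of a page-locking call; NOT the real as_pagelock().
 * It always returns 0 (success). When give_shadow_list is 0 it leaves
 * *ppp NULL, which is the behaviour the ticket suspects after the putback.
 */
static int mock_pagelock(struct mock_page ***ppp, int give_shadow_list)
{
    static struct mock_page page;
    static struct mock_page *list[] = { &page, NULL };

    *ppp = give_shadow_list ? list : NULL;
    return 0;
}

/* Strict caller: a NULL shadow list is treated as a failure. */
static int lock_user_strict(int give_shadow_list)
{
    struct mock_page **pp = NULL;
    int rc = mock_pagelock(&pp, give_shadow_list);

    if (rc != 0)
        return rc;
    if (pp == NULL)
        return -1;  /* "failed to get shadow pages" */
    return 0;
}

/* Tolerant caller: only the return code decides success. */
static int lock_user_tolerant(int give_shadow_list)
{
    struct mock_page **pp = NULL;

    return mock_pagelock(&pp, give_shadow_list);
}

/* Small self-check that exercises both callers. */
void pagelock_sketch_selfcheck(void)
{
    assert(lock_user_strict(1) == 0);
    assert(lock_user_strict(0) == -1);   /* strict caller rejects NULL list */
    assert(lock_user_tolerant(0) == 0);  /* tolerant caller accepts it */
}
```

Under that assumption, the strict caller starts failing as soon as the list comes back NULL, even though the lock itself succeeded, while the tolerant caller keeps working. Whether a NULL list is always safe to ignore in the real driver is not something this sketch can show.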

The result is that MMR3Init() -> mmR3PagePoolInit() fails in VirtualBox userland. As the lwp 6 stack shows, MMR3Init() then calls MMR3Term() -> mmR3PagePoolTerm(), and that is where the VirtualBox process dumps core.
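The lwp 6 stack (MMR3Init -> MMR3Term -> mmR3PagePoolTerm, SIGSEGV) is consistent with a teardown routine running on state that was never fully initialised. The ticket does not confirm that this is the exact cause, so the following C sketch is only an illustration of that general pattern. struct mm_state, struct page_pool, pool_init(), pool_term() and mm_init() are hypothetical names, not VirtualBox's real structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not VirtualBox's real types. */
struct page_pool { void *pages; };
struct mm_state  { struct page_pool *pool; };

/* Pretend pool init: fails and leaves pool NULL when lock_ok is 0. */
static int pool_init(struct mm_state *mm, int lock_ok)
{
    if (!lock_ok)
        return -1;  /* e.g. the page-locking step failed */
    mm->pool = calloc(1, sizeof(*mm->pool));
    return mm->pool ? 0 : -1;
}

/* Teardown that tolerates a pool that was never created. */
static void pool_term(struct mm_state *mm)
{
    if (mm->pool == NULL)
        return;  /* without this guard the next line dereferences NULL */
    free(mm->pool->pages);
    free(mm->pool);
    mm->pool = NULL;
}

/* Init that cleans up after itself on failure, as the stack suggests. */
static int mm_init(struct mm_state *mm, int lock_ok)
{
    int rc;

    mm->pool = NULL;
    rc = pool_init(mm, lock_ok);
    if (rc != 0)
        pool_term(mm);  /* runs on partially initialised state */
    return rc;
}

/* Small self-check covering the failure path and the success path. */
void mm_sketch_selfcheck(void)
{
    struct mm_state mm;

    assert(mm_init(&mm, 0) == -1);  /* failure path must not crash */
    assert(mm.pool == NULL);
    assert(mm_init(&mm, 1) == 0);
    pool_term(&mm);
    assert(mm.pool == NULL);
}
```

If the guard in pool_term() is removed, the failure path dereferences a NULL pointer, which is one plausible way to end up with a SIGSEGV a few bytes into a Term function.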

Change History

comment:1 follow-up: ↓ 3 Changed 6 years ago by ramshankar

This should be fixed in 1.6.2, where we've introduced a VirtualBox kernel interface to prevent such breakage. It will be out in a few days. Please try 1.6.2 then and see whether the problem persists.

Remember that when installing 1.6.2 you will have to install the VirtualBoxKern package first and then install VirtualBox. Hold on, it's due to be out very soon.

comment:2 Changed 6 years ago by jkeil

I was impatient and recompiled the vboxdrv kernel module from VirtualBox-1.6.0_OSE sources, with the tests for "returned shadow pages pointer != NULL" removed.

That fixed the problem (for now).

(And I'll upgrade to 1.6.2, when it becomes available...)

comment:3 in reply to: ↑ 1 Changed 6 years ago by jkeil

Replying to ramshankar:

This should be fixed in 1.6.2 where we've introduced a VirtualBox kernel interface to prevent such breakage.

Yep, I can confirm that it doesn't core dump any more with VirtualBox 1.6.2 on snv_92.

So I guess this bug can be closed.

comment:4 Changed 6 years ago by frank

  • Status changed from new to closed
  • Resolution set to fixed