VirtualBox

Ticket #16417 (new defect)

Opened 4 years ago

Last modified 3 years ago

vboxsf crashes process or VM in case of many file stat operations on shared vboxsf file system in a short time

Reported by: mrtngrnr Owned by:
Component: shared folders Version: VirtualBox 5.1.14
Keywords: Cc: dusan.vuckovic@…
Guest type: Linux Host type: Linux

Description

We use a vagrant based toolchain to automatically generate VMs for executing unit tests. They run on Ubuntu 16.04 LTS 64-bit hosts, and the VMs themselves contain various different older and newer Linux distributions. We use current VirtualBox 5.1.14 and also the 5.1.14 guest tools.

Quite a significant percentage (10-20%) of our VM deployments was not successful. In case of "default" paravirtualization using KVM, the whole VM freezes, and the VBox.log file just says:

00:00:42.027263 VMMDev: vmmDevHeartbeatFlatlinedTimer: Guest seems to be unresponsive. Last heartbeat received 4 seconds ago

If using the "minimal" paravirtualization, VM operation continues, just the unit test process itself dies (see log below). This happens consistently at the same place where there are many stat() calls made to get the ctime information of many thousands of (empty) cache files on a shared vboxsf file system, before outdated files are supposed to be cleaned up. If I remove this code part, the error does not occur any more. Therefore my (uneducated) assumption is that it could be related to running many stat operations in a short time. Here's the stack trace as found in /var/log/kern.log in the instance.

Jan 21 22:23:14 vagrant kernel: [    8.021389] vboxsf: Successfully loaded version 5.1.14 (interface 0x00010004)
Jan 21 22:23:14 vagrant kernel: [    8.328536] VBoxService 5.1.14 r112924 (verbosity: 0) linux.amd64 (Jan 16 2017 17:38:46) release log
Jan 21 22:23:14 vagrant kernel: [    8.328536] 00:00:00.000093 main     Log opened 2017-01-21T22:23:14.323839000Z
Jan 21 22:23:14 vagrant kernel: [    8.328566] 00:00:00.000154 main     OS Product: Linux
Jan 21 22:23:14 vagrant kernel: [    8.328578] 00:00:00.000170 main     OS Release: 3.16.0-4-amd64
Jan 21 22:23:14 vagrant kernel: [    8.328589] 00:00:00.000182 main     OS Version: #1 SMP Debian 3.16.39-1 (2016-12-30)
Jan 21 22:23:14 vagrant kernel: [    8.328614] 00:00:00.000193 main     Executable: /opt/VBoxGuestAdditions-5.1.14/sbin/VBoxService
Jan 21 22:23:14 vagrant kernel: [    8.328614] 00:00:00.000194 main     Process ID: 1028
Jan 21 22:23:14 vagrant kernel: [    8.328614] 00:00:00.000195 main     Package type: LINUX_64BITS_GENERIC
Jan 21 22:23:14 vagrant kernel: [    8.329284] 00:00:00.000860 main     5.1.14 r112924 started. Verbose level = 0
Jan 21 22:27:01 vagrant kernel: [  233.348410] VBOXGUEST_IOCTL_HGCM_CALL: 64 Failed. rc=-228.
Jan 21 22:27:01 vagrant kernel: [  233.348430] general protection fault: 0000 [#1] SMP 
Jan 21 22:27:01 vagrant kernel: [  233.348461] Modules linked in: vboxsf(O) nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc vboxvideo(O) ttm drm_kms_helper ppdev drm evdev vboxguest(O) crc32_pclmul i2c_piix4 pcspkr i2c_core serio_raw aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd parport_pc battery processor parport video thermal_sys ac button autofs4 ext4 crc16 mbcache jbd2 dm_mod sg sd_mod crc_t10dif crct10dif_generic ata_generic crct10dif_pclmul crct10dif_common crc32c_intel e1000 psmouse ata_piix libata scsi_mod
Jan 21 22:27:01 vagrant kernel: [  233.348614] CPU: 1 PID: 1925 Comm: otrs.UnitTest.p Tainted: G           O  3.16.0-4-amd64 #1 Debian 3.16.39-1
Jan 21 22:27:01 vagrant kernel: [  233.348661] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Jan 21 22:27:01 vagrant kernel: [  233.348689] task: ffff8800797bcce0 ti: ffff88007bdcc000 task.ti: ffff88007bdcc000
Jan 21 22:27:01 vagrant kernel: [  233.348718] RIP: 0010:[<ffffffffa02666c1>]  [<ffffffffa02666c1>] VGDrvCommonIoCtl+0x261/0x1bf0 [vboxguest]
Jan 21 22:27:01 vagrant kernel: [  233.348763] RSP: 0018:ffff88007bdcfb98  EFLAGS: 00010282
Jan 21 22:27:01 vagrant kernel: [  233.348784] RAX: 00000000ffffff1c RBX: 0000a0007bdcfc84 RCX: 0000000000000001
Jan 21 22:27:01 vagrant kernel: [  233.348813] RDX: 0000000000000000 RSI: 000000000000002f RDI: ffff88007bdcf9a0
Jan 21 22:27:01 vagrant kernel: [  233.348842] RBP: 5f7c81f4149bd010 R08: 000000000000000a R09: 00000000000001c0
Jan 21 22:27:01 vagrant kernel: [  233.348870] R10: 0000000000000000 R11: ffff88007bdcf6be R12: 0000b00000000000
Jan 21 22:27:01 vagrant kernel: [  233.348898] R13: 1b3eed7500000000 R14: 5f7c81f4149bd064 R15: 5f7c81f4149bd010
Jan 21 22:27:01 vagrant kernel: [  233.348927] FS:  00007ff073ca7700(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
Jan 21 22:27:01 vagrant kernel: [  233.348958] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 21 22:27:01 vagrant kernel: [  233.348981] CR2: 00007ffec82d3ff8 CR3: 000000007bcd8000 CR4: 00000000000406e0
Jan 21 22:27:01 vagrant kernel: [  233.349008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 21 22:27:01 vagrant kernel: [  233.349033] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 21 22:27:01 vagrant kernel: [  233.349072] Stack:
Jan 21 22:27:01 vagrant kernel: [  233.349082]  001041ed149bd010 000003e900000000 00000002000003e9 013aa00300000902
Jan 21 22:27:01 vagrant kernel: [  233.349114]  0000000000000000 0000000000000000 ffff880023350880 0000006c00000004
Jan 21 22:27:01 vagrant kernel: [  233.349147]  ffff88007bdcfbfc 0000a00000020830 ffff88007b0c6000 ffffffffa03f877c
Jan 21 22:27:01 vagrant kernel: [  233.349178] Call Trace:
Jan 21 22:27:01 vagrant kernel: [  233.349194]  [<ffffffffa03f877c>] ? sf_stat+0x5c/0x110 [vboxsf]
Jan 21 22:27:01 vagrant kernel: [  233.349219]  [<ffffffffa03f88b0>] ? sf_inode_revalidate+0x80/0xd0 [vboxsf]
Jan 21 22:27:01 vagrant kernel: [  233.349246]  [<ffffffff811b5192>] ? __inode_permission+0x22/0xb0
Jan 21 22:27:01 vagrant kernel: [  233.349269]  [<ffffffff811b58e6>] ? link_path_walk+0x256/0x870
Jan 21 22:27:01 vagrant kernel: [  233.349292]  [<ffffffff811c281a>] ? __d_lookup+0x7a/0x150
Jan 21 22:27:01 vagrant kernel: [  233.349979]  [<ffffffffa03f8914>] ? sf_dentry_revalidate+0x14/0x20 [vboxsf]
Jan 21 22:27:01 vagrant kernel: [  233.350641]  [<ffffffff811b47de>] ? lookup_fast+0x25e/0x2b0
Jan 21 22:27:01 vagrant kernel: [  233.351302]  [<ffffffff811b6055>] ? path_lookupat+0x155/0x780
Jan 21 22:27:01 vagrant kernel: [  233.351927]  [<ffffffffa03fa6ee>] ? VbglR0SfCreate+0x7e/0xa0 [vboxsf]
Jan 21 22:27:01 vagrant kernel: [  233.352603]  [<ffffffff811b66a6>] ? filename_lookup+0x26/0xc0
Jan 21 22:27:01 vagrant kernel: [  233.353267]  [<ffffffff811ba8e4>] ? user_path_at_empty+0x54/0x90
Jan 21 22:27:01 vagrant kernel: [  233.353855]  [<ffffffff810ed71e>] ? from_kgid_munged+0xe/0x20
Jan 21 22:27:01 vagrant kernel: [  233.354431]  [<ffffffff811af53a>] ? cp_new_stat+0x13a/0x160
Jan 21 22:27:01 vagrant kernel: [  233.354976]  [<ffffffff811af0e6>] ? vfs_fstatat+0x46/0x90
Jan 21 22:27:01 vagrant kernel: [  233.355536]  [<ffffffff811af57a>] ? SYSC_newstat+0x1a/0x40
Jan 21 22:27:01 vagrant kernel: [  233.356067]  [<ffffffff8151adcd>] ? system_call_fast_compare_end+0x10/0x15
Jan 21 22:27:01 vagrant kernel: [  233.356577] Code: 01 b9 ff ff ff ff 48 89 da 4c 89 d6 4c 89 ef e8 b6 df ff ff 41 89 c7 0f 1f 00 48 83 c4 40 44 89 f8 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 66 0f 1f 44 00 00 49 83 f8 17 0f 86 34 0c 00 00 48 8d 83 00 
Jan 21 22:27:01 vagrant kernel: [  233.358611] RIP  [<ffffffffa02666c1>] VGDrvCommonIoCtl+0x261/0x1bf0 [vboxguest]
Jan 21 22:27:01 vagrant kernel: [  233.359118]  RSP <ffff88007bdcfb98>
Jan 21 22:27:01 vagrant kernel: [  233.359636] ---[ end trace 54b9778e1a5d744b ]---

For reference, here's the Perl code that is used to cleanup the (empty cache files):

my $Wanted = sub {

    # Skip nonregular files and directories.
    return if ( !-f $File::Find::name );
    my $Stat = File::stat::stat($File::Find::name);
    if ( $Stat && ( time() - $Stat->ctime() > $CacheTTLSeconds ) ) {    ## no critic
        unlink $File::Find::name;    # Don't die here, this could be removed in the background by another process too.
    }
};
File::Find::find( $Wanted, $CacheDir );

Less often, a similar error occurs during the vagrant provisioning of the instances, where we copy the source code of our project from the /vagrant shared vboxsf folder into the VM. Here just a simple "cp -r" is used during the deployment. It could be the same problem:

==> default: Copying OTRS code to /opt/otrs
==> default: cp: cannot stat ‘/vagrant/Sources/otrs/var/httpd/htdocs/js/thirdparty/ckeditor-4.5.6/plugins/mathjax/lang/cy.js’: Protocol error

Please let me know if I should try to provide additional information, and thanks a lot for taking a look at this.

Attachments

ubuntu-xenial-16.04-cloudimg-console.log Download (47.2 KB) - added by Thi 3 years ago.
Vagrant log file
VBox.log Download (73.2 KB) - added by Thi 3 years ago.
VirtualBox log file
1500480023.png Download (95.1 KB) - added by PabloViquez 3 years ago.
Machine2
Mac15_retina.png Download (99.4 KB) - added by PabloViquez 3 years ago.
Machine1
ubuntu-xenial-16.04-cloudimg-console.2.log Download (43.4 KB) - added by PabloViquez 3 years ago.
Error Log
ubuntu-xenial-16.04-cloudimg-console.3.log Download (39.1 KB) - added by PabloViquez 3 years ago.
Machine 3, Linux ubuntu-xenial 4.10.0-24-generic #28~16.04.1-Ubuntu SMP Wed Jun 14 11:58:14 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
vboxsf.ko Download (68.6 KB) - added by PabloViquez 3 years ago.
lib/modules/4.10.0-24-generic/kernel/ubuntu/vbox/vboxsf/vboxsf.ko

Change History

comment:1 Changed 4 years ago by mrtngrnr

I forgot to mention that we use the following mount options for the vboxsf folders:

mount_options: ['umask=000','fmode=664','dmode=775']

comment:2 Changed 3 years ago by frank

Definitely interesting but perhaps there is a reproduction scenrio which does not need a vagrant toolchain.

comment:3 Changed 3 years ago by frank

Oh, and I would still appreciate a VBox.log file of such a VM which hangs or crashes under heavy load...

Changed 3 years ago by Thi

Vagrant log file

Changed 3 years ago by Thi

VirtualBox log file

comment:4 Changed 3 years ago by Thi

Hello,

Same kernel panic here, also while using Vagrant.

See Vagrant and VirtualBox logs in ticket's attachements.

comment:5 Changed 3 years ago by frank

Thi, thanks for the logs! Could you also provide the vboxsf.ko module from the guest where the crash happens? Thank you!

Changed 3 years ago by PabloViquez

Machine2

Changed 3 years ago by PabloViquez

Machine1

Changed 3 years ago by PabloViquez

Error Log

comment:6 Changed 3 years ago by PabloViquez

It started to happen to me as well.

I have 2 Mac machines running Virtual Box 5.1.22, one one of them is experiencing the issue.

Machine 1 (do not have the issue) Machine1

Machine 2 (Does experience the issue) Machine2

Error Log: Error Log

[ 3208.333482] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffc0362b1d
[ 3208.333482]
[ 3208.388327] CPU: 0 PID: 8859 Comm: php Not tainted 4.4.0-83-generic #106-Ubuntu
[ 3208.442679] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 3208.488888]  0000000000000086 0000000005648689 ffff88007b947a10 ffffffff813f9513
[ 3208.590333]  ffffffff81cba598 ffff88007b947aa8 ffff88007b947a98 ffffffff8118d607
[ 3208.694710]  ffffffff00000010 ffff88007b947aa8 ffff88007b947a40 0000000005648689
[ 3208.771867] Call Trace:
[ 3208.797156]  [<ffffffff813f9513>] dump_stack+0x63/0x90
[ 3208.850053]  [<ffffffff8118d607>] panic+0xd3/0x215
[ 3208.916594]  [<ffffffffc0362b1d>] ? sf_stat+0xfd/0x100 [vboxsf]
[ 3208.988018]  [<ffffffffc03a58a5>] ? rtR0MemFree+0x55/0x70 [vboxguest]
[ 3209.054548]  [<ffffffff81081119>] __stack_chk_fail+0x19/0x20
[ 3209.117923]  [<ffffffffc0362b1d>] sf_stat+0xfd/0x100 [vboxsf]
[ 3209.184369]  [<ffffffffc0364c4e>] ? VbglR0SfCreate+0x8e/0xb0 [vboxsf]
[ 3209.205510]  [<ffffffffc0362c13>] ? sf_dentry_revalidate+0x13/0x30 [vboxsf]
[ 3209.281477]  [<ffffffff8121a3af>] ? lookup_fast+0x24f/0x330
[ 3209.341436]  [<ffffffff8121ba79>] ? walk_component+0x49/0x300
[ 3209.406658]  [<ffffffff8121d001>] ? link_path_walk+0x191/0x5b0
[ 3209.450785]  [<ffffffff8121b46b>] ? path_init+0x1eb/0x3c0
[ 3209.480281]  [<ffffffff8121d51c>] ? path_lookupat+0x7c/0x110
[ 3209.557010]  [<ffffffff8121f141>] ? filename_lookup+0xb1/0x180
[ 3209.609035]  [<ffffffff811ed747>] ? kmem_cache_alloc+0x187/0x1f0
[ 3209.661132]  [<ffffffff8121ed46>] ? getname_flags+0x56/0x1f0
[ 3209.701208]  [<ffffffff8121f2e6>] ? user_path_at_empty+0x36/0x40
[ 3209.789207]  [<ffffffff81214076>] ? vfs_fstatat+0x66/0xc0
[ 3209.853415]  [<ffffffff81214631>] ? SYSC_newlstat+0x31/0x60
[ 3209.892152]  [<ffffffff8121476e>] ? SyS_newlstat+0xe/0x10
[ 3209.942207]  [<ffffffff81840b72>] ? entry_SYSCALL_64_fastpath+0x16/0x71
[ 3210.000542] Kernel Offset: disabled
[ 3210.070708] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffc0362b1d
[ 3210.070708]
Last edited 3 years ago by PabloViquez (previous) (diff)

Changed 3 years ago by PabloViquez

Machine 3, Linux ubuntu-xenial 4.10.0-24-generic #28~16.04.1-Ubuntu SMP Wed Jun 14 11:58:14 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Changed 3 years ago by PabloViquez

lib/modules/4.10.0-24-generic/kernel/ubuntu/vbox/vboxsf/vboxsf.ko

comment:7 Changed 3 years ago by PabloViquez

Same issue after upgrading kernel, this time 4.10.0-24-generic

Virtual Box version: 5.1.22

Ubuntu 16.04.2 LTS ubuntu-xenial ttyS0

ubuntu-xenial login: [  598.928632] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffc04e0863
[  598.928632] 
[  598.984839] CPU: 0 PID: 4948 Comm: php Not tainted 4.10.0-24-generic #28~16.04.1-Ubuntu
[  599.008107] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  599.048215] Call Trace:
[  599.061841]  dump_stack+0x63/0x90
[  599.072364]  panic+0xe4/0x22d
[  599.084014]  ? VbglR0HGCMInternalCall+0xe83/0xe90 [vboxguest]
[  599.103309]  ? rtR0MemFree+0x55/0x70 [vboxguest]
[  599.125034]  __stack_chk_fail+0x19/0x20
[  599.136182]  VbglR0HGCMInternalCall+0xe83/0xe90 [vboxguest]
[  599.159284]  ? vgdrvHgcmAsyncWaitCallbackWorker+0x220/0x220 [vboxguest]
[  599.186149]  ? vbglDriverIOCtl+0x16/0x20 [vboxsf]
[  599.195686]  ? VbglHGCMCall+0x43/0x90 [vboxsf]
[  599.220713]  ? VbglR0SfCreate+0x8e/0xb0 [vboxsf]
[  599.230384]  ? sf_stat+0x5d/0x100 [vboxsf]
[  599.245906]  ? VGDrvCommonIoCtl+0x263/0x1bf0 [vboxguest]
[  599.272188]  ? VBoxGuestIDCCall+0x64/0x70 [vboxguest]
[  599.288242]  ? vbglDriverIOCtl+0x16/0x20 [vboxsf]
[  599.307753]  ? VbglHGCMCall+0x43/0x90 [vboxsf]
[  599.312484]  ? VbglR0SfCreate+0x8e/0xb0 [vboxsf]
[  599.325318]  ? sf_stat+0x5d/0x100 [vboxsf]
[  599.328750]  ? sf_inode_revalidate+0x8c/0xe0 [vboxsf]
[  599.344254]  ? __check_object_size+0x108/0x1e3
[  599.353403]  ? strncpy_from_user+0x4d/0x170
[  599.355276]  ? sf_getattr+0x1e/0x40 [vboxsf]
[  599.360343]  ? apparmor_inode_getattr+0x1d/0x20
[  599.376027]  ? vfs_getattr_nosec+0x29/0x40
[  599.383014]  ? vfs_getattr+0x26/0x30
[  599.384696]  ? vfs_fstatat+0x78/0xc0
[  599.401083]  ? SYSC_newstat+0x2e/0x60
[  599.413682]  ? SyS_mmap_pgoff+0xf8/0x290
[  599.428594]  ? SyS_newstat+0xe/0x10
[  599.435137]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
[  599.459465] Kernel Offset: 0x39a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  599.462856] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffc04e0863
[  599.462856] 

Attaching:

  1. vboxsf.ko lib/modules/4.10.0-24-generic/kernel/ubuntu/vbox/vboxsf/vboxsf.ko
  2. ubuntu-xenial-16.04-cloudimg-console.3.log Machine 3, Linux ubuntu-xenial 4.10.0-24-generic #28~16.04.1-Ubuntu SMP Wed Jun 14 11:58:14 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Note: See TracTickets for help on using tickets.

www.oracle.com
ContactPrivacy policyTerms of Use