VirtualBox

Opened 4 years ago

Last modified 3 years ago

#19133 assigned defect

VM lockup when guest opens a named pipe in a shared folder

Reported by: lesha
Owned by:
Component: shared folders
Version: VirtualBox 6.0.14
Keywords: named pipe, fifo, lockup
Cc:
Guest type: all
Host type: all

Description (last modified by Frank Batschulat (Oracle))

I discovered this bug because I was rsyncing a shared folder containing a named pipe, and this hung my machine.

Repro steps:

  • Set up a Linux guest on a Mac host. I have Ubuntu 18.04 running on VBox 6.0.14 (latest stable as of now).
  • Set up a shared folder
  • On the host, mkfifo TEST_PIPE in the folder
  • On the guest, cat TEST_PIPE

At this point, the guest is (partially) locked up. Specifically:

  • Neither Ctrl-C nor SIGKILL works on cat: it is hung in D state and cannot be killed
  • Any process touching the shared folder hangs likewise
  • dmesg will show the following stack for the hung cat https://pastebin.com/WjKQaZys [1]
  • The guest cannot be cleanly rebooted because shutdown requires tearing down the guest additions module, which is blocked on talking to the VM.
  • Powering off the VM (not from the guest, but from the UI!) will hang.

So this is not a hang in guest additions, but in the VM code itself.

My guess is that the thread handling shared folders is locked up trying to read from the named pipe.

What confirms this is that the hang is resolved the moment that I do echo > TEST_PIPE on the host.
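The blocking described here matches ordinary FIFO semantics on a POSIX host: opening a FIFO read-only waits until some process opens it for writing. The following is a minimal Python sketch of that host-side behavior only. The helper names `open_fifo_nonblocking` and `read_available` are hypothetical, and nothing here is taken from the VirtualBox shared folder code.

```python
import os


def open_fifo_nonblocking(path):
    """Open a FIFO for reading without waiting for a writer.

    A plain os.open(path, os.O_RDONLY) on a FIFO blocks until some
    process opens it for writing; O_NONBLOCK makes the open return
    immediately instead.
    """
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def read_available(fd, size=4096):
    """Return whatever can be read right now, or b"" if there is nothing."""
    try:
        # With no writer connected, read() reports end-of-file (b"").
        return os.read(fd, size)
    except BlockingIOError:
        # A writer is connected but has not written anything yet.
        return b""
```

On a host, `open_fifo_nonblocking("TEST_PIPE")` returns at once even with no writer, whereas a blocking open would wait until something like `echo > TEST_PIPE` is run.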

The reason this is not expected behavior is that the guest sees the named pipe as a regular file:

$ ls -l TEST_PIPE
-rwxrwx--- 1 root vboxsf 0 Dec 6 16:21 TEST_PIPE

Two fixes seem possible:

  • Make named pipes act as true named pipes to the host. If you google "vboxsf named pipe", this is actually a feature that was previously requested.
  • Make the open syscall fail in this context. This is worse than working pipes, but better than a lockup. At the moment, data does not actually travel down the named pipe: if the host does echo foo > TEST_PIPE and the guest does cat TEST_PIPE, the guest simply says cat: TEST_PIPE: Protocol error.

[1]

$ sudo cat /proc/7376/stack
[<0>] rtR0SemEventMultiLnxWait.isra.2+0x33d/0x370 [vboxguest]
[<0>] VBoxGuest_RTSemEventMultiWaitEx+0xe/0x10 [vboxguest]
[<0>] VBoxGuest_RTSemEventMultiWait+0x28/0x30 [vboxguest]
[<0>] vgdrvHgcmAsyncWaitCallbackWorker+0x1c3/0x210 [vboxguest]
[<0>] VGDrvCommonIoCtl+0x489/0x18e0 [vboxguest]
[<0>] VBoxGuestIDC+0x149/0x160 [vboxguest]
[<0>] VbglR0IdcCallRaw+0x13/0x20 [vboxsf]
[<0>] VbglR0HGCMFastCall+0x1c/0x20 [vboxsf]
[<0>] vbsf_reg_open+0x291/0x4f0 [vboxsf]
[<0>] do_dentry_open+0x1c2/0x310
[<0>] vfs_open+0x4f/0x80
[<0>] path_openat+0x6bf/0x1900
[<0>] do_filp_open+0x9b/0x110
[<0>] do_sys_open+0x1bb/0x2c0
[<0>] SyS_openat+0x14/0x20
[<0>] do_syscall_64+0x73/0x130
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0xffffffffffffffff

Change History (10)

comment:1 by Frank Batschulat (Oracle), 4 years ago

Owner: set to Frank Batschulat (Oracle)
Status: new → accepted

Thanks for this excellent bug report!

The fix strategy is likely going to be to show the host's file as a pipe (currently it is shown as a regular file), but not to support any activity on such a pipe file from the guest side, and to fail such attempts in the guest with ENOTSUP.

Last edited 4 years ago by Frank Batschulat (Oracle)

comment:3 by Frank Batschulat (Oracle), 4 years ago

Description: modified

comment:4 by Frank Batschulat (Oracle), 4 years ago

Guest type: Linux → all
Host type: Mac OS X → all

comment:5 by Frank Batschulat (Oracle), 4 years ago

I could reproduce this on Solaris and Linux guests so far, and on Mac OS X and Linux hosts; most likely all platforms are affected.

comment:6 by Frank Batschulat (Oracle), 4 years ago

Summary: Mac-hosted VM lockup when guest opens a named pipe in a shared folder → VM lockup when guest opens a named pipe in a shared folder

comment:7 by Frank Batschulat (Oracle), 4 years ago

To illustrate the description in more detail, this is what happens when you attempt to use the host's named pipe/fifo from inside the guest:

[fbatschu@localhost sf_Music]$ ls -la /media/sf_Music/TEST_PIPE
-rwxrwx---. 1 root vboxsf 0 Aug 10 14:53 /media/sf_Music/TEST_PIPE

[fbatschu@localhost sf_Music]$ file TEST_PIPE
TEST_PIPE: empty

[fbatschu@localhost sf_Music]$ stat TEST_PIPE
  File: TEST_PIPE
  Size: 0         	Blocks: 0          IO Block: 1048576 regular empty file
Device: 27h/39d	Inode: 128         Links: 1
Access: (0770/-rwxrwx---)  Uid: (    0/    root)   Gid: (  975/  vboxsf)
Context: system_u:object_r:vmblock_t:s0
Access: 2020-08-10 14:53:46.069672000 +0200
Modify: 2020-08-10 14:53:46.069672000 +0200
Change: 2020-08-10 14:53:46.069672000 +0200
 Birth: 2020-08-10 14:53:46.069672000 +0200

### now open the named pipe/fifo file:

[fbatschu@localhost sf_Music]$ cat TEST_PIPE

### now process is stuck in the guest, cannot CTRL+C the process.

[fbatschu@localhost ~]$ sudo cat /proc/3274/stack
[<0>] rtR0SemEventMultiLnxWait.isra.0+0x2e2/0x3b0 [vboxguest]
[<0>] vgdrvHgcmAsyncWaitCallbackWorker+0xcf/0x220 [vboxguest]
[<0>] VGDrvCommonIoCtl+0x47f/0x1900 [vboxguest]
[<0>] VBoxGuestIDC+0x113/0x130 [vboxguest]
[<0>] vbsf_reg_open+0x23a/0x4b0 [vboxsf]
[<0>] do_dentry_open+0x13a/0x380
[<0>] path_openat+0x998/0xfb0
[<0>] do_filp_open+0x7e/0xd0
[<0>] do_sys_openat2+0x1f1/0x2a0
[<0>] do_sys_open+0x34/0x60
[<0>] do_syscall_64+0x5b/0x1c0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

### write something on the host into the named pipe/fifo:

fbatschu@lserver:~/Music$ echo bla > TEST_PIPE

### and the cat in the guest which has the named pipe/fifo file
### open fails with an error

[fbatschu@localhost sf_Music]$ cat TEST_PIPE
cat: TEST_PIPE: Protocol error

### from inside the guest you cannot create a named pipe/fifo
### file in the shared folder file system:

[fbatschu@localhost sf_Music]$ mkfifo TEST_PIPE_2
mkfifo: cannot create fifo 'TEST_PIPE_2': Operation not permitted

### guest writes data to named pipe/fifo file:

[fbatschu@localhost sf_Music]$ echo "guest" > TEST_PIPE
-bash: TEST_PIPE: Invalid argument
[fbatschu@localhost sf_Music]$ echo $?
1

### host process reading from named pipe/fifo unblocks
### but receives no data:

fbatschu@lserver:~/Music$ cat TEST_PIPE
fbatschu@lserver:~/Music$ echo $?
0

Last edited 4 years ago by Frank Batschulat (Oracle)

comment:8 by Frank Batschulat (Oracle), 4 years ago

So we have the following situation:

1) The named pipe/fifo special file on the host is presented as a regular file in the guest's shared folder file system.

2) You cannot create a named pipe/fifo special file in the shared folder file system from the guest.

3) In the guest, the shared folder file system behaves as if the file were a named pipe/fifo even though it does not appear to be one. That is, a read blocks until the other end of the named pipe/fifo in the host file system actually writes data to it, which unblocks the guest's access to that file, i.e. the behavior of an ordinary named pipe/fifo.

4) Although 3) looks like named pipe behavior, it is not really. Once the host's sending end has written something into the named pipe/fifo special file, the guest unblocks but fails with "Protocol error", and no data travels from the host's sending end to the guest's receiving end.

5) Writing data in the guest's shared folder file system to this "regular"/named pipe file blocks until there is a reader on the other end of the named pipe on the host side. Once there is a reader at the host end of the named pipe/fifo, the guest's writing process unblocks, i.e. as in 3) we sort of have the behavior of an ordinary named pipe/fifo.

6) Although 5) looks like named pipe/fifo behavior, it is not really. Once the receiving end of the named pipe on the host is connected, the guest process's write to the named pipe/fifo special file fails with EINVAL, and no data travels to the host process reading from the named pipe/fifo.

7) When a process in the guest reads from or writes to that file in the shared folder file system that is supposed to be a named pipe/fifo special file (from the host's perspective at least), you can no longer interrupt the process with CTRL+C or a signal, e.g. kill -15, while there is no process on the other (host) end of the named pipe/fifo. This is not the case for a process doing the same on a regular file system.

Last edited 4 years ago by Frank Batschulat (Oracle)

comment:9 by Frank Batschulat (Oracle), 4 years ago

So whether or not the current implementation is a non-working attempt to also deal with named pipes between the host and the guest or was never intended for such host/guest IPC at all, we need to fix it somehow.

The current fix approach will be:

1) host: deny any open() or mmap() of a host-side file of type pipe, and return ENOTSUP to the guest's attempt

2) host: return file attributes for such a file that reflect the type pipe to the guest

3) guest: show the file correctly as a pipe, as can be seen in any other Linux file system, i.e.:

$ ls -lisa fifo
14287767 0 prw-r--r-- 1 fbatschu staff 0 Sep 14 12:32 fifo
$ stat fifo
  File: fifo
  Size: 0         	Blocks: 0          IO Block: 4096   fifo
Device: 801h/2049d	Inode: 14287767    Links: 1
Access: (0644/prw-r--r--)  Uid: ( 1000/fbatschu)   Gid: (   50/   staff)
Access: 2020-09-14 12:32:35.993088487 +0200
Modify: 2020-09-14 12:32:35.993088487 +0200
Change: 2020-09-14 12:32:35.993088487 +0200
 Birth: -

4) guest: fail an attempt to open() or mmap() such a file with ENOTSUP. We need to figure out whether to channel the host error through to the guest or to add a shortcut directly in the guest's vboxsf.

5) guest: change the error returned when the guest attempts to create a pipe from EPERM (Operation not permitted) to ENOTSUP (Operation not supported). After all, this is not a permission failure or a question of missing privileges; rather, we do not support this operation on vboxsf at all.
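The "deny open() of a pipe with ENOTSUP" idea in 1) and 4) can be illustrated in user space. The following is only a sketch in Python, not the VirtualBox implementation: the helper name `open_unless_fifo` is hypothetical, and the check assumes the file type does not change between the stat and the open.

```python
import errno
import os
import stat


def open_unless_fifo(path, flags=os.O_RDONLY):
    """Open path, but refuse named pipes with ENOTSUP instead of blocking."""
    # os.stat() only reads metadata, so it returns immediately even
    # when no process has the FIFO open on the other end.
    if stat.S_ISFIFO(os.stat(path).st_mode):
        raise OSError(errno.ENOTSUP, os.strerror(errno.ENOTSUP), path)
    # Assumption: nothing replaces the file with a FIFO between the
    # stat above and this open.
    return os.open(path, flags)
```

With such a guard, a caller that hits a FIFO gets an immediate "Operation not supported" error and stays killable, rather than waiting indefinitely for the other end of the pipe.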

comment:10 by Frank Batschulat (Oracle), 4 years ago

Owner: Frank Batschulat (Oracle) removed
Status: accepted → assigned

comment:11 by incident41, 3 years ago

Just adding that I also hit this one: Oracle Linux 7U9 host running VirtualBox 6.1.22_144080, Windows 10 guest, VM hang upon performing shared folder operations with a named pipe in the root of the shared folder.

When the Windows VM hung, the only option was to kill -9 the VirtualBox process at host level.

pstack of VirtualBox from the host showed the shared folder thread stuck in open64(); some juggling and strace'ing showed the named pipe being the argument of open64().

Removing the named pipe made all hangs disappear at once.

The part I can't quite explain is what changed to cause this lockup behavior, since the named pipe file was dated 2012 and I carry over the filesystem from laptop to laptop, using my Windows VM at least a couple times a week; lockups started somewhere between end of May and beginning of June 2021.
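Until a fix lands, one way to apply the "remove the named pipe" workaround is to look for FIFOs in the shared folder on the host before the guest touches it. The following is a minimal Python sketch, to be run on the host; the function name `find_fifos` is hypothetical. It relies on lstat only, so it never opens the pipes it finds.

```python
import os
import stat


def find_fifos(root):
    """Yield paths of named pipes under root without opening any file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk lists every non-directory entry, FIFOs included,
        # under filenames.
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                # Entry vanished or is unreadable; skip it.
                continue
            if stat.S_ISFIFO(mode):
                yield path
```

For example, `list(find_fifos("/path/to/shared/folder"))` (the path is a placeholder) returns the pipes that could then be moved out of the share or excluded from an rsync run.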
