VirtualBox

Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#16677 closed defect (fixed)

A reliable way to hang AioMgr0-N under certain conditions

Reported by: deletedgmailduetoJamesDamore_firing Owned by:
Component: virtual disk Version: VirtualBox 5.1.18
Keywords: Cc:
Guest type: Linux Host type: Linux

Description (last modified by Frank Mehnert)

A reliable way to hang AioMgr0-N when 'fs readahead'(hdparm -a) is on (and is 256(KiB)) but only when 'Use Host I/O Cache' is disabled in virtualbox 5.1.18 r114002 (currently latest available) on Arch Linux 64bit, host OS. Basically, reading from guest SATA HDD and/or IDE CD will hang AioMgr0-N(until killed) on host OS and guest OS will fail to read/boot etc.

This seems to be virtualbox settings-agnostic, except for the 'Use Host I/O Cache' which has to be disabled for this to occurr AND guest OS readahead has to be enabled (usually set to 256KiB; able to repro with 16KiB too, with different dd 'skip' settings). So, this doesn't ocurr when hdparm -a0 /dev/sda was executed, or any value between 0 and 7, inclusive, (keeps readahead 'off')

Note:

hdparm -a0 /dev/sda

seems to do the same as

# echo 0 > /sys/class/block/sda/queue/read_ahead_kb

Steps to reproduce this issue on a guest HDD/vdi (as opposed to a guest CD/IDE/iso which is in the 'next section'):

  1. get archlinux-2017.03.01-dual.iso (or any current/latest iso, BUT then the dd repro will most likely not work at the same 'skip'!)

However according to this comment from frank https://www.virtualbox.org/ticket/15281#comment:8 , I shouldn't be posting any such links here, but then reproducing this(with accuracy) is going to be really hard without them, so I'm going to be obscuring them with base64, hopefully that's a good compromise between not posting them and posting them...

Pick a download location(476.0 MB):

archlinux webpage with torrent/magnet links, located here:

aHR0cHM6Ly93d3cuYXJjaGxpbnV4Lm9yZy9yZWxlbmcvcmVsZWFzZXMvMjAxNy4wMy4wMS8=

(my)google drive, directory with the iso file:

aHR0cHM6Ly9kcml2ZS5nb29nbGUuY29tL29wZW4/aWQ9MEI2UC05VXlKRmp6S1ZsWkhUa1E0YUZWbGRUUQ==

sha1sum for that iso is:

f426866ca632a35a3eeae8e4080cff25ec8da614  archlinux-2017.03.01-dual.iso
  1. get my VM the reproduction steps only work with my VM with that specific vdi file; if that vdi gets cloned(even if to same format: vdi) it will not reproduce anymore! The IDE CD repro. steps will still work and make it hang reliably though (see 'next section')

Pick one download location(1.9G):

dropbox:

aHR0cHM6Ly93d3cuZHJvcGJveC5jb20vc2gvaHFiMGIwNDB4aTYyMDN2L0FBRE50RVF5Z0tGU1duNnpRaDNZS1VVcmE/ZGw9MA==

(my)google drive:

aHR0cHM6Ly9kcml2ZS5nb29nbGUuY29tL2RyaXZlL2ZvbGRlcnMvMEI2UC05VXlKRmp6S1Rrc3lZM0pZWlhSbU1Xcz91c3A9c2hhcmluZw==

download these 3 files: arch3.vbox, arch3.vdi, checksums

md5,sha1,sha256 and sha512 for those two are:

1aa8124771fd7c8761435ff26ee2d55b  ./arch3.vbox
f15c97a939cda4c6bdd413fb52a9519d  ./arch3.vdi
9c1a10c0ab0eb945b677faee22ef9ab2d1ba8511  ./arch3.vbox
a9ed9c0c3542b9d174a4b3fa2b89d1fd72f249f3  ./arch3.vdi
7ac7315a292311f19f59af86bbba687885813d8da7a3f263a0bb2031369fe1f1  ./arch3.vbox
d6141ff8690d1c114fd2e8067107473e456a8b30e196fac17d601f1da564f676  ./arch3.vdi
57742f80911f72186fe6de168e310da84e9d6f39c5683b3f7669abfdf57708f1965b923d5b7462d4af82dc75ddd83ea5cbbcef27330b833f8dabe7da7e5a0d9d  ./arch3.vbox
1e752ad68de113d41999591d105798acaff130716ec81f88325fc9474e160e761eef305fd87885bdf500f1e400e3ec31f32c43c4b5a924304f4027133611e880  ./arch3.vdi
  1. TODO: "import" the .vbox somehow in the VirtualBox UI! (export/import wouldn't have worked because it clones my vdi; and imports it as vmdk)
  2. ensure 'Controller: IDE' has 'Use Host I/O Cache' enabled (it does by default when creating new VM) this means CD access won't hang
  3. ensure 'Controller: SATA' has 'Use Host I/O Cache' disabled (it's disabled by default when creating new VM) this means HDD access will hang, below.
  4. start, Boot from CD/iso, you see grub "Boot Arch Linux (x86_64)", press Enter to boot from the iso (do not boot HDD's OS, it will hang during boot sequence)
  5. you get to the root prompt, auto logged in.

info:

This is the default, 256KiB of readahead:

# hdparm -a /dev/sda



/dev/sda:
 readahead     = 256 (on)

and

# hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
	Model Number:       VBOX HARDDISK                           
	Serial Number:      VB78ecd1fa-712db085 
	Firmware Revision:  1.0     
Standards:
	Used: ATA/ATAPI-6 published, ANSI INCITS 361-2002 
	Supported: 6 5 4 
Configuration:
	Logical		max	current
	cylinders	16383	16383
	heads		16	16
	sectors/track	63	63
	--
	CHS current addressable sectors:    16514064
	LBA    user addressable sectors:    25165824
	LBA48  user addressable sectors:    25165824
	Logical/Physical Sector size:           512 bytes
	device size with M = 1024*1024:       12288 MBytes
	device size with M = 1000*1000:       12884 MBytes (12 GB)
	cache/buffer size  = 256 KBytes (type=DualPortCache)
Capabilities:
	LBA, IORDY(cannot be disabled)
	Queue depth: 32
	Standby timer values: spec'd by Vendor, no device specific minimum
	R/W multiple sector transfer: Max = 128	Current = 128
	DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 
	     Cycle time: min=120ns recommended=120ns
	PIO: pio0 pio1 pio2 pio3 pio4 
	     Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
	Enabled	Supported:
	   *	Power Management feature set
	   *	Write cache
	   *	Look-ahead
	   *	48-bit Address feature set
	   *	Mandatory FLUSH_CACHE
	   *	FLUSH_CACHE_EXT
	   *	Gen2 signaling speed (3.0Gb/s)
	   *	Native Command Queueing (NCQ)
Checksum: correct
  1. how to hang (boot session 3 is most interesting, below)

No hang when the following: ----

boot session 1:

dd if=/dev/sda of=/dev/null bs=512 skip=42208 count=8

boot session 2:

dd if=/dev/sda of=/dev/null bs=512 skip=42216 count=2

boot session 3:

dd if=/dev/sda of=/dev/null bs=512 skip=42215 count=1
dd if=/dev/sda of=/dev/null bs=512 skip=42216 count=1

Hang when the following: ----

boot session 1:

dd if=/dev/sda of=/dev/null bs=512 skip=42208 count=9

boot session 2:

dd if=/dev/sda of=/dev/null bs=512 skip=42215 count=2

boot session 3:

dd if=/dev/sda of=/dev/null bs=512 skip=42215 count=2

To exit the hang, just kill the PID of AioMgr0-N, eg. in 'top', press Shift+H to show threads names (maybe pressing 'c' is needed to toggle between cmdline/names, depending on your 'top' settings) then see the PID of AioMgr0-N, it should be at the top of the list using CPU (if sorted by %CPU, press x to see sort column) it's using about 22% constantly, unless you have set 3 CPUs in VM, then it's 100% (aka 1 core used fully) That kills the VM window, but not the VirtualBox UI window, which is what I want anyway.

What you see in guest's dmesg after the hang:

[   16.159742] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s3: link becomes ready
[  291.480202] ata1.00: exception Emask 0x0 SAct 0x1000000 SErr 0x0 action 0x6 frozen
[  291.480544] ata1.00: failed command: READ FPDMA QUEUED
[  291.480821] ata1.00: cmd 60/20:c0:e8:a4:00/00:00:00:00:00/40 tag 24 ncq dma 16384 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  291.481437] ata1.00: status: { DRDY }
[  291.481653] ata1: hard resetting link
[  291.792860] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  291.793245] ata1.00: configured for UDMA/133
[  291.793251] ata1.00: device reported invalid CHS sector 0
[  291.793270] ata1: EH complete
[  323.906810] ata1.00: exception Emask 0x0 SAct 0x8000000 SErr 0x0 action 0x6 frozen
[  323.907248] ata1.00: failed command: READ FPDMA QUEUED
[  323.907515] ata1.00: cmd 60/20:d8:e8:a4:00/00:00:00:00:00/40 tag 27 ncq dma 16384 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  323.908164] ata1.00: status: { DRDY }
[  323.908868] ata1: hard resetting link
[  324.223058] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  324.224317] ata1.00: configured for UDMA/133
[  324.224323] ata1.00: device reported invalid CHS sector 0
[  324.224342] ata1: EH complete
[  354.626843] ata1.00: exception Emask 0x0 SAct 0x4000 SErr 0x0 action 0x6 frozen
[  354.628012] ata1.00: failed command: READ FPDMA QUEUED
[  354.628631] ata1.00: cmd 60/20:70:e8:a4:00/00:00:00:00:00/40 tag 14 ncq dma 16384 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  354.631160] ata1.00: status: { DRDY }
[  354.631854] ata1: hard resetting link
[  354.942422] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  354.943605] ata1.00: configured for UDMA/133
[  354.943613] ata1.00: device reported invalid CHS sector 0
[  354.943637] ata1: EH complete
[  387.053504] ata1.00: NCQ disabled due to excessive errors
[  387.053508] ata1.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x6 frozen
[  387.054265] ata1.00: failed command: READ FPDMA QUEUED
[  387.054859] ata1.00: cmd 60/20:08:e8:a4:00/00:00:00:00:00/40 tag 1 ncq dma 16384 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  387.057247] ata1.00: status: { DRDY }
[  387.057842] ata1: hard resetting link
[  387.373510] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  387.374544] ata1.00: configured for UDMA/133
[  387.374553] ata1.00: device reported invalid CHS sector 0
[  387.374588] ata1: EH complete
[  419.480161] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  419.481049] ata1.00: failed command: READ DMA
[  419.481819] ata1.00: cmd c8/00:20:e8:a4:00/00:00:00:00:00/e0 tag 19 dma 16384 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  419.485077] ata1.00: status: { DRDY }
[  419.485813] ata1: hard resetting link
[  419.799146] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[  419.799713] ata1.00: configured for UDMA/133
[  419.799719] ata1.00: device reported invalid CHS sector 0
[  419.799736] ata1: EH complete
[  451.906813] ata1: limiting SATA link speed to 1.5 Gbps
[  451.906817] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  451.907568] ata1.00: failed command: READ DMA
[  451.908150] ata1.00: cmd c8/00:20:e8:a4:00/00:00:00:00:00/e0 tag 6 dma 16384 in
                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  451.911967] ata1.00: status: { DRDY }
[  451.912716] ata1: hard resetting link
[  452.225783] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[  452.226126] ata1.00: configured for UDMA/133
[  452.226126] ata1.00: device reported invalid CHS sector 0
[  452.226126] sd 0:0:0:0: [sda] tag#6 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[  452.226126] sd 0:0:0:0: [sda] tag#6 Sense Key : 0x5 [current] 
[  452.226127] sd 0:0:0:0: [sda] tag#6 ASC=0x21 ASCQ=0x4 
[  452.226127] sd 0:0:0:0: [sda] tag#6 CDB: opcode=0x28 28 00 00 00 a4 e8 00 00 20 00
[  452.226127] blk_update_request: I/O error, dev sda, sector 42216
[  452.227928] ata1: EH complete

In the real world when 'Use Host I/O Cache' is disabled:

  • for the VM's IDE CD drive: I cannot boot from the ArchLinux or Manjaro iso (it will hang similarily)
  • for the VM's SATA HDD drive: cannot boot existing arch linux install

(also seen this on other guest linux distros; it's not an ArchLinux thing)


next section:

For hanging when reading /dev/sr0 aka 'the guest OS CD drive with archlinux-2017.03.01-dual.iso', when booting the OS from HDD,

two ways to reproduce:

Note: You cannot reproduce if readahead is off, eg. hdparm -a0 /dev/sr0

(also note, that -a256 being the default, the same dd 'skip' values from below will not work for a different readahead value)

way 1.

  1. to make HDD not hang AioMgr, clone it first (this magically gets rid of any AioMgr0-N hangs for some reason)

$ VBoxManage clonehd --format VDI arch3.vdi /tmp/cloned.vdi (obvious: to clone, first remove arch3.vdi from the VM, clone, then add cloned.vdi to the VM)

  1. set both IDE and SATA to have disabled 'Use Host I/O Cache'

c) boot from HDD(it boots from iso, at grub menu choose second option which is: 'Boot existing OS'); Don't boot iso's OS because it hangs instantly at Loading boot/x86_64/vmlinuz... and hangs AioMgr1-N with 99.9% cpu usage until you kill PID of AioMgr1-N which closes that VM's window.

way 2.

  1. if you didn't clone the vdi
  2. set only IDE(where iso is) on disbled 'Use Host I/O Cache', but let SATA(where vdi is) on enabled 'Use Host I/O Cache'
  3. boot, it will boot from iso, choose 'Boot existing OS' to boot from HDD(vdi)

way1&2 commons:

  1. at root prompt(you have to login as 'root', password is 'a'):

info:

[root@vm ~]# hdparm -a /dev/sr0

/dev/sr0:
 readahead     = 256 (on)
[root@vm ~]# hdparm -A /dev/sr0

/dev/sr0:
 HDIO_DRIVE_CMD(identify) failed: Input/output error

No hang when the following:

boot session 1:

dd if=/dev/sr0 of=/dev/null bs=512 skip=88038 count=2
dd if=/dev/sr0 of=/dev/null bs=512 skip=88040 count=2

boot session 2:

dd if=/dev/sr0 of=/dev/null bs=512 skip=88038 count=1
dd if=/dev/sr0 of=/dev/null bs=512 skip=88039 count=1
dd if=/dev/sr0 of=/dev/null bs=512 skip=88040 count=1

Hang when the following:

boot session 1:

# dd if=/dev/sr0 of=/dev/null bs=512 skip=88039 count=2

boot session 2:

# dd if=/dev/sr0 of=/dev/null bs=512 skip=88039 count=2

way1:

# time dd if=/dev/sr0 of=/dev/null bs=512 skip=88039 count=2

dd: error reading '/dev/sr0': Input/output error
1+0 records in
1+0 records out
512 bytes copied, 101.679 s, 0.0 kB/s

real	1m41.692s
user	0m0.000s
sys	0m0.007s

top:

AioMgr1-N (yep, 1 not 0) constant CPU usage of 19.6% -> 21.6% ...even after dd finishes!

dmesg:

[   89.179123] systemd-logind[196]: New session c2 of user root.
[  129.346812] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  129.347540] ata3.00: cmd a0/01:00:00:00:40/00:00:00:00:00/a0 tag 0 dma 16384 in
                        opcode=0x28 28 00 00 00 55 fa 00 00 08 00res 40/00:02:00:0c:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
[  129.355906] ata3.00: status: { DRDY }
[  134.400076] ata3: link is slow to respond, please be patient (ready=0)
[  139.386826] ata3: device not ready (errno=-16), forcing hardreset
[  139.387505] ata3: soft resetting link
[  144.586725] ata3: link is slow to respond, please be patient (ready=0)
[  149.403407] ata3: SRST failed (errno=-16)
[  149.404249] ata3: soft resetting link
[  154.603401] ata3: link is slow to respond, please be patient (ready=0)
[  159.420725] ata3: SRST failed (errno=-16)
[  159.421558] ata3: soft resetting link
[  164.620145] ata3: link is slow to respond, please be patient (ready=0)
[  194.426716] ata3: SRST failed (errno=-16)
[  194.427477] ata3: soft resetting link
[  199.456775] ata3: SRST failed (errno=-16)
[  199.471749] ata3: reset failed, giving up
[  199.472636] ata3.00: disabled
[  199.474100] ata3: EH complete
[  199.474938] sr 2:0:0:0: [sr0] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
[  199.476392] sr 2:0:0:0: [sr0] tag#0 CDB: opcode=0x28 28 00 00 00 55 fa 00 00 08 00
[  199.491977] blk_update_request: I/O error, dev sr0, sector 88040
[  199.494604] sr 2:0:0:0: [sr0] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
[  199.496175] sr 2:0:0:0: [sr0] tag#0 CDB: opcode=0x28 28 00 00 00 55 fa 00 00 02 00
[  199.497683] blk_update_request: I/O error, dev sr0, sector 88040
[  199.498588] Buffer I/O error on dev sr0, logical block 11005, async page read


way2:

# time dd if=/dev/sr0 of=/dev/null bs=512 skip=88039 count=2
dd: error reading '/dev/sr0': Input/output error
1+0 records in
1+0 records out
512 bytes copied, 102.138 s, 0.0 kB/s

real	1m42.147s
user	0m0.007s
sys	0m0.000s

top:

AioMgr0-N (0 here) constant CPU usage of 19.6% -> 21.6% (100% would mean 1 core is fully used) ...even after dd finishes!

dmesg:

[  102.492843] systemd-logind[202]: New session c2 of user root.
[  102.494941] systemd[1]: Started Session c2 of user root.
[  139.586919] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  139.587909] ata3.00: cmd a0/01:00:00:00:40/00:00:00:00:00/a0 tag 0 dma 16384 in
                        opcode=0x28 28 00 00 00 55 fa 00 00 08 00res 40/00:02:00:0c:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
[  139.592029] ata3.00: status: { DRDY }
[  144.633536] ata3: link is slow to respond, please be patient (ready=0)
[  149.620205] ata3: device not ready (errno=-16), forcing hardreset
[  149.621023] ata3: soft resetting link
[  154.820197] ata3: link is slow to respond, please be patient (ready=0)
[  159.636906] ata3: SRST failed (errno=-16)
[  159.637643] ata3: soft resetting link
[  164.836889] ata3: link is slow to respond, please be patient (ready=0)
[  169.653533] ata3: SRST failed (errno=-16)
[  169.654227] ata3: soft resetting link
[  174.854340] ata3: link is slow to respond, please be patient (ready=0)
[  204.660185] ata3: SRST failed (errno=-16)
[  204.660805] ata3: soft resetting link
[  209.690159] ata3: SRST failed (errno=-16)
[  209.703203] ata3: reset failed, giving up
[  209.704066] ata3.00: disabled
[  209.704786] ata3: EH complete
[  209.705448] sr 2:0:0:0: [sr0] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
[  209.719212] sr 2:0:0:0: [sr0] tag#0 CDB: opcode=0x28 28 00 00 00 55 fa 00 00 08 00
[  209.720520] blk_update_request: I/O error, dev sr0, sector 88040
[  209.722434] sr 2:0:0:0: [sr0] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
[  209.724036] sr 2:0:0:0: [sr0] tag#0 CDB: opcode=0x28 28 00 00 00 55 fa 00 00 02 00
[  209.725383] blk_update_request: I/O error, dev sr0, sector 88040
[  209.726154] Buffer I/O error on dev sr0, logical block 11005, async page read


my host is Lenovo Z575, kernel: Linux z5 4.11.0-rc7-g4f7d029b9bf0 #151 SMP PREEMPT Wed Apr 19 18:36:02 CEST 2017 x86_64 GNU/Linux

virtualbox 5.1.18 r114002 on Arch Linux 64bit

Potentially relevant tickets: #15281, #13022

Attachments (10)

arch3-2017-04-20-22-24-23.log (54.2 KB ) - added by deletedgmailduetoJamesDamore_firing 7 years ago.
When booted from iso(/dev/sr0) and hanged HDD(/dev/sda) with dd 42215 count=2 (see step 8)
arch3-2017-04-20-23-14-44.log (59.8 KB ) - added by deletedgmailduetoJamesDamore_firing 7 years ago.
same log as before, but after 50 more minutes and after killing the hanged AioMgr0-N thread
arch3_1-2017-04-21-00-18-30.log (56.5 KB ) - added by deletedgmailduetoJamesDamore_firing 7 years ago.
hung AioMgr1-N with the IDE/iso steps
arch3_1-2017-04-21-00-51-21.log (56.0 KB ) - added by deletedgmailduetoJamesDamore_firing 7 years ago.
hung with -a16 for a 2 byte dd of /dev/sr0
arch3_1-2017-04-21-01-11-40.log (62.0 KB ) - added by deletedgmailduetoJamesDamore_firing 7 years ago.
powereoff guest(didn't finish) after the above without killing AioMgr1-N, new status lines in log; no new lines after killing it
le.patch (2.0 KB ) - added by deletedgmailduetoJamesDamore_firing 7 years ago.
The .patch to show the output from comment#11 in VBox.log
le.2.patch (7.5 KB ) - added by deletedgmailduetoJamesDamore_firing 7 years ago.
does not hang AioMgr1-N anymore(see line with comment 'fix?')
pdm_async_completion_incomplete_req.patch (3.3 KB ) - added by aeichner 7 years ago.
arch3_1-2017-04-22-09-23-31.log (83.8 KB ) - added by deletedgmailduetoJamesDamore_firing 7 years ago.
one VM's log after the perfection patch
arch3-2017-04-22-09-23-31.log (83.8 KB ) - added by deletedgmailduetoJamesDamore_firing 7 years ago.
another VM's log after the perfection patch

Download all attachments as: .zip

Change History (30)

by deletedgmailduetoJamesDamore_firing, 7 years ago

When booted from iso(/dev/sr0) and hanged HDD(/dev/sda) with dd 42215 count=2 (see step 8)

by deletedgmailduetoJamesDamore_firing, 7 years ago

same log as before, but after 50 more minutes and after killing the hanged AioMgr0-N thread

comment:1 by deletedgmailduetoJamesDamore_firing, 7 years ago

Briefly tried to hang AioMgr1-N (with the IDE/iso steps from above in 'next section', boot session 2) with the just-released(in ArchLinux testing repo, I mean) VirtualBox 5.1.20 r114628 (but I did recompile it myself though)

[root@vm ~]# dd if=/dev/sr0 of=/dev/null bs=512 skip=88039 count=2

dd: error reading '/dev/sr0': Input/output error
1+0 records in
1+0 records out
512 bytes copied, 102.141 s, 0.0 kB/s

dmesg:

[   47.309063] mandb[208]: Updating index cache for path `/usr/share/man/man3'. Wait...
[  110.573626] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  110.574843] ata2.00: cmd a0/01:00:00:00:40/00:00:00:00:00/a0 tag 0 dma 16384 in
                        opcode=0x28 28 00 00 00 55 fa 00 00 08 00res 40/00:02:00:0c:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
[  110.586192] ata2.00: status: { DRDY }
[  115.633487] ata2: link is slow to respond, please be patient (ready=0)
[  120.623465] ata2: device not ready (errno=-16), forcing hardreset
[  120.624545] ata2: soft resetting link
[  125.823550] ata2: link is slow to respond, please be patient (ready=0)
[  130.640159] ata2: SRST failed (errno=-16)
[  130.641283] ata2: soft resetting link
[  135.840245] ata2: link is slow to respond, please be patient (ready=0)
[  140.690183] ata2: SRST failed (errno=-16)
[  140.691178] ata2: soft resetting link
[  145.890241] ata2: link is slow to respond, please be patient (ready=0)
[  165.753379] mandb[208]: /usr/bin/mandb: warning: /usr/share/man/man3/gnutls_x509_trust_list_add_system_trust.3.gz: whatis parse for gnutls_x509_trust_list_add_system_trust(3) failed
[  175.697630] ata2: SRST failed (errno=-16)
[  175.698734] ata2: soft resetting link
[  180.727614] ata2: SRST failed (errno=-16)
[  180.763306] ata2: reset failed, giving up
[  180.766737] ata2.00: disabled
[  180.767882] ata2: EH complete
[  180.768853] sr 1:0:0:0: [sr0] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
[  180.771402] sr 1:0:0:0: [sr0] tag#0 CDB: opcode=0x28 28 00 00 00 55 fa 00 00 08 00
[  180.773281] blk_update_request: I/O error, dev sr0, sector 88040
[  180.784906] sr 1:0:0:0: [sr0] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
[  180.786796] sr 1:0:0:0: [sr0] tag#0 CDB: opcode=0x28 28 00 00 00 55 fa 00 00 02 00
[  180.788529] blk_update_request: I/O error, dev sr0, sector 88040
[  180.789697] Buffer I/O error on dev sr0, logical block 11005, async page read

Killing the hung AioMgr1-N thread yielded no new lines in .log file, tried Refresh. Attaching log next...

by deletedgmailduetoJamesDamore_firing, 7 years ago

hung AioMgr1-N with the IDE/iso steps

comment:2 by deletedgmailduetoJamesDamore_firing, 7 years ago

Found more equivalents for this line:

dd if=/dev/sr0 of=/dev/null bs=512 skip=88039 count=2

as the following:

dd if=/dev/sr0 of=/dev/null bs=1 skip=45075968 count=513
dd if=/dev/sr0 of=/dev/null bs=1 skip=45076479 count=2

no hang with:

dd if=/dev/sr0 of=/dev/null bs=1 skip=45076479 count=1

The above are with the default:

hdparm -a256 /dev/sr0

Here's one with -a16 (actually just tested, the following works the same with -a256 too! but the above do not work with -a16)

no hang with:

[root@vm ~]# dd if=/dev/sr0 of=/dev/null bs=1 skip=45084671 count=1
1+0 records in
1+0 records out
1 byte copied, 0.00255341 s, 0.4 kB/s
[root@vm ~]# dd if=/dev/sr0 of=/dev/null bs=1 skip=45084672 count=1
1+0 records in
1+0 records out
1 byte copied, 0.00293639 s, 0.3 kB/s
[root@vm ~]# dd if=/dev/sr0 of=/dev/null bs=1 skip=45084672 count=2
2+0 records in
2+0 records out
2 bytes copied, 0.00257371 s, 0.8 kB/s

hang with:

[root@vm ~]# dd if=/dev/sr0 of=/dev/null bs=1 skip=45084671 count=2
dd: error reading '/dev/sr0': Input/output error
1+0 records in
1+0 records out
1 byte copied, 101.377 s, 0.0 kB/s

hangs AioMgr1-N with 99.9% cpu usage! even after dd finished

killed the AioMgr1-N PID via 'kill number' but no new lines appeared in log(unlike the second .log I've posted in this thread, the one where I kill it after 50mins)

Attaching Vbox.log next.

by deletedgmailduetoJamesDamore_firing, 7 years ago

hung with -a16 for a 2 byte dd of /dev/sr0

comment:3 by deletedgmailduetoJamesDamore_firing, 7 years ago

btw, the above 'no hang with' and 'hang with' commands are all in the same boot session, which makes me wonder how come guest linux didn't cache the bytes/readahead-sectors(?) from the previous commands, so that the last command would then not fail. But, most likely I don't know how caching works :) Maybe it only happens for the file system...

The guest OS kernel is/was: Linux vm 4.10.9-1-ARCH #1 SMP PREEMPT Sat Apr 8 12:39:59 CEST 2017 x86_64 GNU/Linux

Updated (pacman -Syu) and kernel is now: Linux vm 4.10.10-1-ARCH #1 SMP PREEMPT Wed Apr 12 18:50:28 CEST 2017 x86_64 GNU/Linux

Can still reproduce the previous comment:

[root@vm ~]# uname -a
Linux vm 4.10.10-1-ARCH #1 SMP PREEMPT Wed Apr 12 18:50:28 CEST 2017 x86_64 GNU/Linux
[root@vm ~]# hdparm -a16 /dev/sr0

/dev/sr0:
 setting fs readahead to 16
 readahead     = 16 (on)
[root@vm ~]# dd if=/dev/sr0 of=/dev/null bs=1 skip=45084671 count=1
1+0 records in
1+0 records out
1 byte copied, 0.00228378 s, 0.4 kB/s
[root@vm ~]# dd if=/dev/sr0 of=/dev/null bs=1 skip=45084672 count=1
1+0 records in
1+0 records out
1 byte copied, 0.00230094 s, 0.4 kB/s
[root@vm ~]# dd if=/dev/sr0 of=/dev/null bs=1 skip=45084672 count=2
2+0 records in
2+0 records out
2 bytes copied, 0.00265822 s, 0.8 kB/s
[root@vm ~]# time dd if=/dev/sr0 of=/dev/null bs=1 skip=45084671 count=2
dd: error reading '/dev/sr0': Input/output error
1+0 records in
1+0 records out
1 byte copied, 101.113 s, 0.0 kB/s

real	1m41.123s
user	0m0.003s
sys	0m0.003s
[root@vm ~]# 

by deletedgmailduetoJamesDamore_firing, 7 years ago

powereoff guest(didn't finish) after the above without killing AioMgr1-N, new status lines in log; no new lines after killing it

comment:4 by deletedgmailduetoJamesDamore_firing, 7 years ago

I've just discovered that the reason why cloning the vdi didn't trigger this issue anymore is because I kept the cloned vdi in tmpfs (/tmp). By moving it back from /tmp to SSD and removing it from Virtual Media Manager(release then remove), then adding it again to the same VM then, only now 8 hours later, tried booting from 'existing OS' AioMgr0-N hangs again during boot. Then, I copied it back to /tmp (tmpfs) and removed from Virtual Media Manager(release then remove) then added the /tmp one to the same VM => no hang! So this makes me think that the location/storage of the vdi is what's making AioMgr hang. There are no bad sectors on my SSD, though.

I've also just tested exporting the VM, then importing it; this creates a .vmdk disk file, stored on my SSD(on host) and booting from 'existing OS' does not cause the AioMgr hang. I then removed the vmdk from the VM via UI, cloned it via VBoxManage clonehd --format VDI arch3_1-disk002.vmdk /tmp/arch3_1-disk002.vdi and then added the clone to the VM, no hang because vdi is in tmpfs; then removed it from Virtual Media Manager(release and remove), moved the vdi on the SSD, added it to the VM, boot from existing OS => bam, hang!

So this tells me, vdi (but not vmdk) location or storage is a major factor to hanging AioMgr: hangs if vdi is on my SSD but not if it's on /tmp tmpfs.

Last edited 7 years ago by deletedgmailduetoJamesDamore_firing (previous) (diff)

comment:5 by deletedgmailduetoJamesDamore_firing, 7 years ago

The above is true for the iso hangs too: the iso was on my SSD until now and just tried to hang it to make sure it still hangs and it does (via dd if=/dev/sr0 of=/dev/null bs=1 skip=45084671 count=2); then copied the iso to /tmp (tmpfs) then removed the old iso from the CD drive of the VM, browsed for the one in /tmp and added it to the VM as CD, booted 'existing OS'(that doesn't hang) then tried to hang with the same dd, no hang, tried a full-on dd (via dd if=/dev/sr0 of=/dev/null) and no hang!

So, the OP with reproduction is then true only if the vdi and/or iso (depending on which you're trying to hang) are located on my host's SSD, but not when they are located in /tmp (tmpfs)

Last edited 7 years ago by deletedgmailduetoJamesDamore_firing (previous) (diff)

comment:6 by deletedgmailduetoJamesDamore_firing, 7 years ago

Setting queue depth from here https://www.virtualbox.org/ticket/13022#comment:12 has no effect on this issue and I confirmed it by cat /sys/block/sda/device/queue_depth returning '1' (even though I only used the VBoxManage command to set it: VBoxManage setextradata arch3 "VBoxInternal/Devices/ahci/0/Config/CmdSlotsAvail" 1) and then was able to hang AioMgr with the dd line from step 8 boot session 3 from OP: dd if=/dev/sda of=/dev/null bs=512 skip=42215 count=2 .

comment:7 by Frank Mehnert, 7 years ago

Description: modified (diff)
Summary: A reliable way to hang AioMgr0-N when 'fs readahead'(hdparm -a) is on (and is 256(KiB)) but only when 'Use Host I/O Cache' is disabled, for IDE(iso) and SATA(vdi) devicesA reliable way to hang AioMgr0-N under certain conditions

comment:8 by Frank Mehnert, 7 years ago

Description: modified (diff)

comment:9 by Frank Mehnert, 7 years ago

Description: modified (diff)

comment:10 by deletedgmailduetoJamesDamore_firing, 7 years ago

Figured out how to attach gdb to the hung PID of AioMgr1-N, when hanging /dev/sr0, looks like this:

$ sudo gdb -p 7577
[sudo] password for xftroxgpx: 
GNU gdb (GDB) 7.12.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 7577
Reading symbols from /usr/lib/virtualbox/VirtualBox...done.
Reading symbols from /usr/lib/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Reading symbols from /usr/lib/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/virtualbox/VBoxRT.so...done.
Reading symbols from /usr/lib/libcrypt.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/librt.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxml2.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libcurl.so.4...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libssl.so.1.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libcrypto.so.1.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libstdc++.so.6...done.
Reading symbols from /usr/lib/libgcc_s.so.1...done.
Reading symbols from /usr/lib/liblzma.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libm.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libssh2.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpsl.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libkrb5.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libk5crypto.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libcom_err.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libicuuc.so.58...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libicudata.so.58...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libkrb5support.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libkeyutils.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libresolv.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/virtualbox/VirtualBox.so...done.
Reading symbols from /usr/lib/libXinerama.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libGL.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/virtualbox/VBoxKeyboard.so...done.
Reading symbols from /usr/lib/libXext.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libX11.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/virtualbox/VBoxXPCOM.so...done.
Reading symbols from /usr/lib/libQt5X11Extras.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libQt5Core.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libQt5Gui.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libQt5Widgets.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libQt5PrintSupport.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libQt5OpenGL.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXau.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXdmcp.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libGLX.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libGLdispatch.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libicui18n.so.58...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libsystemd.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpcre16.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libglib-2.0.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpng16.so.16...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libharfbuzz.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libcap.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/liblz4.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgcrypt.so.20...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgpg-error.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpcre.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libfreetype.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgraphite2.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libbz2.so.1.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/qt/plugins/platforms/libqxcb.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libQt5XcbQpa.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-xinerama.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libQt5DBus.so.5...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libX11-xcb.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXi.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libSM.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libICE.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-xkb.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-render-util.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-render.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-sync.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-xfixes.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-randr.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-image.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-shm.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-keysyms.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-icccm.so.4...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-shape.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxkbcommon-x11.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxkbcommon.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libfontconfig.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libdbus-1.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libuuid.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-util.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libexpat.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXcursor.so.1...done.
Reading symbols from /usr/lib/libXrender.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXfixes.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/lib/qt/plugins/xcbglintegrations/libqxcb-glx-integration.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-glx.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/qt/plugins/platforminputcontexts/libcomposeplatforminputcontextplugin.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libGLX_mesa.so.0...done.
Reading symbols from /usr/lib/libxcb-dri3.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-present.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxshmfence.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libglapi.so.0...done.
Reading symbols from /usr/lib/libXdamage.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libxcb-dri2.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libXxf86vm.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libdrm.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/xorg/modules/dri/r600_dri.so...done.
Reading symbols from /usr/lib/libdrm_nouveau.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libdrm_radeon.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libdrm_amdgpu.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libelf.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libLLVM-3.9.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/../lib/libffi.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/../lib/libedit.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/../lib/libncursesw.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libtxc_dxtn.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/virtualbox/components/VBoxXPCOMIPCC.so...done.
Reading symbols from /usr/lib/libnss_compat.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libnsl.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libnss_nis.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libnss_files.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/virtualbox/components/VBoxDDU.so...done.
Reading symbols from /usr/lib/virtualbox/components/VBoxREM.so...done.
Reading symbols from /usr/lib/virtualbox/VBoxVMM.so...done.
Reading symbols from /usr/lib/virtualbox/components/VBoxRT.so...done.
Reading symbols from /usr/lib/virtualbox/components/VBoxVMM.so...done.
Reading symbols from /usr/lib/virtualbox/components/VBoxXPCOM.so...done.
Reading symbols from /usr/lib/virtualbox/components/VBoxC.so...done.
Reading symbols from /usr/lib/libvpx.so.4...(no debugging symbols found)...done.
Reading symbols from /usr/lib/virtualbox/components/VBoxSVCM.so...done.
Reading symbols from /usr/lib/qt/plugins/imageformats/libqgif.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/qt/plugins/imageformats/libqico.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/qt/plugins/imageformats/libqjpeg.so...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libjpeg.so.8...(no debugging symbols found)...done.
Reading symbols from /usr/lib/virtualbox/VBoxSharedClipboard.so...done.
Reading symbols from /usr/lib/libXt.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/virtualbox/VBoxDragAndDropSvc.so...done.
Reading symbols from /usr/lib/virtualbox/VBoxGuestPropSvc.so...done.
Reading symbols from /usr/lib/virtualbox/VBoxGuestControlSvc.so...done.
Reading symbols from /usr/lib/virtualbox/VBoxDD.so...done.
Reading symbols from /usr/lib/virtualbox/VBoxDD2.so...done.
Reading symbols from /usr/lib/virtualbox/VBoxSharedFolders.so...done.
0x00007f71d3414889 in syscall () from /usr/lib/libc.so.6
(gdb) set pagination off
(gdb) thread apply all bt full

Thread 1 (Thread 0x7f71709fa700 (LWP 7577)):
#0  0x00007f71d3414889 in syscall () from /usr/lib/libc.so.6
No symbol table info available.
#1  0x00007f71d304e147 in rtFileAsyncIoLinuxSubmit (pcSubmitted=<synthetic pointer>, ppIoCB=0x7f71709f9d60, cReqs=1, AioContext=<optimized out>) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/Runtime/r3/linux/fileaio-linux.cpp:261
        rc = <optimized out>
#2  RTFileAioCtxSubmit (hAioCtx=0x7f71742fd160, pahReqs=pahReqs@entry=0x7f71709f9d60, cReqs=cReqs@entry=1) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/Runtime/r3/linux/fileaio-linux.cpp:607
        cReqsSubmitted = 0
        rc = <optimized out>
        pCtxInt = 0x7f71742fd160
        i = <optimized out>
        pReqInt = <optimized out>
#3  0x00007f71b64485c0 in pdmacFileAioMgrNormalReqsEnqueue (cReqs=1, pahReqs=0x7f71709f9d60, pEndpoint=0x7f7174300830, pAioMgr=0x7f7174300c40) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/VMM/VMMR3/PDMAsyncCompletionFileNormal.cpp:557
        rc = <optimized out>
#4  pdmacFileAioMgrNormalReqCompleteRc (cbTransfered=4096, rcReq=0, hReq=<optimized out>, pAioMgr=0x7f7174300c40) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/VMM/VMMR3/PDMAsyncCompletionFileNormal.cpp:1509
        offStart = <optimized out>
        cbToTransfer = <optimized out>
        pbBuf = <optimized out>
        rc = <optimized out>
        pEndpoint = <optimized out>
        pTask = <optimized out>
#5  pdmacFileAioMgrNormalReqComplete (hReq=<optimized out>, pAioMgr=0x7f7174300c40) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/VMM/VMMR3/PDMAsyncCompletionFileNormal.cpp:1342
        cbTransfered = 4096
        rcReq = 0
#6  pdmacFileAioMgrNormal (hThreadSelf=<optimized out>, pvUser=0x7f7174300c40) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/VMM/VMMR3/PDMAsyncCompletionFileNormal.cpp:1674
        i = 0
        apReqs = {0x7f71480008c0, 0xffffffff, 0x7f71709f9e00, 0x1ba1250, 0x7f71709fa700, 0x7f7174301180, 0x7f71709f9e00, 0x7f71d304ae3c <RTSemRWRequestWriteNoResume(RTSEMRW, RTMSINTERVAL)+156>, 0x300000000, 0x396b675ed5f9600, 0x7f71709fa700, 0x1ba1250, 0x7f71709f9e00, 0x7f71d304b1ce <RTSemRWReleaseWrite(RTSEMRW)+142>, 0x74301a60, 0xffffffffffffffff, 0x7f71a4909348, 0x7f7174301180, 0x7f71709f9e40, 0x7f71d2fb349f <rtThreadInsert(PRTTHREADINT, RTNATIVETHREAD)+223>}
        cReqsCompleted = 1
        cReqsWait = <optimized out>
        uMillisCurr = <optimized out>
        rc = <optimized out>
        pAioMgr = 0x7f7174300c40
        uMillisEnd = <optimized out>
        __PRETTY_FUNCTION__ = "int pdmacFileAioMgrNormal(RTTHREAD, void*)"
#7  0x00007f71d2fb40f0 in rtThreadMain (pThread=0x7f7174301180, NativeThread=<optimized out>, pszThreadName=<optimized out>) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/Runtime/common/misc/thread.cpp:717
        rc = <optimized out>
#8  0x00007f71d304b94b in rtThreadNativeMain (pvArgs=0x7f7174301180) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/Runtime/r3/posix/thread-posix.cpp:327
        pThread = 0x7f7174301180
        Self = 140125197543168
        rc = <optimized out>
        __PRETTY_FUNCTION__ = "void* rtThreadNativeMain(void*)"
#9  0x00007f71d38dc2e7 in start_thread () from /usr/lib/libpthread.so.0
No symbol table info available.
#10 0x00007f71d341954f in clone () from /usr/lib/libc.so.6
No symbol table info available.
(gdb) 
$ pacman -Qo /usr/lib/libc.so.6
/usr/lib/libc.so.6 is owned by glibc 2.25-1
(gdb) continue
Continuing.

After a few seconds of AioMgr1-N still hanging using CPU, I then did a: kill -2 7577 from another terminal.

Program received signal SIGINT, Interrupt.
0x00007f71d3414889 in syscall () from /usr/lib/libc.so.6
(gdb) thread apply all bt full

Thread 1 (Thread 0x7f71709fa700 (LWP 7577)):
#0  0x00007f71d3414889 in syscall () from /usr/lib/libc.so.6
No symbol table info available.
#1  0x00007f71d304e44d in rtFileAsyncIoLinuxGetEvents (pTimeout=0x0, paIoResults=0x7f71709f94c0, cReqs=1, cReqsMin=1, AioContext=<optimized out>) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/Runtime/r3/linux/fileaio-linux.cpp:289
        rc = <optimized out>
#2  RTFileAioCtxWait (hAioCtx=0x7f71742fd160, cMinReqs=cMinReqs@entry=1, cMillies=<optimized out>, cMillies@entry=4294967295, pahReqs=pahReqs@entry=0x7f71709f9d70, cReqs=1, pcReqs=pcReqs@entry=0x7f71709f9d50) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/Runtime/r3/linux/fileaio-linux.cpp:709
        aPortEvents = {{pvUser = 0x7f7144000950, pIoCB = 0x7f71480008c0, rc = 4096, rc2 = 0}, {pvUser = 0x0, pIoCB = 0x0, rc = 0, rc2 = 0} <repeats 21 times>, {pvUser = 0x0, pIoCB = 0x0, rc = 0, rc2 = 140126859542208}, {pvUser = 0x0, pIoCB = 0x7f0, rc = 28848304, rc2 = 140126846543672}, {pvUser = 0x7f71d2eacb48, pIoCB = 0x7f71d3afc7d3 <do_lookup_x+2019>, rc = 2032, rc2 = 140126846634824}, {pvUser = 0x1b830b0, pIoCB = 0x7f71709f9878, rc = 140125197539444, rc2 = 401}, {pvUser = 0x7f71b625f4b2, pIoCB = 0x7f71b624d770, rc = 140125197539448, rc2 = 3134804711}, {pvUser = 0x2eb654b, pIoCB = 0x7f71d2eacb48, rc = 140125197539664, rc2 = 140126846543672}, {pvUser = 0x7f71709f9874, pIoCB = 0x7f71709f9940, rc = 30333904, rc2 = 83}, {pvUser = 0x709f98c8, pIoCB = 0x68b9a337, rc = 27453068, rc2 = 0}, {pvUser = 0x7f71709f99a0, pIoCB = 0x7f7180002070, rc = 140125197539808, rc2 = 0}, {pvUser = 0x5, pIoCB = 0x0, rc = 140125197539848, rc2 = 140126859545825}, {pvUser = 0x5, pIoCB = 0x0, rc = 5, rc2 = 0}, {pvUser = 0x1, pIoCB = 0x1e8f570, rc = 5, rc2 = 0}, {pvUser = 0x0, pIoCB = 0x7f7180002070, rc = 140125197539664, rc2 = 4294967296}, {pvUser = 0x1e8f570, pIoCB = 0x7f71709f9940, rc = 140126363972786, rc2 = 140125455523472}, {pvUser = 0xffffffff, pIoCB = 0x7f71d3afc7d3 <do_lookup_x+2019>, rc = 140126846592440, rc2 = 140126859542208}, {pvUser = 0x7f71709f99a0, pIoCB = 0x216, rc = 28848304, rc2 = 140126846543672}, {pvUser = 0x7f71d2eacb48, pIoCB = 0x7f71d3afc7d3 <do_lookup_x+2019>, rc = 534, rc2 = 140126846634824}, {pvUser = 0x1b830b0, pIoCB = 0x7f71709f9a38, rc = 140125197539892, rc2 = 140126369477656}, {pvUser = 0x7f71b625f46e, pIoCB = 0x7f71b624b100, rc = 140125197539896, rc2 = 1236669803}, {pvUser = 0x126d855, pIoCB = 0x7f71d2eacb48, rc = 140125197540112, rc2 = 140126846543672}, {pvUser = 0x7f71709f9a34, pIoCB = 0x7f71709f9b00, rc = 30333904, rc2 = 83}, {pvUser = 0x0, pIoCB = 0x0, rc = 0, rc2 = 0}, {pvUser = 0x0, pIoCB = 0x7f7180002070, rc = 140125197540256, rc2 = 0}, {pvUser = 0x5, pIoCB = 0x0, rc = 140125197540296, rc2 = 140126859545825}, {pvUser = 0x5, pIoCB = 0x0, rc = 5, rc2 = 0}, {pvUser = 0x1, pIoCB = 0x1e8f570, rc = 0, rc2 = -256}, {pvUser = 0x0, pIoCB = 0x7f7180002070, rc = 140125197540112, rc2 = 6907273216}, {pvUser = 0x1e8f570, pIoCB = 0x7f71709f9b00, rc = 140126363972718, rc2 = 140125455523472}, {pvUser = 0xffffffff, pIoCB = 0x7f71d304e147 <RTFileAioCtxSubmit(RTFILEAIOCTX, PRTFILEAIOREQ, size_t)+359>, rc = 140126846556488, rc2 = 140125197540336}, {pvUser = 0x7f7174300c40, pIoCB = 0x1, rc = 140125197540224, rc2 = 140126365969813}, {pvUser = 0x7f71709f9b80, pIoCB = 0x7f71d304da3f <RTFileAioReqPrepareRead(RTFILEAIOREQ, RTFILE, RTFOFF, void*, size_t, void*)+95>, rc = 140124515993792, rc2 = 0}, {pvUser = 0x7f7144000950, pIoCB = 0x7f7174300830, rc = 1, rc2 = 140125257337920}, {pvUser = 0x7f71709f9cc0, pIoCB = 0x7f71b64473a0 <pdmacFileAioMgrNormalProcessTaskList(PPDMACTASKFILE, PPDMACEPFILEMGR, PPDMASYNCCOMPLETIONENDPOINTFILE)+1088>, rc = 2048, rc2 = 140125257337920}, {pvUser = 0x7f71709f9e40, pIoCB = 0x0, rc = 140124515993792, rc2 = 140126857340673}, {pvUser = 0x7f71709f9be8, pIoCB = 0x4000, rc = 140125197540336, rc2 = 140125197540324}, {pvUser = 0x80, pIoCB = 0x7f71480008c0, rc = 140124515993792, rc2 = 0}, {pvUser = 0x7f7174300d40, pIoCB = 0x0, rc = 140126857331153, rc2 = 140125197540400}, {pvUser = 0x0, pIoCB = 0x0, rc = 152, rc2 = 140125257338128}, {pvUser = 0x7f7174300d40, pIoCB = 0x7f7100000000, rc = 74, rc2 = 258594646230275584}, {pvUser = 0x4000206e0, pIoCB = 0x7f7174300d10, rc = 140125197540592, rc2 = 140125257339264}, {pvUser = 0x7f7174300d40, pIoCB = 0x7f71709f9ca8, rc = 140125197540518, rc2 = 0}, {pvUser = 0x7f7174300c40, pIoCB = 0x0, rc = 140125197540592, rc2 = 1}}
        cRequestsToWait = 1
        cDone = <optimized out>
        pCtxInt = 0x7f71742fd160
        pTimeout = 0x0
        Timeout = {tv_sec = 0, tv_nsec = 0}
        StartNanoTS = 0
        rc = 0
        cRequestsCompleted = 0
#3  0x00007f71b6448240 in pdmacFileAioMgrNormal (hThreadSelf=<optimized out>, pvUser=0x7f7174300c40) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/VMM/VMMR3/PDMAsyncCompletionFileNormal.cpp:1664
        apReqs = {0x7f71480008c0, 0xffffffff, 0x7f71709f9e00, 0x1ba1250, 0x7f71709fa700, 0x7f7174301180, 0x7f71709f9e00, 0x7f71d304ae3c <RTSemRWRequestWriteNoResume(RTSEMRW, RTMSINTERVAL)+156>, 0x300000000, 0x396b675ed5f9600, 0x7f71709fa700, 0x1ba1250, 0x7f71709f9e00, 0x7f71d304b1ce <RTSemRWReleaseWrite(RTSEMRW)+142>, 0x74301a60, 0xffffffffffffffff, 0x7f71a4909348, 0x7f7174301180, 0x7f71709f9e40, 0x7f71d2fb349f <rtThreadInsert(PRTTHREADINT, RTNATIVETHREAD)+223>}
        cReqsCompleted = 0
        cReqsWait = <optimized out>
        uMillisCurr = <optimized out>
        rc = <optimized out>
        pAioMgr = 0x7f7174300c40
        uMillisEnd = 14936863
        __PRETTY_FUNCTION__ = "int pdmacFileAioMgrNormal(RTTHREAD, void*)"
#4  0x00007f71d2fb40f0 in rtThreadMain (pThread=0x7f7174301180, NativeThread=<optimized out>, pszThreadName=<optimized out>) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/Runtime/common/misc/thread.cpp:717
        rc = <optimized out>
#5  0x00007f71d304b94b in rtThreadNativeMain (pvArgs=0x7f7174301180) at /home/xftroxgpx/build/1packages/virtualbox/makepkg_pacman/virtualbox/src/VirtualBox-5.1.20/src/VBox/Runtime/r3/posix/thread-posix.cpp:327
        pThread = 0x7f7174301180
        Self = 140125197543168
        rc = <optimized out>
        __PRETTY_FUNCTION__ = "void* rtThreadNativeMain(void*)"
#6  0x00007f71d38dc2e7 in start_thread () from /usr/lib/libpthread.so.0
No symbol table info available.
#7  0x00007f71d341954f in clone () from /usr/lib/libc.so.6
No symbol table info available.
(gdb) 

Not sure how helpful this is...

Last edited 7 years ago by deletedgmailduetoJamesDamore_firing (previous) (diff)

comment:11 by deletedgmailduetoJamesDamore_firing, 7 years ago

So far, found out it's stuck in this 'while':

00:00:03.378654 inside pdmacFileAioMgrNormal
// ^ this only happens once
...<snip>...
00:00:55.731723 before while
00:00:55.731731 Waiting for 1 of 1 tasks to complete
00:00:55.732040 1 tasks completed
// ^ those 2 lines repeat
...<snip>...
00:01:01.936480 Waiting for 1 of 1 tasks to complete
00:01:01.936723 1 tasks completed
00:01:01.936728 32768 messages from group PDM_ASYNC_COMPLETION (#339), muting it.

The .patch to show the above output in VBox.log will be added next.

VirtualBox was started like this:

export VBOX_LOG="-all+dev_ide.e.l.f+rt_file.e.l.f+pdm_async_completion.e.l.f"
export VBOX_RELEASE_LOG="-all+dev_ide.e.l.f+rt_file.e.l.f+pdm_async_completion.e.l.f"
/usr/bin/VirtualBox

Frankly, I don't know how to show those LogFlow() so I replaced them with LogRel() :D

More to be found, stay tuned xD

by deletedgmailduetoJamesDamore_firing, 7 years ago

Attachment: le.patch added

The .patch to show the output from comment#11 in VBox.log

comment:12 by deletedgmailduetoJamesDamore_firing, 7 years ago

So far, the hang happens because 'Restarting incomplete transfer' which in turn happens because 12288 or '4096 bytes transferred)' is less than 'pTask->DataSeg.cbSeg=16384'. Why is cbSeg 16384? no idea, yet.

00:01:11.964411 before while 1
00:01:11.964421 0)reqsactive=1
00:01:11.964426 Waiting for 1 of 1 tasks to complete
00:01:11.964744 1 tasks completed
00:01:11.964791 1)reqsactive=1
00:01:11.964797 1_1)cbTransfered=4096 rcReq=0 hReq=00007f17ec0008c0
00:01:11.964805 pAioMgr=0x007f181c413a40 hReq=0x007f17ec0008c0 failure=0
00:01:11.964818 Task=0x007f17e80009b0 completed with VINF_SUCCESS
00:01:11.964884 2)reqsactive=0
00:01:11.964889 3)reqsactive=0
00:01:11.964894 4)reqsactive=0
00:01:11.964900 5)reqsactive=0
00:01:11.964905 after while
00:01:11.966153 before while 1
00:01:11.966164 0)reqsactive=1
00:01:11.966169 Waiting for 1 of 1 tasks to complete
00:01:11.966640 1 tasks completed
00:01:11.966674 1)reqsactive=1
00:01:11.966695 1_1)cbTransfered=12288 rcReq=0 hReq=00007f17ec0008c0
00:01:11.966716 pAioMgr=0x007f181c413a40 hReq=0x007f17ec0008c0 failure=0
00:01:11.966726 Restarting incomplete transfer 0x007f17e80009b0 (12288 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:11.966770 2)reqsactive=1
00:01:11.966776 3)reqsactive=1
00:01:11.966781 4)reqsactive=1
00:01:11.966787 5)reqsactive=1
00:01:11.966792 0)reqsactive=1
00:01:11.966796 Waiting for 1 of 1 tasks to complete
00:01:11.966803 1 tasks completed
00:01:11.966807 1)reqsactive=1
00:01:11.966812 1_1)cbTransfered=4096 rcReq=0 hReq=00007f17ec0008c0
00:01:11.966819 pAioMgr=0x007f181c413a40 hReq=0x007f17ec0008c0 failure=0
00:01:11.966825 Restarting incomplete transfer 0x007f17e80009b0 (4096 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:11.966886 2)reqsactive=1
00:01:11.966897 3)reqsactive=1
00:01:11.966903 4)reqsactive=1
00:01:11.966907 5)reqsactive=1
00:01:11.966912 0)reqsactive=1
00:01:11.966917 Waiting for 1 of 1 tasks to complete
00:01:11.967295 1 tasks completed
00:01:11.967316 1)reqsactive=1
00:01:11.967333 1_1)cbTransfered=8192 rcReq=0 hReq=00007f17ec0008c0
00:01:11.967355 pAioMgr=0x007f181c413a40 hReq=0x007f17ec0008c0 failure=0
00:01:11.967380 Restarting incomplete transfer 0x007f17e80009b0 (8192 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:11.967457 2)reqsactive=1
00:01:11.967562 3)reqsactive=1
00:01:11.967567 4)reqsactive=1
00:01:11.967572 5)reqsactive=1
00:01:11.967576 0)reqsactive=1
00:01:11.967581 Waiting for 1 of 1 tasks to complete
00:01:11.967708 1 tasks completed

comment:13 by deletedgmailduetoJamesDamore_firing, 7 years ago

All I know is that pTask->DataSeg.cbSeg=16384 gets set to 16384 inside drvvdAsyncIOReadSync of file src/VBox/Devices/Storage/DrvVD.cpp on the line

DataSeg.cbSeg = cbRead;

But I've no idea how to dump a stacktrace(ok, I tried but stdout/stderr won't show anywhere), since setting a breakpoint with gdb causes the thread to terminate via

Program terminated with signal SIGTRAP, Trace/breakpoint trap.

So, I've no idea why it gets set to 16384, why cbRead would be that, but maybe I'm looking at it wrongly, and it's the cbTransfered=12288 (and later 4096) that should be 16384 ?

Last edited 7 years ago by deletedgmailduetoJamesDamore_firing (previous) (diff)

comment:14 by deletedgmailduetoJamesDamore_firing, 7 years ago

Ok, I think I get it: it tries to read 16384 bytes(aka pTask->DataSeg.cbSeg), but only reads 12288(aka cbTransfered) and then schedules the remaining(4096 bytes) to be read next, but without changing that pTask->DataSeg.cbSeg of 16384 to 4096 ... which is why it keeps restarting, and next it tries to read and gets only 8192 then 4096 and these two keep repeating...

00:01:12.547260 before while 1
00:01:12.547324 0)reqsactive=1
00:01:12.547329 Waiting for 1 of 1 tasks to complete
00:01:12.547768 1 tasks completed
00:01:12.547825 1)reqsactive=1
00:01:12.547831 1_1)cbTransfered=4096 rcReq=0 hReq=00007f93140008c0
00:01:12.547839 pAioMgr=0x007f9344413a40 hReq=0x007f93140008c0 failure=0 pTask->DataSeg.cbSeg=4096
00:01:12.547852 Task=0x007f93100009b0 completed with VINF_SUCCESS
00:01:12.547904 2)reqsactive=0
00:01:12.547909 3)reqsactive=0
00:01:12.547913 4)reqsactive=0
00:01:12.547919 5)reqsactive=0
00:01:12.547923 after while
00:01:12.548726 !!!!3! DataSeg.cbSeg = 16384
00:01:12.549003 before while 1
00:01:12.549014 0)reqsactive=1
00:01:12.549019 Waiting for 1 of 1 tasks to complete
00:01:12.549599 1 tasks completed
00:01:12.549650 1)reqsactive=1
00:01:12.549656 1_1)cbTransfered=12288 rcReq=0 hReq=00007f93140008c0
00:01:12.549664 pAioMgr=0x007f9344413a40 hReq=0x007f93140008c0 failure=0 pTask->DataSeg.cbSeg=16384
00:01:12.549672 Restarting incomplete transfer 0x007f93100009b0 (12288 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:12.549680 Here2
00:01:12.549684 Here3
00:01:12.549721 2)reqsactive=1
00:01:12.549726 3)reqsactive=1
00:01:12.549731 4)reqsactive=1
00:01:12.549736 5)reqsactive=1
00:01:12.549741 0)reqsactive=1
00:01:12.549745 Waiting for 1 of 1 tasks to complete
00:01:12.549751 1 tasks completed
00:01:12.549756 1)reqsactive=1
00:01:12.549760 1_1)cbTransfered=4096 rcReq=0 hReq=00007f93140008c0
00:01:12.549766 pAioMgr=0x007f9344413a40 hReq=0x007f93140008c0 failure=0 pTask->DataSeg.cbSeg=16384
00:01:12.549773 Restarting incomplete transfer 0x007f93100009b0 (4096 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:12.549779 Here2
00:01:12.549783 Here3
00:01:12.549849 2)reqsactive=1
00:01:12.549856 3)reqsactive=1
00:01:12.549860 4)reqsactive=1
00:01:12.549865 5)reqsactive=1
00:01:12.549869 0)reqsactive=1
00:01:12.549873 Waiting for 1 of 1 tasks to complete
00:01:12.550457 1 tasks completed
00:01:12.550525 1)reqsactive=1
00:01:12.550531 1_1)cbTransfered=8192 rcReq=0 hReq=00007f93140008c0
00:01:12.550539 pAioMgr=0x007f9344413a40 hReq=0x007f93140008c0 failure=0 pTask->DataSeg.cbSeg=16384
00:01:12.550547 Restarting incomplete transfer 0x007f93100009b0 (8192 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:12.550554 Here2
00:01:12.550559 Here3
00:01:12.550646 2)reqsactive=1
00:01:12.550653 3)reqsactive=1
00:01:12.550658 4)reqsactive=1
00:01:12.550663 5)reqsactive=1
00:01:12.550668 0)reqsactive=1
00:01:12.550672 Waiting for 1 of 1 tasks to complete
00:01:12.551186 1 tasks completed
00:01:12.551238 1)reqsactive=1
00:01:12.551244 1_1)cbTransfered=4096 rcReq=0 hReq=00007f93140008c0
00:01:12.551252 pAioMgr=0x007f9344413a40 hReq=0x007f93140008c0 failure=0 pTask->DataSeg.cbSeg=16384
00:01:12.551260 Restarting incomplete transfer 0x007f93100009b0 (4096 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:12.551267 Here2
00:01:12.551271 Here3
00:01:12.551470 2)reqsactive=1
00:01:12.551477 3)reqsactive=1
00:01:12.551482 4)reqsactive=1
00:01:12.551487 5)reqsactive=1
00:01:12.551492 0)reqsactive=1
00:01:12.551496 Waiting for 1 of 1 tasks to complete
00:01:12.551918 1 tasks completed
00:01:12.551965 1)reqsactive=1
00:01:12.551971 1_1)cbTransfered=8192 rcReq=0 hReq=00007f93140008c0
00:01:12.551979 pAioMgr=0x007f9344413a40 hReq=0x007f93140008c0 failure=0 pTask->DataSeg.cbSeg=16384
00:01:12.551987 Restarting incomplete transfer 0x007f93100009b0 (8192 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:12.551994 Here2
00:01:12.551998 Here3
00:01:12.552084 2)reqsactive=1
00:01:12.552090 3)reqsactive=1
00:01:12.552095 4)reqsactive=1
00:01:12.552100 5)reqsactive=1
00:01:12.552105 0)reqsactive=1
00:01:12.552109 Waiting for 1 of 1 tasks to complete
00:01:12.552689 1 tasks completed
00:01:12.552737 1)reqsactive=1
00:01:12.552743 1_1)cbTransfered=4096 rcReq=0 hReq=00007f93140008c0
00:01:12.552751 pAioMgr=0x007f9344413a40 hReq=0x007f93140008c0 failure=0 pTask->DataSeg.cbSeg=16384
00:01:12.552759 Restarting incomplete transfer 0x007f93100009b0 (4096 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:12.552781 Here2
00:01:12.552787 Here3
00:01:12.552875 2)reqsactive=1
00:01:12.552882 3)reqsactive=1
00:01:12.552887 4)reqsactive=1
00:01:12.552893 5)reqsactive=1
00:01:12.552897 0)reqsactive=1
00:01:12.552902 Waiting for 1 of 1 tasks to complete
00:01:12.553198 1 tasks completed
00:01:12.553233 1)reqsactive=1
00:01:12.553252 1_1)cbTransfered=8192 rcReq=0 hReq=00007f93140008c0
00:01:12.553348 pAioMgr=0x007f9344413a40 hReq=0x007f93140008c0 failure=0 pTask->DataSeg.cbSeg=16384
00:01:12.553357 Restarting incomplete transfer 0x007f93100009b0 (8192 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:12.553364 Here2
00:01:12.553369 Here3
00:01:12.553457 2)reqsactive=1
00:01:12.553465 3)reqsactive=1
00:01:12.553470 4)reqsactive=1
00:01:12.553475 5)reqsactive=1
00:01:12.553480 0)reqsactive=1
00:01:12.553484 Waiting for 1 of 1 tasks to complete
00:01:12.553769 1 tasks completed
00:01:12.553774 1)reqsactive=1
00:01:12.553779 1_1)cbTransfered=4096 rcReq=0 hReq=00007f93140008c0
00:01:12.553785 pAioMgr=0x007f9344413a40 hReq=0x007f93140008c0 failure=0 pTask->DataSeg.cbSeg=16384
00:01:12.553792 Restarting incomplete transfer 0x007f93100009b0 (4096 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:01:12.553798 Here2
00:01:12.553802 Here3
00:01:12.553827 2)reqsactive=1
00:01:12.553833 3)reqsactive=1
00:01:12.553837 4)reqsactive=1
00:01:12.553842 5)reqsactive=1
00:01:12.553846 0)reqsactive=1
00:01:12.553850 Waiting for 1 of 1 tasks to complete
00:01:12.554133 1 tasks completed

comment:15 by deletedgmailduetoJamesDamore_firing, 7 years ago

Alright well, I can make AioMgr not hang anymore(patch included next), but the transfer still fails the same inside guest OS.

The question is, why does the 16KiB read fail by reading only 12K and then the next 4K gets scheduled to be read, and well, what happens after that? is that properly merged together?(I don't even know what I'm doing here, heh).

Another thing is: why does it say 'Restarting transfer' if it continues it instead ?

00:00:49.794916 before while 1
00:00:49.794927 0)reqsactive=1
00:00:49.794932 Waiting for 1 of 1 tasks to complete
00:00:49.795475 1 tasks completed
00:00:49.795532 1)reqsactive=1
00:00:49.795538 1_1)cbTransfered=4096 rcReq=0 hReq=00007fb5540008c0
00:00:49.795546 pAioMgr=0x007fb580413a40 hReq=0x007fb5540008c0 failure=0 pTask->DataSeg.cbSeg=4096
00:00:49.795575 Task=0x007fb5500009b0 completed with VINF_SUCCESS
00:00:49.795673 2)reqsactive=0
00:00:49.795722 3)reqsactive=0
00:00:49.795736 4)reqsactive=0
00:00:49.795815 5)reqsactive=0
00:00:49.795847 after while
00:00:49.800052 !!!!3! DataSeg.cbSeg = 16384
00:00:49.800511 before while 1
00:00:49.800522 0)reqsactive=1
00:00:49.800527 Waiting for 1 of 1 tasks to complete
00:00:49.801160 1 tasks completed
00:00:49.801207 1)reqsactive=1
00:00:49.801215 1_1)cbTransfered=12288 rcReq=0 hReq=00007fb5540008c0
00:00:49.801225 pAioMgr=0x007fb580413a40 hReq=0x007fb5540008c0 failure=0 pTask->DataSeg.cbSeg=16384
00:00:49.801235 Restarting incomplete transfer 0x007fb5500009b0 (12288 bytes transferred) pTask->DataSeg.cbSeg=16384 pTask->cbBounceBuffer=0
00:00:49.801242 Here2
00:00:49.801247 Here3
00:00:49.801292 2)reqsactive=1
00:00:49.801298 3)reqsactive=1
00:00:49.801303 4)reqsactive=1
00:00:49.801309 5)reqsactive=1
00:00:49.801314 0)reqsactive=1
00:00:49.801320 Waiting for 1 of 1 tasks to complete
00:00:49.801328 1 tasks completed
00:00:49.801332 1)reqsactive=1
00:00:49.801337 1_1)cbTransfered=4096 rcReq=0 hReq=00007fb5540008c0
00:00:49.801343 pAioMgr=0x007fb580413a40 hReq=0x007fb5540008c0 failure=0 pTask->DataSeg.cbSeg=4096
00:00:49.801356 Task=0x007fb5500009b0 completed with VINF_SUCCESS
00:00:49.801385 2)reqsactive=0
00:00:49.801390 3)reqsactive=0
00:00:49.801394 4)reqsactive=0
00:00:49.801399 5)reqsactive=0
00:00:49.801404 after while
00:00:53.180166 !!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 49152!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 49152!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 12288!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 12288!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 69632!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 12288!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 16384!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 16384!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 32768!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 24576!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 24576!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 16384!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 12288PIIX3 ATA: Ctl#1: RESET, DevSel=0 AIOIf=0 CmdIf0=0xa0 (41170068 usec ago) CmdIf1=0x00 (-1 usec ago)
00:01:32.142770 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:33.276371 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:34.409715 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:35.542929 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:36.676460 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:37.809827 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:38.942610 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:40.075851 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:40.986261 PIIX3 ATA: Ctl#1: RESET, DevSel=0 AIOIf=0 CmdIf0=0xa0 (51189250 usec ago) CmdIf1=0x00 (-1 usec ago)
00:01:41.255947 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:42.389855 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:43.522580 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:44.656404 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:45.789990 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:46.922614 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:48.055937 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:49.189393 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:50.323394 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:51.006639 PIIX3 ATA: Ctl#1: RESET, DevSel=0 AIOIf=0 CmdIf0=0xa0 (61209628 usec ago) CmdIf1=0x00 (-1 usec ago)
00:01:51.503214 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:52.636391 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:53.769538 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:54.902948 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:56.036080 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:57.169764 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:58.303267 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:59.436424 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:01:59.737028 !!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 4096!!!!5! pAhciReq->u.Io.DataSeg.cbSeg = 8192PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:01.703318 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:02.836052 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:03.969964 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:05.103007 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:06.236351 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:07.369245 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:08.502770 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:09.635958 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:10.769690 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:11.903395 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:13.036394 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:14.169681 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:15.302546 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting
00:02:16.436566 PIIX3 ATA LUN#2: Async I/O thread probably stuck in operation, interrupting

# dd if=/dev/sr0 of=/dev/null bs=512 skip=88039 count=2
dd: error reading '/dev/sr0': Input/output error
1+0 records in
1+0 records out
512 bytes copied, 101.355 s, 0.0 kB/s

by deletedgmailduetoJamesDamore_firing, 7 years ago

Attachment: le.2.patch added

does not hang AioMgr1-N anymore(see line with comment 'fix?')

comment:16 by deletedgmailduetoJamesDamore_firing, 7 years ago

I'll leave it to the experts, they can probably figure this all out in 20 minutes or less :)

Cheers!

comment:17 by aeichner, 7 years ago

Thanks a lot for the very detailed report and patch you provided (doesn't happen very often to get such a detailed description from a reporter)! Your patch points in the right direction but is not complete as it covers only one case so far and doesn't take into account when a request might be restarted multiple times to complete the transfer. I attached a patch which should cover all cases, would be great if you could try it locally. We can then incorporate it into the next maintenance release. If you can't test the patch we can also provide you with a test build.

comment:18 by deletedgmailduetoJamesDamore_firing, 7 years ago

Hey, thanks so much @aeichner! Patch works!

I have tested the patch against two VMs(with vdi`s on my SSD) with the steps from OP and with dd-ing the entire devices(sr0 and sda) and everything is working as expected. No more hangs or errors. All good here!

Really glad this is no longer an issue. Much appreciated, @aeichner! All the best to you! Cheers.

PS: I might remove the hosted VM file downloads from OP at some point, since they are no longer needed.

Hooray! #mehappy :))

by deletedgmailduetoJamesDamore_firing, 7 years ago

one VM's log after the perfection patch

by deletedgmailduetoJamesDamore_firing, 7 years ago

another VM's log after the perfection patch

comment:19 by Frank Mehnert, 7 years ago

Resolution: fixed
Status: newclosed

Fix is part of the 5.1.22 release. Thanks again for the detailed analysis!

comment:20 by deletedgmailduetoJamesDamore_firing, 7 years ago

Thanks frank! also for prettifying my post(s). Cheers my friends!

Note: See TracTickets for help on using tickets.

© 2024 Oracle Support Privacy / Do Not Sell My Info Terms of Use Trademark Policy Automated Access Etiquette