#16795 new defect

TRIM on SSD not stable with Linux, Windows 10, FreeBSD

Reported by:	linuxguru	Owned by:
Component:	other	Version:	VirtualBox 5.1.22
Keywords:	TRIM SSD	Cc:
Guest type:	all	Host type:	Windows

Description (last modified by Frank Mehnert)

TRIM sometimes times out on Linux and on FreeBSD; Windows 10 doesn't boot at all

See below and attachments for details:

Linux Fedora 25 latest updates

Attach an image:

VBoxManage storageattach "Fedora 25" --storagectl "SATA" --port 0 --device 0 --nonrotational on --discard on --medium "Fedora 25.vdi" --type hdd

Check that TRIM is supported

hdparm -I /dev/sda | grep TRIM
           *    Data Set Management TRIM supported (limit unknown)

Enable it in the mount options and reboot afterwards

/dev/mapper/fedora-root /                       ext4    defaults,discard        1 1

run fstrim /

fstrim sometimes hangs with timeouts

FreeBSD 11 latest updates

Attach the disk as under Linux

Check for TRIM support

camcontrol identify /dev/ada0 | grep -i trim
Data Set Management (DSM/TRIM) yes

Enable TRIM support on filesystem

mount | grep -i ufs
/dev/ada0p2 on / (ufs, local, journaled soft-updates)

# Boot into single user mode
tunefs -t enable /dev/ada0p2
tunefs: issue TRIM to the disk set

bash -c 'tunefs -p /dev/ada0p2 2>&1 | grep -i trim'
tunefs: trim: (-t)                                         enabled

Attachments (3)

VirtualBox_Fedora 25_28_05_2017_08_44_23.png (13.1 KB ) - added by linuxguru 8 years ago.: Fedora 25 TRIM problem
VirtualBox_FreeBSD - UFS_27_05_2017_22_08_25.png (11.9 KB ) - added by linuxguru 8 years ago.: FreeBSD UFS Trim problem
VirtualBox_FreeBSD_27_05_2017_22_20_32.png (10.5 KB ) - added by linuxguru 8 years ago.: FreeBSD ZFS Trim problem


Change History (11)

by linuxguru, 8 years ago

Attachment:	VirtualBox_Fedora 25_28_05_2017_08_44_23.png added

Fedora 25 TRIM problem

by linuxguru, 8 years ago

Attachment:	VirtualBox_FreeBSD - UFS_27_05_2017_22_08_25.png added

FreeBSD UFS Trim problem

by linuxguru, 8 years ago

Attachment:	VirtualBox_FreeBSD_27_05_2017_22_20_32.png added

FreeBSD ZFS Trim problem

comment:1 by Frank Mehnert, 8 years ago

Description:	modified (diff)
priority:	blocker → major

comment:2 by linuxguru, 8 years ago

Any update on this issue? It is important for thin provisioning.

comment:3 by deAtog, 8 years ago

I believe this is related to #16450. I've reviewed the source code for the discard option and it appears to have several issues. The current implementation handles discards as follows:

If the TRIM'd block clears a partial VDI data block, the area is filled with 0's

If the TRIM'd block clears an entire VDI data block:
1. The TRIM'd block in the VDI block header is marked as unallocated.
2. The last block in the VDI is read into memory and written to the TRIM'd block location.
3. The block pointing to the last block in the VDI file is updated with the new location.
4. The VDI is truncated to remove the last allocated block.

As you can see, in the second case above, a lot of IO happens whenever the guest OS TRIMs an entire VDI data block. None of the steps in that case takes any precaution to ensure that they occur sequentially and uninterrupted. Any interruption results in an IO error reported to the guest OS, which may subsequently retry the operation. This further exacerbates the problem and is what I believe causes the issues seen here. It is my opinion that the implementation of the discard option needs to be completely rewritten.
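To make the IO cost of the whole-block case concrete, here is a minimal Python sketch of the four steps as described in this comment. It is a toy in-memory model, not code from the VirtualBox source: the names `discard_whole_block`, `block_map`, `data` and `UNALLOCATED` are hypothetical, and skipping the copy when the freed slot is already the last one is an assumption.

```python
UNALLOCATED = -1  # hypothetical marker for an unmapped block-map entry


def discard_whole_block(block_map, data, logical):
    """Apply the four described steps to a toy image and return the IO steps performed.

    block_map maps a logical block number to a physical slot in `data`.
    """
    ops = []
    freed = block_map[logical]

    # Step 1: mark the TRIM'd block as unallocated in the block header.
    block_map[logical] = UNALLOCATED
    ops.append("write map entry (unallocate)")

    last = len(data) - 1
    if freed != last:  # assumption: nothing to move if the freed slot is the last one
        # Step 2: read the last block and write it into the freed slot.
        data[freed] = data[last]
        ops.append("read last data block")
        ops.append("write data block into freed slot")

        # Step 3: repoint the map entry that referenced the last block.
        owner = block_map.index(last)
        block_map[owner] = freed
        ops.append("write map entry (relocate)")

    # Step 4: truncate the file by one block.
    data.pop()
    ops.append("truncate file")
    return ops


block_map = [0, 1, 2]
data = ["a", "b", "c"]
for step in discard_whole_block(block_map, data, 0):
    print(step)
```

In this model a single whole-block TRIM turns into five dependent steps, and a failure between any two of them leaves the map and the data out of sync.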

I would attempt such a rewrite, but other projects are currently occupying my time. For any developers looking at this ticket, I would do the following to resolve this and improve the functionality of this option.

When a VDI is opened and this option is enabled do the following:
1. Create a list of all VDI data blocks that have been allocated (based on block size and file size).
2. Iterate over the VDI block header and remove all blocks from the list which are in use.
3. Convert the remaining list to a min heap, called the free-data heap.
- Note: in a 100% used VDI, this will result in an empty heap.

When a TRIM command is received:
1. If a partial VDI block is TRIM'd:
  1. Don't do anything. There's no requirement that free space must contain 0's.
2. If an entire VDI block is TRIM'd:
  1. Mark the block as unallocated in the VDI block header.
  2. Add the location to the free-data heap.

When a new block needs to be allocated:
1. If the free-data heap is NOT empty:
  1. Remove the minimum free data location from the free-data heap.
  2. Assign and update the location to the block being allocated.
2. If the free-data heap IS empty:
  1. Enlarge the VDI by the VDI block size.
  2. Assign the location of the new space to the block being allocated.

When the VDI is closed:
1. Iterate over any remaining free locations in the free-data heap.
2. Move the data from the last data blocks in the VDI to the available free-data locations.
3. Update the VDI block header with the new data locations as data blocks are moved.
4. Truncate the VDI by the number of data blocks moved.

If the above is implemented, discards are reduced to a quick update of the VDI block header. Every freed but still allocated data block also makes future block allocations much simpler. The consistency of the VDI is maintained by the fact that the free-data heap is rebuilt when the VDI is opened, ensuring that any allocated but free blocks are correctly added to the heap. In an ideal world, the initial free-data heap would always be empty.
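The proposal above can be sketched in Python roughly as follows. This is a toy in-memory model for illustration only, not VirtualBox code: the class `SparseImage`, its fields and the `UNALLOCATED` marker are all hypothetical, and real code would operate on the VDI file and its on-disk block header.

```python
import heapq

UNALLOCATED = -1  # hypothetical marker for an unmapped block-map entry


class SparseImage:
    """Toy model of a VDI-like image with a free-data heap."""

    def __init__(self, block_map, data):
        self.block_map = list(block_map)  # logical block -> physical slot
        self.data = list(data)            # physical slots, in file order
        # On open: every physical slot not referenced by the map is free.
        used = {s for s in self.block_map if s != UNALLOCATED}
        self.free = [s for s in range(len(self.data)) if s not in used]
        heapq.heapify(self.free)

    def trim(self, logical):
        """Whole-block TRIM: one map update and one heap push, no data IO."""
        slot = self.block_map[logical]
        if slot != UNALLOCATED:
            self.block_map[logical] = UNALLOCATED
            heapq.heappush(self.free, slot)

    def allocate(self, logical):
        """Reuse the lowest free slot, or grow the image by one block."""
        if self.free:
            slot = heapq.heappop(self.free)
        else:
            self.data.append(None)
            slot = len(self.data) - 1
        self.block_map[logical] = slot
        return slot

    def close(self):
        """Compact on close: fill free slots from the tail, then truncate."""
        owner = {s: lb for lb, s in enumerate(self.block_map) if s != UNALLOCATED}
        while self.free:
            target = heapq.heappop(self.free)
            # Trailing free slots need no copy, only truncation.
            while self.data and (len(self.data) - 1) not in owner:
                self.data.pop()
            if target >= len(self.data):
                continue  # this free slot was already truncated away
            last = len(self.data) - 1
            self.data[target] = self.data[last]  # move the tail block down
            lb = owner.pop(last)
            self.block_map[lb] = target
            owner[target] = lb
            self.data.pop()


img = SparseImage([0, 1, 2, 3], ["a", "b", "c", "d"])
img.trim(1)
img.trim(2)
img.allocate(1)  # reuses the lowest free slot instead of growing the image
img.close()      # moves the tail block into the remaining hole and truncates
print(img.block_map, img.data)
```

The data movement that the current implementation performs on every whole-block TRIM is deferred here to `close()`, so a TRIM at runtime touches only the map and the heap.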

comment:4 by SixEcho, 7 years ago

Agree that discard/TRIM needs serious attention and would be a really useful feature (see SATA 3.1 Queued Trim).

Testing on a Win8 VDI with a lot of non-trimmed space: running disk optimize causes a lot of TRIMs to be issued, which seems to overwhelm discard, making the disk unresponsive to the guest, which eventually crashes/resets.

00:01:29.409587 AHCI#0: Port 0 reset
00:01:30.545526 VD#0: Discard request was active for 31 seconds
00:01:30.545569 VD#0: Cancelling all active requests
00:02:00.550698 AHCI#0: Port 0 reset
00:02:03.724005 GIM: HyperV: Guest indicates a fatal condition! P0=0x7a P1=0xc68e28 P2=0xc000000e P3=0x19b8e880 P4=0x8d1c5670
00:02:06.691342 VMMDev: vmmDevHeartbeatFlatlinedTimer: Guest seems to be unresponsive. Last heartbeat received 4 seconds ago
00:02:07.302353 GIM: HyperV: Reset initiated through MSR
00:02:07.302423 Changing the VM state from 'RUNNING' to 'RESETTING'

comment:5 by oddsocks, 6 years ago

FWIW this still seems to occur with VB 6.0.0

comment:6 by facboy, 5 years ago

As a point of reference, this is still occurring on VB 6.1.4, Windows 10 host, CentOS 8 guest.

comment:7 by fth0, 4 years ago

FWIW, I've been using the TRIM/discard functionality successfully for over a year now with VirtualBox 6.1.x versions. The key to prevent the lockups was to enable the Host I/O Cache for the SATA Controller.

For example, I'm using it together with the monthly Windows updates, where TRIMming reduces the VDI file from (for example) 20 GB to 16 GB without lockups.

comment:8 by beer, 9 months ago

This problem is compounded by the fact that Windows guests do not recognise the disk as an SSD unless the discard=on option is supplied manually through VBoxManage, despite the option being checked in the GUI.

Would it be possible to capitalise on fth0's experience by:

Propagating the SSD options from the GUI to attached storage devices?
Maybe force activating the Host I/O Cache option on the controller when at least one of its storage devices is SSD? Or move the SSD option on the controller, allowing either none or all the storage devices on a controller to be SSD?

