Re: [RFC PATCH v2 1/8] block: introduce BLK_FEAT_WRITE_ZEROES_UNMAP to queue limits features
From: Zhang Yi
Date: Fri Feb 07 2025 - 07:33:58 EST
On 2025/2/7 20:22, Zhang Yi wrote:
> On 2025/1/29 0:46, John Garry wrote:
>> On 15/01/2025 11:46, Zhang Yi wrote:
>>> From: Zhang Yi <yi.zhang@xxxxxxxxxx>
>>>
>>> Currently, it's hard to know whether the storage device supports unmap
>>> write zeroes. We cannot determine it only by checking if the disk
>>> supports the write zeroes command, as for some HDDs that do submit
>>> actual zeros to the disk media even if they claim to support the write
>>> zeroes command, but that should be very slow.
>>
>> This second sentence is too long, such that your meaning is hard to understand.
>>
>>>
>>> Therefor, add a new queue limit feature, BLK_FEAT_WRITE_ZEROES_UNMAP and
>>
>> Therefore?
>>
>>> the corresponding sysfs entry, to indicate whether the block device
>>> explicitly supports the unmapped write zeroes command. Each device
>>> driver should set this bit if it is certain that the attached disk
>>> supports this command.
>>
>> How can they be certain? You already wrote that some claim to support it, yet don't really. Well, I think that is what you meant.
>>
>
> Hi, John. Thanks for your reply!
>
> Sorry for the late reply and for not making this clear enough earlier.
> Currently, there are four situations regarding support for the write
> zeroes command (aka REQ_OP_WRITE_ZEROES) across various disks and backend
> storage devices.
>
> A. Devices that do not support the write zeroes command
> These devices have bdev_limits(bdev)->max_write_zeroes_sectors set to
> zero.
> B. Devices that support the write zeroes command
> These devices have bdev_limits(bdev)->max_write_zeroes_sectors set to a
> non-zero value. They can be further categorized into three
> sub-situations:
> B.1. Devices that write physical zeroes to the media
> These devices perform the write zeroes operation by physically writing
> zeroes to the storage media, which can be very slow (e.g., HDDs).
> B.2. Devices that support unmap write zeroes
> These devices can offload the write zeroes operation by unmapping the
> logical blocks, effectively putting them into a deallocated state
> (e.g., SSDs). This operation is typically very fast, allowing
> filesystems to use this command to quickly create zeroed files. NVMe
> and SCSI disk drivers already support this and can query the attached
> disks to determine whether they support unmap write zeroes (please see
> patches 2 and 3 for details).
> B.3. Devices whose write zeroes implementation is unknown
> This category includes non-standard disks and some network storage
> devices where the exact implementation of the write zeroes command is
> unclear.
>
> Currently, users can only distinguish A from B by querying
>
> /sys/block/<disk>/queue/write_zeroes_unmap
^^^^^^^^^^^^^^^^^^
Oh, sorry, it should be 'write_zeroes_max_bytes'
/sys/block/<disk>/queue/write_zeroes_max_bytes
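
For illustration, here is a minimal userspace sketch of that A-versus-B check.
It assumes the attribute reads as 0 when the device has no write zeroes
support, matching max_write_zeroes_sectors being zero in situation A. The
function names and the sysfs_root parameter are hypothetical and are not part
of this series; sysfs_root is only there so the sketch can be pointed at a
test directory.

```python
from pathlib import Path


def write_zeroes_max_bytes(disk: str, sysfs_root: str = "/sys/block") -> int:
    """Read queue/write_zeroes_max_bytes for the given disk (e.g. "sda")."""
    path = Path(sysfs_root) / disk / "queue" / "write_zeroes_max_bytes"
    return int(path.read_text().strip())


def supports_write_zeroes(disk: str, sysfs_root: str = "/sys/block") -> bool:
    """Return True for situation B, False for situation A.

    A non-zero limit only says that the write zeroes command is accepted.
    It cannot tell B.1, B.2 and B.3 apart.
    """
    return write_zeroes_max_bytes(disk, sysfs_root) > 0
```

Calling supports_write_zeroes("sda") on a real system only answers whether the
disk falls into A or B, which is exactly the gap the new feature bit is meant
to close.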
Thanks,
Yi.