Re: [PATCH] Do not silently discard WRITE_SAME requests

From: Petr Vandrovec
Date: Sun Oct 12 2014 - 02:18:15 EST


This is a multi-part message in MIME format. On 10/11/2014 5:51 AM, Martin K. Petersen wrote:
"Petr" == Petr Vandrovec <petr@xxxxxxxxxx> writes:

Petr> After investigating, problem seems to be in a way completion
Petr> handler for WRITE_SAME handles EOPNOTSUPP error, causing
Petr> all-but-first WRITE_SAME request on the LVM device to be silently
Petr> ignored - command is never issued, but success is returned to
Petr> higher layers.

Commit 7eee4ae2dbb2 was meant to address this issue. Does it still
happen with that in place?

Hi,
that commit alleviates need for change to blk-lib.c. But I believe that change to blk-core.c that changes return value from EOPNOTSUPP to EREMOTEIO is still necessary - unless I'm missing some locking somewhere, there is a race in blkdev_issue_write_same() wrt. updating max_write_same_sectors:

blkdev_issue_write_same() checks whether max_write_same_sectors is non-zero at the beginning, and if it is non-zero it proceeds with generating BIOs. While it generates them, other thread seems to be able to complete previously issued write_same, find it is not supported, and clear max_write_same_sectors. Which means that BIOs that are now being generated will fast-fail in blk-core.c with EOPNOTSUPP, and blkdev_issue_write_same() will then return success, rather than failure.

It is true that now WRITE_SAME is failing only if second WRITE_SAME is issued to the device while first ever issued WRITE_SAME on the device is being completed, but I see no reason why to not close this race.

Logic (from 2011, commit 8af1954d172a46a63e5e79dae523a6d74715e458) says that EOPNOTSUPP is returned when DISCARD request failed, as discarding is optional, and failures can be safely ignored. That is definitely not true for WRITE_SAME failures, and so unsupported WRITE_SAME should return different error code than unsupported DISCARD.

Which is what patch does. I've removed part that propagates disabling WRITE_SAME from the diff, keeping only EOPNOTSUPP => EREMOTEIO change, and revert of blacklisting VMware's LSI (if anything, blacklist should be for current firmware version of 'VMware Virtual SCSI Disk', as f.e. passed-through (RDM) SCSI disks do support WRITE_SAME under VMware) -- see attached updated diff.
Petr Vandrovec