[PATCH] Do not silently discard WRITE_SAME requests

From: Petr Vandrovec
Date: Sat Oct 11 2014 - 02:31:14 EST

it was brought to my attention that there are claims of data corruption
caused by VMware's SCSI implementation. After investigating, problem
seems to be in a way completion handler for WRITE_SAME handles EOPNOTSUPP
error, causing all-but-first WRITE_SAME request on the LVM device to be
silently ignored - command is never issued, but success is returned to
higher layers. Problem affects all disks without WRITE_SAME support -
and I guess VMware's SCSI emulation is one of few that do not support
this command ATM.

Please apply patch below.
Petr Vandrovec

From: Petr Vandrovec <petr@xxxxxxxxxx>
Subject: [PATCH] Do not silently discard WRITE_SAME requests

When device does not support WRITE_SAME, after first failure
block layer starts throwing away WRITE_SAME requests without
warning anybody, leading to the data corruption.

Let's do something about it - do not use EOPNOTSUPP error,
as apparently that error code is special (use EREMOTEIO, AKA
target failure, like when request hits hardware), and propagate
inabiity to do WRITE_SAME to the top of stack, so we do not
try to issue WRITE_SAME again and again.

It also reverts 4089b71cc820a426d601283c92fcd4ffeb5139c2, as
there is nothing wrong with VMware's WRITE_SAME emulation.
Only problem was that block layer did not issue WRITE_SAME
request at all, but reported success, and it affected all
disks that do not support WRITE_SAME.

Signed-off-by: Petr Vandrovec <petr@xxxxxxxxxx>
Cc: Arvind Kumar <arvindkumar@xxxxxxxxxx>
Cc: Chris J Arges <chris.j.arges@xxxxxxxxxxxxx>
Cc: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
block/blk-core.c | 2 +-
block/blk-lib.c | 10 ++++++++++
drivers/message/fusion/mptspi.c | 5 -----
3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 9c888bd..b070782 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1822,7 +1822,7 @@ generic_make_request_checks(struct bio *bio)

if (bio->bi_rw & REQ_WRITE_SAME && !bdev_write_same(bio->bi_bdev)) {
- err = -EOPNOTSUPP;
+ err = -EREMOTEIO;
goto end_io;

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 8411be3..abad72d 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -298,6 +298,16 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
return 0;

+ /*
+ * If WRITE_SAME failed, inability to perform WRITE_SAME was
+ * possibly recorded in device's queue by sd.c. But in case
+ * of LVM we are issuing request here on LVM device. So
+ * we should mark device as ineligible for WRITE_SAME here too,
+ * as otherwise we keep trying to submit WRITE_SAME again and
+ * again to LVM where they get promptly rejected by underlying
+ * disk queue.
+ */
+ blk_queue_max_write_same_sectors(bdev_get_queue(bdev), 0);
bdevname(bdev, bdn);
pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 613231c..787933d 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1419,11 +1419,6 @@ mptspi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto out_mptspi_probe;

- /* VMWare emulation doesn't properly implement WRITE_SAME
- */
- if (pdev->subsystem_vendor == 0x15AD)
- sh->no_write_same = 1;
spin_lock_irqsave(&ioc->FreeQlock, flags);

/* Attach the SCSI Host to the IOC structure

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/