Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.

From: Keith Busch
Date: Sat Sep 25 2021 - 15:54:55 EST


On Sat, Sep 25, 2021 at 11:47:08AM -0700, Linus Torvalds wrote:
> On Fri, Sep 24, 2021 at 9:02 PM Aditya Garg <gargaditya08@xxxxxxxx> wrote:
> >
> > From: Aditya Garg <gargaditya08@xxxxxxxx>
> > Date: Fri, 24 Sep 2021 15:36:45 +0530
> > Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
> > Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)
>
> I think we need to hear more about the problem than just revert a
> commit like this randomly. That commit has already been picked up for
> -stable,
>
> What are the exact symptoms, and which Apple SSD is this?
>
> I do find this:
>
> https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/
>
> which instead of a revert has an actual patch. Can you try that one?
>
> Keith Busch replied to that one, saying that the Apple SSD might not
> be spec compliant, but hey, what else is new? If we start demanding
> that hardware comply with specs, we'd have to scrap the whole notion
> of working in the real world. Plus it would be very hypocritical of
> us, since we ignore all specs when we deem them too limiting (whether
> they be language specs, POSIX OS specs, or whatever).

Right, we have a lot of quirks for the apple controllers, what's one
more? :)

Could the following patch be tried? I'm basing this off the 'lspci' from
Orlando, but I'm assuming the previous model has the same limitation,
too.

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7efb31b87f37..f0787233557f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -979,6 +979,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
 blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 {
 	struct nvme_command *cmd = nvme_req(req)->cmd;
+	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
 	blk_status_t ret = BLK_STS_OK;

 	if (!(req->rq_flags & RQF_DONTPREP)) {
@@ -1027,7 +1028,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 		return BLK_STS_IOERR;
 	}

-	nvme_req(req)->genctr++;
+	if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
+		nvme_req(req)->genctr++;
 	cmd->common.command_id = nvme_cid(req);
 	trace_nvme_setup_cmd(req, cmd);
 	return ret;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9871c0c9374c..b49761d30df7 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -86,6 +86,12 @@ enum nvme_quirks {
 	 */
 	NVME_QUIRK_NO_DEEPEST_PS = (1 << 5),

+	/*
+	 * The controller requires the command_id value be limited to the
+	 * queue depth.
+	 */
+	NVME_QUIRK_SKIP_CID_GEN = (1 << 6),
+
 	/*
 	 * Set MEDIUM priority on SQ creation
 	 */
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b82492cd7503..d9f22ed68185 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3369,7 +3369,10 @@ static const struct pci_device_id nvme_id_table[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
 		.driver_data = NVME_QUIRK_SINGLE_VECTOR |
 				NVME_QUIRK_128_BYTES_SQES |
-				NVME_QUIRK_SHARED_TAGS },
+				NVME_QUIRK_SHARED_TAGS |
+				NVME_QUIRK_SKIP_CID_GEN },
+	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2006),
+		.driver_data = NVME_QUIRK_SKIP_CID_GEN },

 	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
 	{ 0, }
--
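
For anyone following along, here is a rough userspace sketch of what the quirk changes. It assumes the command_id layout from e7006de6c238 is a 4-bit generation counter in the top bits over a 12-bit tag; that layout is my reading of that commit, not something stated in this thread. Every sketch_* and SKETCH_* name below is made up for illustration and is not a kernel symbol.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the quirk bit added by the patch above. */
#define SKETCH_QUIRK_SKIP_CID_GEN	(1u << 6)

/* Assumed layout: generation counter in bits 15:12, tag in bits 11:0. */
#define SKETCH_GENCTR_SHIFT	12
#define SKETCH_GENCTR_MASK	0xfu
#define SKETCH_TAG_MASK		0xfffu

struct sketch_req {
	uint16_t tag;		/* block-layer tag, always below the queue depth */
	uint8_t genctr;		/* bumped each time the request is set up */
};

/* Combine the generation counter and the tag into a 16-bit command id. */
static uint16_t sketch_cid(const struct sketch_req *req)
{
	assert(req->tag <= SKETCH_TAG_MASK);
	return (uint16_t)(((req->genctr & SKETCH_GENCTR_MASK)
			   << SKETCH_GENCTR_SHIFT) | req->tag);
}

/*
 * Mirrors the patched tail of the setup path: with the quirk set the
 * counter is never bumped, so the command id stays equal to the tag.
 */
static uint16_t sketch_setup_cmd(struct sketch_req *req, unsigned long quirks)
{
	if (!(quirks & SKETCH_QUIRK_SKIP_CID_GEN))
		req->genctr++;
	return sketch_cid(req);
}

/* Completion side: recover the tag and the generation from a command id. */
static uint16_t sketch_tag_from_cid(uint16_t cid)
{
	return cid & SKETCH_TAG_MASK;
}

static uint8_t sketch_genctr_from_cid(uint16_t cid)
{
	return (uint8_t)((cid >> SKETCH_GENCTR_SHIFT) & SKETCH_GENCTR_MASK);
}
```

Without the quirk, a request with tag 5 gets command id 0x1005 on its first submission and 0x2005 on the next, both far above any realistic queue depth. With the quirk the id stays 5, which is presumably what a controller that rejects ids at or above the queue depth needs. The cost is that stale completions can no longer be detected on those devices.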