Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.

From: Sven Peter
Date: Sat Sep 25 2021 - 16:11:07 EST


Hi,

I actually ran into a similar issue while adding support for the NVMe
controller found on the M1 and assumed it was only present there.

Some background why this happens: ANS2 is a co-processor that emulates
an NVMe MMIO interface and uses the tag as an index to an internal data
structure. On the M1 we can directly talk to ANS2 and while we can submit
commands with a higher index it'll just ignore the upper bits and only
return the lowest eight IIRC in the completion queue.
I guess whatever software is running on the T2 actually has an assert to
ensure that the tag is within the limits before forwarding the command
to ANS2.

I haven't tested the attached patch yet but my WIP tree has an almost identical
one that works fine.



Best,

Sven


On Sat, Sep 25, 2021, at 21:54, Keith Busch wrote:
> On Sat, Sep 25, 2021 at 11:47:08AM -0700, Linus Torvalds wrote:
>> On Fri, Sep 24, 2021 at 9:02 PM Aditya Garg <gargaditya08@xxxxxxxx> wrote:
>> >
>> > From: Aditya Garg <gargaditya08@xxxxxxxx>
>> > Date: Fri, 24 Sep 2021 15:36:45 +0530
>> > Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
>> > Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)
>>
>> I think we need to hear more about the problem than just revert a
>> commit like this randomly. That commit has already been picked up for
>> -stable,
>>
>> What are the exact symptoms, and which Apple SSD is this?
>>
>> I do find this:
>>
>> https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/
>>
>> which instead of a revert has an actual patch. Can you try that one?
>>
>> Keith Busch replied to that one, saying that the Apple SSD might not
>> be spec compliant, but hey, what else is new? If we start demanding
>> that hardware comply with specs, we'd have to scrap the whole notion
>> of working in the real world. Plus it would be very hypocritical of
>> us, since we ignore all specs when we deem them too limiting (whether
>> they be language specs, POSIX OS specs, or whatever).
>
> Right, we have a lot of quirks for the apple controllers, what's one
> more? :)
>
> Could the following patch be tried? I'm basing this off the 'lspci' from
> Orlando, but I'm assuming the previous model has the same limitation,
> too.
>
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 7efb31b87f37..f0787233557f 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -979,6 +979,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
> blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
> {
> struct nvme_command *cmd = nvme_req(req)->cmd;
> + struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
> blk_status_t ret = BLK_STS_OK;
>
> if (!(req->rq_flags & RQF_DONTPREP)) {
> @@ -1027,7 +1028,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns,
> struct request *req)
> return BLK_STS_IOERR;
> }
>
> - nvme_req(req)->genctr++;
> + if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
> + nvme_req(req)->genctr++;
> cmd->common.command_id = nvme_cid(req);
> trace_nvme_setup_cmd(req, cmd);
> return ret;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 9871c0c9374c..b49761d30df7 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -86,6 +86,12 @@ enum nvme_quirks {
> */
> NVME_QUIRK_NO_DEEPEST_PS = (1 << 5),
>
> + /*
> + * The controller requires the command_id value be be limited to the
> + * queue depth.
> + */
> + NVME_QUIRK_SKIP_CID_GEN = (1 << 6),
> +
> /*
> * Set MEDIUM priority on SQ creation
> */
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index b82492cd7503..d9f22ed68185 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3369,7 +3369,10 @@ static const struct pci_device_id nvme_id_table[] = {
> { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
> .driver_data = NVME_QUIRK_SINGLE_VECTOR |
> NVME_QUIRK_128_BYTES_SQES |
> - NVME_QUIRK_SHARED_TAGS },
> + NVME_QUIRK_SHARED_TAGS ,
> + NVME_QUIRK_SKIP_CID_GEN },
> + { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2006),
> + .driver_data = NVME_QUIRK_SKIP_CID_GEN },
>
> { PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
> { 0, }
> --

--
Sven Peter