Re: [PATCH v3 2/2] nvme: handle persistent internal error AER from NVMe controller

From: Christoph Hellwig
Date: Tue Jun 07 2022 - 06:36:06 EST


On Mon, Jun 06, 2022 at 05:15:15PM -0700, Michael Kelley wrote:
> +static void nvme_handle_aer_persistent_error(struct nvme_ctrl *ctrl)
> +{
> + trace_nvme_async_event(ctrl, NVME_AER_ERROR);
> +
> + /*
> + * We can't read the CSTS here because we're in an atomic context on
> + * some transports and the read may require submitting a request to the
> + * controller and getting a response. Such a sequence isn't
> + * likely to be successful anyway if the controller is reporting a
> + * persistent internal error. So assume CSTS.CFS is set.
> + */
> + if (nvme_should_reset(ctrl, NVME_CSTS_CFS)) {
> + dev_warn(ctrl->device, "resetting controller due to AER\n");
> + nvme_reset_ctrl(ctrl);

I don't think we even need the nvme_should_reset check now.

nvme_reset_ctrl first calls nvme_change_ctrl_state, which only allows
the transition to the RESETTING state if it previously was NEW or LIVE,
so we are already covered. The only downside would be an extra kernel
message if we already were in another state.
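
To illustrate the argument, here is a minimal userspace sketch of the guard,
not the kernel code. The model_* names, the reduced state list, and the -1
return value are all assumptions made for this example; the only behaviour
taken from the discussion above is that the move to RESETTING is allowed
from NEW or LIVE and refused otherwise.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, reduced set of controller states for illustration only. */
enum model_state {
	MODEL_NEW,
	MODEL_LIVE,
	MODEL_RESETTING,
	MODEL_DELETING,
};

struct model_ctrl {
	enum model_state state;
};

/*
 * Stand-in for the state-change helper. Only the transition to RESETTING
 * is modelled: it succeeds when the old state is NEW or LIVE.
 */
bool model_change_state(struct model_ctrl *ctrl, enum model_state new_state)
{
	bool allowed = false;

	if (new_state == MODEL_RESETTING) {
		switch (ctrl->state) {
		case MODEL_NEW:
		case MODEL_LIVE:
			allowed = true;
			break;
		default:
			break;
		}
	}

	if (allowed)
		ctrl->state = new_state;
	return allowed;
}

/* Stand-in for the reset entry point: 0 if a reset was started, -1 if not. */
int model_reset_ctrl(struct model_ctrl *ctrl)
{
	if (!model_change_state(ctrl, MODEL_RESETTING))
		return -1;
	/* A real driver would queue the reset work here. */
	return 0;
}

/* AER handler without a separate "should reset" check. */
int model_handle_persistent_error(struct model_ctrl *ctrl)
{
	/* Printed even when the reset is refused: the extra message. */
	fprintf(stderr, "resetting controller due to AER\n");
	return model_reset_ctrl(ctrl);
}
```

Calling model_handle_persistent_error() on a controller that is already in
RESETTING or DELETING leaves the state untouched and only costs the
additional log line, which is the downside mentioned above.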