[PATCH 4/6] nvme: set discard_granularity from NPDG/NPDA

From: Caleb Sander Mateos

Date: Thu Feb 19 2026 - 22:28:51 EST


Currently, nvme_config_discard() always sets the discard_granularity
queue limit to the logical block size. However, NVMe namespaces can
advertise a larger preferred discard granularity in the NPDG and NPDA
fields of the Identify Namespace data structure or the NPDGL and NPDAL
fields of the I/O Command Set Specific Identify Namespace data
structure.

Use these fields to compute the discard_granularity limit. The logic is
somewhat involved. First, the fields are optional: NPDG is reported only
if the low bit of OPTPERF in NSFEAT is set, NPDA is reported if any bit
of OPTPERF is set, and NPDGL and NPDAL are reported only if the high bit
of OPTPERF is set. NPDGL and NPDAL can also each be set to 0 to opt out
of reporting a limit, and older NVMe controllers may not support the I/O
Command Set Specific Identify Namespace data structure at all. Another
complication is that multiple values may be reported among NPDG, NPDGL,
NPDA, and NPDAL. The spec says to prefer the values reported in the L
variants. It also says NPDG should be a multiple of NPDA and NPDGL
should be a multiple of NPDAL, but it doesn't specify any relationship
between NPDG and NPDAL or between NPDGL and NPDA. So use the maximum of
the reported NPDG(L) and NPDA(L) values as the discard_granularity.
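For illustration only (not part of the patch), the selection logic above
can be sketched as standalone C. Field and parameter names here are
made up for the sketch; the kernel code below uses the real structures
and le16/le32 accessors:

```c
#include <stdint.h>

/*
 * Sketch of the granularity selection described above. optperf is the
 * 2-bit OPTPERF field extracted from NSFEAT; npdg/npda are the 0's
 * based fields from Identify Namespace; npdgl/npdal come from the I/O
 * Command Set Specific structure, where 0 means "not reported" and
 * have_nvm is 0 if the controller doesn't support that structure.
 * Returns the preferred granularity in logical blocks.
 */
static uint32_t discard_granularity_blocks(uint8_t optperf,
					   uint16_t npdg, uint16_t npda,
					   int have_nvm,
					   uint32_t npdgl, uint32_t npdal)
{
	uint32_t g, a;

	/* Prefer NPDGL (high OPTPERF bit); fall back to NPDG (low bit). */
	if ((optperf & 0x2) && have_nvm && npdgl)
		g = npdgl;
	else if (optperf & 0x1)
		g = npdg + 1;	/* NPDG is a 0's based value */
	else
		g = 1;

	/* Prefer NPDAL; fall back to NPDA (reported if OPTPERF != 0). */
	if ((optperf & 0x2) && have_nvm && npdal)
		a = npdal;
	else if (optperf)
		a = npda + 1;	/* NPDA is a 0's based value */
	else
		a = 1;

	/* No guaranteed relationship between the two, so take the max. */
	return g > a ? g : a;
}
```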

Signed-off-by: Caleb Sander Mateos <csander@xxxxxxxxxxxxxxx>
---
 drivers/nvme/host/core.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 70ff14a56a01..7ac11c40ca9f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1880,23 +1880,33 @@ static bool nvme_init_integrity(struct nvme_ns_head *head,
 		bi->pi_offset = info->pi_offset;
 	}
 	return true;
 }
 
-static void nvme_config_discard(struct nvme_ns *ns, struct queue_limits *lim)
+static void nvme_config_discard(struct nvme_ns *ns, struct nvme_id_ns *id,
+				struct nvme_id_ns_nvm *nvm,
+				struct queue_limits *lim)
 {
 	struct nvme_ctrl *ctrl = ns->ctrl;
+	u32 npdg, npda;
+	u8 optperf;
 
 	if (ctrl->dmrsl && ctrl->dmrsl <= nvme_sect_to_lba(ns->head, UINT_MAX))
 		lim->max_hw_discard_sectors =
 			nvme_lba_to_sect(ns->head, ctrl->dmrsl);
 	else if (ctrl->oncs & NVME_CTRL_ONCS_DSM)
 		lim->max_hw_discard_sectors = UINT_MAX;
 	else
 		lim->max_hw_discard_sectors = 0;
 
-	lim->discard_granularity = lim->logical_block_size;
+	optperf = id->nsfeat >> NVME_NS_FEAT_OPTPERF_SHIFT &
+		  NVME_NS_FEAT_OPTPERF_MASK;
+	npdg = optperf & 0x2 && nvm && nvm->npdgl ? le32_to_cpu(nvm->npdgl) :
+	       optperf & 0x1 ? le16_to_cpu(id->npdg) + 1 : 1;
+	npda = optperf & 0x2 && nvm && nvm->npdal ? le32_to_cpu(nvm->npdal) :
+	       optperf ? le16_to_cpu(id->npda) + 1 : 1;
+	lim->discard_granularity = max(npdg, npda) * lim->logical_block_size;
 
 	if (ctrl->dmrl)
 		lim->max_discard_segments = ctrl->dmrl;
 	else
 		lim->max_discard_segments = NVME_DSM_MAX_RANGES;
@@ -2382,11 +2392,11 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	nvme_configure_metadata(ns->ctrl, ns->head, id, nvm, info);
 	nvme_set_chunk_sectors(ns, id, &lim);
 	if (!nvme_update_disk_info(ns, id, &lim))
 		capacity = 0;
 
-	nvme_config_discard(ns, &lim);
+	nvme_config_discard(ns, id, nvm, &lim);
 	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
 	    ns->head->ids.csi == NVME_CSI_ZNS)
 		nvme_update_zone_info(ns, &lim, &zi);
 
 	if ((ns->ctrl->vwc & NVME_CTRL_VWC_PRESENT) && !info->no_vwc)
--
2.45.2