Re: [PATCH 2/4 v3] cxl/core: Add helpers to detect Low memory Holes on x86

From: Fabio M. De Francesco
Date: Sat Mar 29 2025 - 06:05:37 EST


On Saturday, March 29, 2025 12:40:34 AM Central European Standard Time Dan Williams wrote:
> Fabio M. De Francesco wrote:
> > In x86 with Low memory Hole, the BIOS may publish CFMWS that describe
> > SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> > HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> > ranges are always guaranteed to align to the NIW * 256M rule.
> >
> > In order to construct Regions and attach Decoders, the driver needs to
> > match Root Decoders and Regions with Endpoint Decoders, but it fails and
> > the entire process returns errors because it doesn't expect to deal with
> > SPA range lengths smaller than corresponding HPA's.
> >
> > Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> > with corresponding HPA's. They will be used in the process of Regions
> > creation and Endpoint attachments to prevent driver failures in a few
> > steps of the above-mentioned process.
> >
> > The helpers return true when HPA/SPA misalignments are detected under
> > specific conditions: both the SPA and HPA ranges must start at
> > LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
> > be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> > aligned to the NIW * 256M rule.
> >
> > Also introduce a function to adjust the range end of the Regions to be
> > created on x86 with LMH's.
> >
> > Cc: Alison Schofield <alison.schofield@xxxxxxxxx>
> > Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Cc: Ira Weiny <ira.weiny@xxxxxxxxx>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@xxxxxxxxxxxxxxx>
> > ---
> > drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> > drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
> > 2 files changed, 85 insertions(+)
> > create mode 100644 drivers/cxl/core/lmh.c
> > create mode 100644 drivers/cxl/core/lmh.h
> >
> > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > new file mode 100644
> > index 000000000000..2e32f867eb94
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.c
> > @@ -0,0 +1,56 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +
> > +#include <linux/range.h>
> > +#include "lmh.h"
> > +
> > +/* Start of CFMWS range that end before x86 Low Memory Holes */
> > +#define LMH_CFMWS_RANGE_START 0x0ULL
> > +
> > +/*
> > + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > + *
> > + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> > + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> > + * the given endpoint decoder HPA range size is always expected aligned and
> > + * also larger than that of the matching root decoder. If there are LMH's,
> > + * the root decoder range end is always less than SZ_4G.
> > + */
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > + const struct cxl_endpoint_decoder *cxled)
> > +{
> > + const struct range *r1, *r2;
> > + int niw;
> > +
> > + r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > + r2 = &cxled->cxld.hpa_range;
> > + niw = cxled->cxld.interleave_ways;
> > +
> > + if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > + r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> > + IS_ALIGNED(range_len(r2), niw * SZ_256M))
> > + return true;
> > +
> > + return false;
> > +}
> > +
> > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > +bool arch_match_region(const struct cxl_region_params *p,
> > + const struct cxl_decoder *cxld)
> > +{
> > + const struct range *r = &cxld->hpa_range;
> > + const struct resource *res = p->res;
> > + int niw = cxld->interleave_ways;
> > +
> > + if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > + res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> > + IS_ALIGNED(range_len(r), niw * SZ_256M))
> > + return true;
> > +
> > + return false;
> > +}
> > +
> > +void arch_adjust_region_resource(struct resource *res,
> > + struct cxl_root_decoder *cxlrd)
> > +{
> > + res->end = cxlrd->res->end;
> > +}
> > diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> > new file mode 100644
> > index 000000000000..16746ceac1ed
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include "cxl.h"
> > +
> > +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > + const struct cxl_endpoint_decoder *cxled);
> > +bool arch_match_region(const struct cxl_region_params *p,
> > + const struct cxl_decoder *cxld);
> > +void arch_adjust_region_resource(struct resource *res,
> > + struct cxl_root_decoder *cxlrd);
> > +#else
> > +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > + struct cxl_endpoint_decoder *cxled)
> > +{
> > + return false;
>
> I would have expected the default match routines to do the default
> matching, not return false.
>
> This can document the common expectation on architectures that do not
> need to account for decoders not aligning to window boundaries due to
> holes.
>
Hi Dan,

A typical caller of arch_match_spa() is match_root_decoder_by_range(). On
platforms that don't enable support for the low memory hole, the
arch_match_spa() stub returns false, so the matching fails whenever the root
decoder range does not contain the endpoint decoder range. Returning false
from the stub therefore preserves the current default behavior.

This is how arch_match_spa() is used to detect a hole and allow the matching:

static int match_root_decoder_by_range(struct device *dev,
				       const void *data)
{
	const struct cxl_endpoint_decoder *cxled = data;
	struct cxl_root_decoder *cxlrd;
	const struct range *r1, *r2;

	if (!is_root_decoder(dev))
		return 0;

	cxlrd = to_cxl_root_decoder(dev);
	r1 = &cxlrd->cxlsd.cxld.hpa_range;
	r2 = &cxled->cxld.hpa_range;

	if (range_contains(r1, r2))
		return true;
	if (arch_match_spa(cxlrd, cxled))
		return true;

	return false;
}

Currently, when range_contains() fails, the default behavior of
match_root_decoder_by_range() is to not match root and endpoint decoders.
I left that default unchanged for all platforms and architectures that
don't enable LMH support.
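
To make the fallback order concrete, here is a minimal userspace sketch of
the same logic. It is not the driver code: struct range, range_len() and
range_contains() are simplified stand-ins for the kernel helpers, and
lmh_match(), match_ranges() and the lmh_enabled flag are hypothetical names
used only for illustration (the flag plays the role of
CONFIG_CXL_ARCH_LOW_MEMORY_HOLE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_256M 0x10000000ULL
#define SZ_4G   0x100000000ULL

/* Inclusive [start, end] range; simplified stand-in for struct range. */
struct range {
	uint64_t start;
	uint64_t end;
};

static uint64_t range_len(const struct range *r)
{
	return r->end - r->start + 1;
}

/* True when r1 fully contains r2. */
static bool range_contains(const struct range *r1, const struct range *r2)
{
	return r1->start <= r2->start && r1->end >= r2->end;
}

/*
 * Hypothetical helper mirroring the conditions in arch_match_spa():
 * both ranges start at 0, the SPA range ends below 4G and before the
 * HPA range ends, and the HPA length is a multiple of NIW * 256M.
 */
static bool lmh_match(const struct range *spa, const struct range *hpa,
		      unsigned int niw)
{
	return niw != 0 &&
	       spa->start == 0 && hpa->start == 0 &&
	       spa->end < SZ_4G && spa->end < hpa->end &&
	       range_len(hpa) % (niw * SZ_256M) == 0;
}

/*
 * Hypothetical model of match_root_decoder_by_range(): containment is
 * tried first, and the LMH check is only a fallback. With lmh_enabled
 * false the fallback behaves like the stub that returns false.
 */
static bool match_ranges(const struct range *spa, const struct range *hpa,
			 unsigned int niw, bool lmh_enabled)
{
	if (range_contains(spa, hpa))
		return true;

	return lmh_enabled && lmh_match(spa, hpa, niw);
}
```

For example, with a 3G CFMWS range (0x0 to 0xBFFFFFFF) and a 4G endpoint
decoder range (0x0 to 0xFFFFFFFF) at 2 interleave ways, match_ranges()
returns true only when lmh_enabled is set; without it the result is false,
which is the unchanged default.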

Thanks,

Fabio
