Re: [PATCH] cxl/region: Remove lock from memory notifier callback

From: Dan Williams
Date: Tue Aug 13 2024 - 16:29:40 EST


Ira Weiny wrote:
> In testing Dynamic Capacity Device (DCD) support, a lockdep splat
> revealed an ABBA issue between the memory notifiers and the DCD extent
> processing code.[0] Changing the lock ordering within DCD proved
> difficult because regions must be stable while searching for the proper
> region and then the device lock must be held to properly notify the DAX
> region driver of memory changes.
>
> Dan points out in the thread that notifiers should be able to trust that
> it is safe to access static data. Region data is static once the device
> is realized and until its destruction. Thus it is better to manage the
> notifiers within the region driver.
>
> Remove the need for a lock by ensuring the notifiers are active only
> during the region's lifetime.
>
> Link: https://lore.kernel.org/all/66b4cf539a79b_a36e829416@iweiny-mobl.notmuch/ [0]
> Cc: Huang, Ying <ying.huang@xxxxxxxxx>
> Suggested-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
> ---
> drivers/cxl/core/region.c | 31 ++++++++++++++++++++-----------
> 1 file changed, 20 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 21ad5f242875..971a314b6b0e 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
[..]
> @@ -2396,7 +2394,6 @@ static int cxl_region_nid(struct cxl_region *cxlr)
>  	struct cxl_region_params *p = &cxlr->params;
>  	struct resource *res;
>  
> -	guard(rwsem_read)(&cxl_region_rwsem);
>  	res = p->res;
>  	if (!res)
>  		return NUMA_NO_NODE;

The cxl_region_nid() helper is now completely unnecessary: no lock is
needed to read cxl_region_params, and p->res is guaranteed to be
non-NULL.

cxl_region_nid() also needs to be killed so that nothing else tries to
use it that might *need* the lock.
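
To illustrate the point above, here is a rough userspace sketch of what
open-coding the lookup in the notifier callback might look like. It is
not the kernel code: the struct definitions are cut-down stand-ins,
phys_to_target_node_stub() is a made-up placeholder for the real
address-to-node lookup, and region_notifier_nid() is a hypothetical
name. The only assumption it encodes is the one stated in this thread,
that the notifier is active only during the region's lifetime, so
p->res can be read without a lock or a NULL check.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for illustration only, not the kernel definitions. */
struct resource {
	unsigned long long start;
};

struct cxl_region_params {
	struct resource *res;
};

struct cxl_region {
	struct cxl_region_params params;
};

/*
 * Hypothetical placeholder for the kernel's address-to-node lookup.
 * Here the "node" is simply the upper 32 bits of the address.
 */
static int phys_to_target_node_stub(unsigned long long addr)
{
	return (int)(addr >> 32);
}

/*
 * Open-coded replacement for the helper: no rwsem and no NUMA_NO_NODE
 * fallback. The assert documents the invariant that the notifier is
 * only registered while the region, and therefore p->res, exists.
 */
static int region_notifier_nid(const struct cxl_region *cxlr)
{
	assert(cxlr->params.res != NULL);
	return phys_to_target_node_stub(cxlr->params.res->start);
}
```

With the lookup reduced to a single dereference inside the callback,
there is no standalone helper left for an unrelated caller to pick up
outside the window where the data is known to be stable.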