Re: [PATCH -V2] cxl/region: Support to calculate memory tier abstract distance

From: Huang, Ying
Date: Sun Jun 16 2024 - 22:12:11 EST


Alison Schofield <alison.schofield@xxxxxxxxx> writes:

> On Tue, Jun 11, 2024 at 01:54:23PM +0800, Ying Huang wrote:

[snip]

>> ---
>> drivers/cxl/core/region.c | 40 +++++++++++++++++++++++++++++++++++----
>> drivers/cxl/cxl.h | 2 ++
>> 2 files changed, 38 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 3c2b6144be23..81d0910c0a02 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -9,6 +9,7 @@
>> #include <linux/uuid.h>
>> #include <linux/sort.h>
>> #include <linux/idr.h>
>> +#include <linux/memory-tiers.h>
>> #include <cxlmem.h>
>> #include <cxl.h>
>> #include "core.h"
>> @@ -2304,14 +2305,20 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
>> return true;
>> }
>>
>> +static int cxl_region_nid(struct cxl_region *cxlr)
>> +{
>> + struct cxl_region_params *p = &cxlr->params;
>> + struct cxl_endpoint_decoder *cxled = p->targets[0];
>> + struct cxl_decoder *cxld = &cxled->cxld;
>> +
>> + return phys_to_target_node(cxld->hpa_range.start);
>> +}
>> +
>
> I believe it's OK to send a resource_size_t to phys_to_target_node()
> like this:
>
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2308,10 +2308,8 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
> static int cxl_region_nid(struct cxl_region *cxlr)
> {
> struct cxl_region_params *p = &cxlr->params;
> - struct cxl_endpoint_decoder *cxled = p->targets[0];
> - struct cxl_decoder *cxld = &cxled->cxld;
>
> - return phys_to_target_node(cxld->hpa_range.start);
> + return phys_to_target_node(p->res->start);
> }
>

Reading the related code again, it appears that there is a theoretical
race condition here. register_memory_notifier() is called in
devm_cxl_add_region(), where p->targets[] and p->res haven't been set up
yet. And, IIUC, p->targets[] or p->res may go away during the life
cycle of a region too. If so, we need to use
guard(rwsem_read)(&cxl_region_rwsem) to protect the p->targets[] and
p->res references, because the memory notifier may also be called when
other nodes go online/offline.

--
Best Regards,
Huang, Ying