Re: [PATCH -V2] cxl/region: Support to calculate memory tier abstract distance

From: Dave Jiang
Date: Mon Jun 17 2024 - 13:25:53 EST

On 6/16/24 7:10 PM, Huang, Ying wrote:
> Alison Schofield <alison.schofield@xxxxxxxxx> writes:
>
>> On Tue, Jun 11, 2024 at 01:54:23PM +0800, Ying Huang wrote:
>
> [snip]
>
>>> ---
>>> drivers/cxl/core/region.c | 40 +++++++++++++++++++++++++++++++++++----
>>> drivers/cxl/cxl.h | 2 ++
>>> 2 files changed, 38 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>>> index 3c2b6144be23..81d0910c0a02 100644
>>> --- a/drivers/cxl/core/region.c
>>> +++ b/drivers/cxl/core/region.c
>>> @@ -9,6 +9,7 @@
>>> #include <linux/uuid.h>
>>> #include <linux/sort.h>
>>> #include <linux/idr.h>
>>> +#include <linux/memory-tiers.h>
>>> #include <cxlmem.h>
>>> #include <cxl.h>
>>> #include "core.h"
>>> @@ -2304,14 +2305,20 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
>>>  	return true;
>>>  }
>>>
>>> +static int cxl_region_nid(struct cxl_region *cxlr)
>>> +{
>>> +	struct cxl_region_params *p = &cxlr->params;
>>> +	struct cxl_endpoint_decoder *cxled = p->targets[0];
>>> +	struct cxl_decoder *cxld = &cxled->cxld;
>>> +
>>> +	return phys_to_target_node(cxld->hpa_range.start);
>>> +}
>>> +
>>
>> I believe it's OK to pass a resource_size_t to phys_to_target_node(),
>> like this:
>>
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -2308,10 +2308,8 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
>>  static int cxl_region_nid(struct cxl_region *cxlr)
>>  {
>>  	struct cxl_region_params *p = &cxlr->params;
>> -	struct cxl_endpoint_decoder *cxled = p->targets[0];
>> -	struct cxl_decoder *cxld = &cxled->cxld;
>>
>> -	return phys_to_target_node(cxld->hpa_range.start);
>> +	return phys_to_target_node(p->res->start);
>>  }
>>
>
> After reading the related code again, it appears that there's a
> theoretical race condition here. register_memory_notifier() is called in
> devm_cxl_add_region(), where p->targets[] and p->res haven't been set up
> yet. And, IIUC, p->targets[] or p->res may also go away during the life
> cycle of a region. If so, we need to use
> guard(rwsem_read)(&cxl_region_rwsem) to protect the p->targets[] and
> p->res references, because the memory notifier may also be called when
> other nodes go online/offline.

You mind sending a patch? :)

>
> --
> Best Regards,
> Huang, Ying