RE: [PATCH v3] driver core: platform: expose numa_node to users in sysfs

From: Song Bao Hua (Barry Song)
Date: Mon Jun 22 2020 - 07:24:42 EST


> -----Original Message-----
> From: John Garry
> Sent: Monday, June 22, 2020 10:49 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>;
> gregkh@xxxxxxxxxxxxxxxxxxx; rafael@xxxxxxxxxx
> Cc: Robin Murphy <robin.murphy@xxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>; Linuxarm <linuxarm@xxxxxxxxxx>
> Subject: Re: [PATCH v3] driver core: platform: expose numa_node to users in
> sysfs
>
> On 19/06/2020 04:00, Barry Song wrote:
> > Some platform devices like ARM SMMU are memory-mapped and populated by
> > ACPI/IORT. In this case, NUMA topology of those platform devices are
> > exported by firmware as well. Software might care about the numa_node of
> > those devices in order to achieve NUMA locality.
>

Thanks for your review, John.

> Is it generally the case that the SMMU will be in the same NUMA node as
> the endpoint device (which you're driving)? If so, we can get this info


This could be true, but I am not sure if it has to be true :-)

On the other hand, drivers/acpi/arm64/iort.c already has code to set the NUMA node for the SMMU.
It does not assume that the SMMU's numa_node is the same as that of the PCI devices:

static int __init arm_smmu_v3_set_proximity(struct device *dev,
					     struct acpi_iort_node *node)
{
	struct acpi_iort_smmu_v3 *smmu;

	smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
	if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) {
		int dev_node = acpi_map_pxm_to_node(smmu->pxm);

		if (dev_node != NUMA_NO_NODE && !node_online(dev_node))
			return -EINVAL;

		set_dev_node(dev, dev_node);
		pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n",
			smmu->base_address,
			smmu->pxm);
	}
	return 0;
}

numa_node may also be extended to other platform devices once we provide a common dev_set_proximity() callback for them.
iort_add_platform_device() will then set the node for them:

static int __init iort_add_platform_device(struct acpi_iort_node *node,
					    const struct iort_dev_config *ops)
{
	struct fwnode_handle *fwnode;
	struct platform_device *pdev;
	struct resource *r;
	int ret, count;

	pdev = platform_device_alloc(ops->name, PLATFORM_DEVID_AUTO);
	if (!pdev)
		return -ENOMEM;

	if (ops->dev_set_proximity) {
		ret = ops->dev_set_proximity(&pdev->dev, node);
		if (ret)
			goto dev_put;
	}
	...
}

It is probably worth making dev_set_proximity() common for all IORT devices.

> from sysfs already for the endpoint, and also have a link from the
> endpoint to the iommu for pci devices (which I assume you're interested in):
>

> root@(none)$ ls -l /sys/devices/pci0000:74/0000:74:02.0/ | grep iommu
> lrwxrwxrwx 1 root root 0 Jun 22 10:33 iommu ->
> ../../platform/arm-smmu-v3.2.auto/iommu/smmu3.0x0000000140000000
> lrwxrwxrwx 1 root root 0 Jun 22 10:33 iommu_group ->
> ../../../kernel/iommu_groups/0
> root@(none)$

Sure, there is an implicit way to figure out the NUMA node of the SMMU by following the various links
between the SMMU and the devices that use it, but only if the SMMU and those devices happen to sit in the same NUMA node.

However, exposing the data directly from the ACPI table is much clearer and more trustworthy for users.
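
For illustration only, a userspace tool could consume such an attribute roughly as below. This is a minimal sketch and not part of the patch: read_numa_node() is a hypothetical helper, and the path handed to it is an assumption (for example a numa_node file in the SMMU's platform device directory once this patch is applied). It only assumes the file holds a single decimal integer, with -1 meaning no node.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not an existing API): parse the decimal integer
 * stored in a sysfs-style numa_node file.
 *
 * Returns 0 and stores the value in *node on success, or a negative
 * errno value on failure. A stored value of -1 means the device has no
 * NUMA node assigned.
 */
int read_numa_node(const char *path, int *node)
{
	char buf[32];
	char *end;
	long val;
	FILE *f;

	assert(path != NULL && node != NULL);

	f = fopen(path, "r");
	if (!f)
		return -errno;

	/* The attribute is expected to be one short line, e.g. "0\n". */
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -EIO;
	}
	fclose(f);

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf || val < INT_MIN || val > INT_MAX)
		return -EINVAL;

	*node = (int)val;
	return 0;
}
```

A caller would pass the path of the device's numa_node file and treat a result of -1 as "no NUMA affinity reported", falling back to whatever default placement it already uses.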

>
> Thanks,
> John

Barry