Re: [PATCH net] net/smc: use the correct ndev to find pnetid by pnetid table

From: Guangguan Wang
Date: Fri Jan 10 2025 - 01:39:23 EST




On 2025/1/9 00:00, Alexandra Winter wrote:
>
>
> On 07.01.25 20:32, Halil Pasic wrote:
>> On Tue, 7 Jan 2025 09:44:30 +0100
>> Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
>>
>>> On 12/27/24 5:04 AM, Guangguan Wang wrote:
> ...
>>>> The command 'smc_pnet -a -I <ethx> <pnetid>' will add <pnetid>
>>>> to the pnetid table and will attach the <pnetid> to net device
>>>> whose name is <ethx>. But When do SMCR by <ethx>, in function
>>>> smc_pnet_find_roce_by_pnetid, it will use <ethx>'s base ndev's
>>>> pnetid to match rdma device, not <ethx>'s pnetid. The asymmetric
>>>> use of the pnetid seems weird. Sometimes it is difficult to know
>>>> the hierarchy of net device what may make it difficult to configure
>>>> the pnetid and to use the pnetid. Looking into the history of
>>>> commit, it was the commit 890a2cb4a966 ("net/smc: rework pnet table")
>>>> that changes the ndev from the <ethx> to the <ethx>'s base ndev
>>>> when finding pnetid by pnetid table. It seems a mistake.
>>>>
>>>> This patch changes the ndev back to the <ethx> when finding pnetid
>>>> by pnetid table.
>>>>
>>>> Fixes: 890a2cb4a966 ("net/smc: rework pnet table")
>>>> Signed-off-by: Guangguan Wang <guangguan.wang@xxxxxxxxxxxxxxxxx>
>>>
>>> If I read correctly, this will break existing applications using the
>>> lookup schema introduced by the blamed commit - which is not very
>>> recent.
>
>
> I agree that this patch may break existing applications or existing
> configuration automation scripts.
>
> ...
>> PNETID stands for "Physical Network Identifier" and the idea is that iff
>> two ports are connected to the same physical network then they should
>> have the same PNETID. And on s390 PNETID can come and often is comming
>> "from the hardware".
> ...
>
>
> HW pnetids (smc_pnetid_by_dev_port()) are only visible at the base netdevice.
> It seems that the pnetid table, managed by the smc_pnet tool, tries to mimick
> that behaviour, and the concept (recommendation?) would be to set the
> user-defined pnetid also for the base netdevice and then use the upper
> level netdevices for the tcp connection. Which makes some sense,
> all upper level devices have the same connectivity as the base device.
>
> So this patch would break a setup that follows this concept and only sets the
> pnetid at the base netdevice.
>
Hi Alexandra,

See the examples of using smc-r in container on cloud environment here
https://lore.kernel.org/linux-s390/3ff078e0-150d-41ba-b705-a8e0365f0370@xxxxxxxxxxxxx/T/#t.

Especially the example of using veth in container, it can not successfully match the expected
RDMA device when do SMC-R in POD, if follow the concept of the HW pnetid and only sets the
pnetid at the base netdevice.

Maybe it is time to extend the concept and usage of pnetid table?

>
> Optionally you can set a user-defined pnetid on upper level devices (maybe for
> usability??), but as Guangguan noticed, that has no practical impact.
> In the documentation I see examples where the same pnetid is set for upper
> and base device.
> You cannot set a user-defined pnetid on a upper level device, if the base
> device has a HW pnetid (smc_pnet_add_eth()) which makes some sense,
> not even the same pnetid (makes less sense IMO).

> However you can set different user-defined pnetids on the upper netdevices
> and the base device, which makes no sense to me.
>
>>>
>>> Perhaps for a net patch would be better to support both lookup schemas
>>> i.e.
>>>
>>> (smc_pnet_find_ndev_pnetid_by_table(ndev, ndev_pnetid) ||
>>> smc_pnet_find_ndev_pnetid_by_table(base_ndev, ndev_pnetid))
>>>
>>> ?
>
> This may yield undesirable results, if base pnetid and upper pnetid differ..
> But then maybe GIGO?
>
>
> ...
>> BTW to implement the logic proposed by you Paolo, as understood by me,
>> we would have to use "&&" instead of "||".
> +1
>
>
> Another idea may be to change the setting of a user-defined pnetid
> on an upper level netdevice to
> - fail if the base netdevice has a different pnetid
> - set the pnetid of the base device , if it is currently unset.
>
In the example of veth, the base ndev found through the upper level ndev in POD
is not the real "base device" describe here.
And I wonder whether it is confused if set a user-defined pnetid to one netdevice,
but list another netdevice when smc_pnet show. For example set pnetid ABC to eth1,
whose base netdev is eth0, but when smc_pnet show, only can find eth0 with pnetid
ABC.

Thanks,
Guangguan Wang