Re: [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices

From: Anand Khoje
Date: Fri Jun 25 2021 - 02:04:30 EST


On 6/24/2021 11:24 PM, Jason Gunthorpe wrote:
On Wed, Jun 23, 2021 at 06:33:32PM +0530, Anand Khoje wrote:
On 6/22/2021 5:19 AM, Jason Gunthorpe wrote:
On Wed, Jun 16, 2021 at 09:15:09PM +0530, Anand Khoje wrote:
@@ -1523,13 +1524,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
device->port_data[port].cache.lmc = tprops->lmc;
device->port_data[port].cache.port_state = tprops->state;
- device->port_data[port].cache.subnet_prefix = tprops->subnet_prefix;
+ ret = rdma_query_gid(device, port, 0, &gid);
+ if (ret) {

This is quite a bit different than just calling ops.query_gid() - why
are you changing it? I'm not sure all the additional tests will pass,
the 0 gid entry is not required to be valid..

Hi Jason,

We have opted for rdma_query_gid(), as during ib_cache_update() the code
calls ops.query_gid() earlier in config_non_roce_gid_cache(), thereby
updating the value of GID in cache. We utilize this updated value, instead
of calling ops->query_gid() again.

Uhhhh, so just store the subnet prefix at that point then?

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index c9e9fc81447e89..5c554ebd000e89 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1428,8 +1428,8 @@ int rdma_read_gid_l2_fields(const struct ib_gid_attr *attr,
}
EXPORT_SYMBOL(rdma_read_gid_l2_fields);
-static int config_non_roce_gid_cache(struct ib_device *device,
- u32 port, int gid_tbl_len)
+static int config_non_roce_gid_cache(struct ib_device *device, u32 port,
+ struct ib_port_attr *tprops)
{
struct ib_gid_attr gid_attr = {};
struct ib_gid_table *table;
@@ -1441,7 +1441,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,
table = rdma_gid_table(device, port);
mutex_lock(&table->lock);
- for (i = 0; i < gid_tbl_len; ++i) {
+ for (i = 0; i < tprops->gid_tbl_len; ++i) {
if (!device->ops.query_gid)
continue;
ret = device->ops.query_gid(device, port, i, &gid_attr.gid);
@@ -1452,6 +1452,8 @@ static int config_non_roce_gid_cache(struct ib_device *device,
goto err;
}
gid_attr.index = i;
+ tprops->subnet_prefix =
+ be64_to_cpu(gid_attr.gid.global.subnet_prefix);
add_modify_gid(table, &gid_attr);
}
err:
@@ -1484,7 +1486,7 @@ ib_cache_update(struct ib_device *device, u32 port, bool update_gids,
if (!rdma_protocol_roce(device, port) && update_gids) {
ret = config_non_roce_gid_cache(device, port,
- tprops->gid_tbl_len);
+ tprops);
if (ret)
goto err;
}


Hi Jason,

Thanks for the response!

If the above change is made, the following scenario could arise:
During a cache update event, another application/module could call ib_query_port() and read subnet_prefix while the cache is still being updated, and so end up with a stale value of subnet_prefix.
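
For illustration only, the ordering described above can be modelled in a few lines of userspace C. This is a minimal sketch, not kernel code: struct port_cache, read_cached_prefix() and publish_prefix() are hypothetical names, and locking and real concurrency are deliberately left out so that only the order of events is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a per-port cache; not the kernel's struct. */
struct port_cache {
	uint64_t subnet_prefix; /* host byte order */
};

/* Hypothetical reader: returns whatever the cache holds right now. */
static uint64_t read_cached_prefix(const struct port_cache *cache)
{
	assert(cache != NULL);
	return cache->subnet_prefix;
}

/* Hypothetical updater step: publishes a freshly queried prefix. */
static void publish_prefix(struct port_cache *cache, uint64_t fresh)
{
	assert(cache != NULL);
	cache->subnet_prefix = fresh;
}
```

A reader that calls read_cached_prefix() after the update event has fired but before publish_prefix() has run gets the old prefix; a reader that calls it afterwards gets the new one. Whether that window matters in practice depends on how often the prefix changes, which is what the questions below are about.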

I have a few questions:
- How likely is it that an up and running InfiniBand fabric would change the subnet_prefix?
- Is it possible that different GIDs in the gid_table will have different values of subnet_prefix?

And I would much prefer things be re-organized so the cache can be
valid sooner to adding this variable. What is the earlier call that is
motivating this?

During device load, while the cache is yet to be updated, ib_query_port()
should have a mechanism to identify whether the cache entry is valid or
invalid (uninitialized). We added this variable just to ensure the validity
of the cache.

Unless there is an actual user of ib_query_port() before
config_non_roce_gid_cache() that I can't see, don't bother, returning
0 is fine.

Jason


Hm! That makes sense. With the above change we wouldn't need to call device->ops.query_gid() from __ib_query_port() and can always read subnet_prefix using ib_get_cached_subnet_prefix(), provided that reading a stale value during a cache update event is not an issue.
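
As a rough illustration of that flow, here is a userspace sketch under stated assumptions; it is not the kernel implementation. The types and the model_* and load_be64() helpers are hypothetical. The only things taken from the diff above are the idea of recording the prefix while the GID table is being walked, and the big-endian to host conversion of the first 8 bytes of the GID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit GID: bytes 0-7 hold the
 * subnet prefix in big-endian order, bytes 8-15 the interface ID. */
struct model_gid {
	uint8_t raw[16];
};

/* Hypothetical per-port cache holding the prefix in host byte order. */
struct model_port_cache {
	uint64_t subnet_prefix;
};

/* Portable big-endian 64-bit load (what be64_to_cpu achieves in the diff). */
static uint64_t load_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Walk the table once and record the prefix as each entry is processed.
 * As in the quoted diff, every iteration overwrites the stored value,
 * so the last entry wins. */
static void model_fill_cache(struct model_port_cache *cache,
			     const struct model_gid *table, size_t len)
{
	assert(cache != NULL && (table != NULL || len == 0));
	for (size_t i = 0; i < len; i++)
		cache->subnet_prefix = load_be64(table[i].raw);
}

/* Readers consult the cache only; no further hardware query is modelled. */
static uint64_t model_get_cached_prefix(const struct model_port_cache *cache)
{
	assert(cache != NULL);
	return cache->subnet_prefix;
}
```

With a table whose entries all start with the bytes fe 80 00 00 00 00 00 00, model_get_cached_prefix() returns 0xfe80000000000000 after model_fill_cache() has run. If entries could carry different prefixes, the last-entry-wins behaviour of this sketch would decide which one readers see.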

Thanks,
Anand