Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
From: Michael Wang
Date: Fri Apr 24 2015 - 11:07:42 EST
On 04/24/2015 04:44 PM, Liran Liss wrote:
>> From: Michael Wang [mailto:yun.wang@xxxxxxxxxxxxxxxx]
>
>
>> [snip]
[snip]
>>
>
> We don't want to stop code branches that are not abstractions but rather depend
> on the specific technology!
> There is no generic "iWARP CM" - only one.
> There is no generic "ROCE CM" - only one.
> There is no generic "IB CM" - only one.
>
> At the CM high-level (i.e., whether an ib_dev port registers an IB client), you could consider
> an rdma_has_cm() call, but this the only place in the code that this check will be called!
> Hence, no need for a generic check.
There is plenty of places to check IB or IWARP CM in cma, that is 18# and 19#.
And the purpose is to make sure core-layer check the right thing rather than the
'tech' or 'transport', even if there is only one use case, as long as it meaningful, a
helper would make sense IMHO, so later when comes other cases we could use the
helpers rather than the meaningless tech check.
>
> You want to stop abstract code that uses IB core infrastructure.
We want the check on management make more sense :-)
>
>>>
>>>> This new transport is only understand by core-layer currently, for
>>>> user-layer we still reserve the old transport for them, next step is
>>>> to use bitmask instead of transport, at that time we can erase the
>>>> new transport and make the whole stuff used by user-layer only :-)
[snip]
>> We need to classify and integrate the concept into mgmt helper, that would
>> be very helpful for further reform, reform followed by integration sounds not
>> that bad, correct?
>>
>
> The problem is that it is hard to follow the reasoning for the first use consumer
> code with the in-complete helper frame-work.
The problem is, with the current implementation in core layer, expanding
to new tech would be painful, the reform would come into core directly,
which make the logical even more confusing.
This patch set aim to collect all those places where a management check
necessary and integrate them together, this is the reform on core-layer
so later we don't have to change core but replace the internal
implementation of helpers, eg the idea of bitmask.
>
>>>
>>>>>
>>>>> 2)The name rdma_tech_* is lame.
>>>>> rdma_transport_*(), adhering to the above (*) remark, is much better.
>>>>> For example, both IB and ROCE *do* use the same transport.
>>>>
>>>> We have some discussion on that too, use transport means going back...
>>>>
>>>
>>> No.
>>> The existing notion of transport was correct. It was the node type that
>> wasn't.
>>> And in any case the new helpers didn't use it.
>>>
>>> We need the original meaning of transport - see my response to Ira.
>>> I propose replacing rdma_node_get_transport() with the following helpers:
>>> - rdma_get_transport()
>>> - rdma_is_ib_transport()
>>> - rdma_is_iwarp_transport()
>>
>> We can change the name at anytime, tech/transport/protocol/standard, just
>> one patch later can easily change it and start the topic of naming, any of
>> these name will unsatisfied someone AFAIK, I'd like to suggest we consider
>> this as a mark temporarily and focus on the logical issue.
>
> Sure.
>
> The logical issue is:
>
> 1. We need the existing notion of transport, meaning "a bunch of L4+headers + semantics presented to apps".
> 2. We might need an *additional* notion of "rdma_protocol", which designates a complete wire-format: L2-L4+ including.
> This could be later a bitmask, a management helper, whatever.
> Currently, I don't see anything in the existing code that would call such helpers.
My thought is mostly focus on the core-layer, so the name won't satisfied
someone focus on hw or user app, but as long as the reform won't break
anything, I prefer to leave the naming to those who more expert on
that particular part, after this first step :-)
Regards,
Michael Wang
>
>>
>>> - ...
>>>
>>>>>
>>>>> 3) The name cap_* as it is used above is not accurate.
>>>>> You use it to describe technology characteristics rather than
>>>>> extendable
>>>> capabilities.
>>>>> I would suggest having a single convention for all helpers, such as
>>>> rdma_has_*() and rdma_is_*().
>>>>> For example: cap_ib_smi() ==> rdma_has_smi().
>>>>
>>>> That means going back too...
>>>
>>> See response to Ira (https://lkml.org/lkml/2015/4/21/951).
>>>
>>>
>>>>
>>>>>
>>>>> 4) Remove all capabilities that do not introduce any distinction in
>>>>> the
>>>> current code.
>>>>> We can add them as needed later.
>>>>> This means remove patches:
>>>>> - [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() â all
>>>>> IB devices support ipoib
>>>>> - [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() â all
>>>>> IB
>>>> devices support AF_IB.
>>>>>
>>>>> On the other hand:
>>>>> - rdma_has_multicast() makes sense, since iWARP doesnât support it.
>>>>> - cap_ib_sa() might make sense to cut code even further in the CMA,
>>>>> since
>>>> RoCE has a GSI but no SA.
>>>>
>>>> We have discussion on define these helpers previously, again, name is
>>>> not really a problem, I would rather to see such changes in the
>>>> following series after this one working stably :-)
>>>>
>>>
>>> The names are not critical. This comment is about introducing helpers
>>> that are do not introduce any new semantic notion in the current patch-set.
>>>
>>> cap_ipoib(), for example, is brain-dead because only a single
>>> technology (as of now) enables it: Infiniband.
>>
>> This will be dropped in next version :-)
>>
>>>
>>>>>
>>>>> 5) Do no modify phys_state_show() in [PATCH v5 09/27] IB/Verbs:
>>>>> Reform IB-core verbs/uverbs_cmd/sysfs It *is* the link layer!
>>>>
>>>> Actually nothing changed after the modify, the prev purpose it to
>>>> eliminate the link layer helpers.
>>>>
>>>> But now we are not going to remove the helper any more, so let's drop
>>>> this modification in next version :-)
>>>>
>>>
>>> You don't add modifications just to drop them later.
>>> Don't add them in the first place!
>>>
>>> This patch-set will remain forever in the kernel commit log - we want
>>> it to be as self-explaining and coherent as possible.
>>>
>>> Remove this.
>>
>> What i mean is this will be removed in v6...
>>
>>>
>>>>>
>>>>> 6) Remove cap_read_multi_sge
>>>>> It is not device/port feature, but a transport capability.
>>>>> Use rdma_is_iwarp_transport() instead, or introduce a new transport
>>>>> flag in
>>>> 'enum ib_device_cap_flags'.
>>>>>
>>>>> 7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper
>>>> cap_eth_ah().
>>>>> Address handles that refer to Ethernet links always have Ethernet
>>>> addressing.
>>>>>
>>>>> In the CMA code, using rdma_tech_iboe() is just fine. This is how
>>>>> you define
>>>> cap_eth_ah() anyway.
>>>>> Currently, this patch just adds clutter.
>>>>
>>>> There are also some discussion on these helpers, drop them means
>>>> going back..
>>>>
>>>
>>> Back to where? Management helpers are a new concept. Let's get them
>> right.
>>
>> Back to one point during v1~v5.
>>
>>>
>>>> The tech helper is not enough to explain the management purpose, and
>>>> this can be the wrapper for bitmask stuff too.
>>>>
>>>
>>> As I said, I am not sure that we will need any bitmasks.
>>> Also see response to Ira (https://lkml.org/lkml/2015/4/21/951).
>>
>> Better discussed in another thread.
>>
>>>
>>>>>
>>>>> 8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
>>>>> We do need a transport qualifier, as exemplified in comment 5)
>>>>> above, and
>>>> for a complete clean model.
>>>>> This is after renaming the function to rdma_is_ib_transport()...
>>>>
>>>> This means going back again... rdma_is_ib_transport() has been used
>>>> previously.
>>>>
>>>> This helper is just to make the review more easier, we won't need it
>>>> internally, not to mention after bitmask was introduced :-)
>>>>
>>>
>>> The same...
>>>
>>>>>
>>>>>
>>>>> Putting it all together
>>>>> ==================
>>>>>
>>>>> We are left with the following helpers:
>>>>> - rdma_is_ib_transport()
>>>>> - rdma_is_iwarp_transport()
>>>>> - rdma_is_usnic_transport()
>>>>> - rdma_is_iboe()
>>>>> - rdma_has_mad()
>>>>> - rdma_has_smi()
>>>>> - rdma_has_gsi() - complements smi; can be used by the mad code for
>>>>> clarity
>>>>> - rdma_has_sa()
>>>>> - rdma_has_cm()
>>>>> - rdma_has_mcast()
>>>>
>>>> I think we can put the discussion on name and new helpers in future,
>>>> currently let's focus on these basic reform and make them working
>>>> stably ;-)
>>>
>>> It's not just the names, it's their semantics.
>>> Any problems with the names proposed above?
>>
>> These were once used in old version, again, name can't satisfied anyone at
>> this moment and I'd like to discuss this after the logical was right, I really
>> don't want folks to focus on this issue since it won't broken anything and can
>> be easily changed once we have the agreement.
>>
>> Regards,
>> Michael Wang
>>
>
> OK
>
>>>
>>>>
>>>> Regards,
>>>> Michael Wang
>>>>
>>>>>
>>>>>
>>>>>> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
>>>>>>
>>>>>>
>>>>>> Since v4:
>>>>>> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
>>>>>> Roland, Ira and Steve :-) Please remind me if anything missed :-P
>>>>>> * Fix logical issue inside 3#, 14#
>>>>>> * Refine 3#, 4#, 5# with label 'free'
>>>>>> * Rework 10# to stop using port 1 when port already assigned
>>>>>>
>>>>>> There are plenty of lengthy code to check the transport type of IB
>>>>>> device, or the link layer type of it's port, but actually we are
>>>>>> just speculating whether a particular management/feature is
>>>>>> supported by the
>>>> device/port.
>>>>>>
>>>>>> Thus instead of inferring, we should have our own mechanism for IB
>>>>>> management capability/protocol/feature checking, several proposals
>>>> below.
>>>>>>
>>>>>> This patch set will reform the method of getting transport type, we
>>>>>> will now using query_transport() instead of inferring from
>>>>>> transport and link layer respectively, also we defined the new
>>>>>> transport type to make the concept more reasonable.
>>>>>>
>>>>>> Mapping List:
>>>>>> node-type link-layer old-transport new-transport
>>>>>> nes RNIC ETH IWARP
>> IWARP
>>>>>> amso1100 RNIC ETH IWARP IWARP
>>>>>> cxgb3 RNIC ETH IWARP IWARP
>>>>>> cxgb4 RNIC ETH IWARP IWARP
>>>>>> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
>>>>>> ocrdma IB_CA ETH IB IBOE
>>>>>> mlx4 IB_CA IB/ETH IB IB/IBOE
>>>>>> mlx5 IB_CA IB IB IB
>>>>>> ehca IB_CA IB IB IB
>>>>>> ipath IB_CA IB IB IB
>>>>>> mthca IB_CA IB IB IB
>>>>>> qib IB_CA IB IB IB
>>>>>>
>>>>>> For example:
>>>>>> if (transport == IB) && (link-layer == ETH) will now become:
>>>>>> if (query_transport() == IBOE)
>>>>>>
>>>>>> Thus we will be able to get rid of the respective transport and
>>>>>> link-layer checking, and it will help us to add new
>>>>>> protocol/Technology (like OPA) more easier, also with the
>>>>>> introduced management helpers, IB management logical will be more
>>>>>> clear and easier
>>>> for extending.
>>>>>>
>>>>>> Highlights:
>>>>>> The patch set covered a wide range of IB stuff, thus for those who are
>>>>>> familiar with the particular part, your suggestion would be
>>>>>> invaluable ;-)
>>>>>>
>>>>>> Patch 1#~15# included all the logical reform, 16#~25# introduced the
>>>>>> management helpers, 26#~27# do clean up.
>>>>>>
>>>>>> Patches haven't been tested yet, we appreciate if any one who
>>>>>> have
>>>> these
>>>>>> HW willing to provide his Tested-by :-)
>>>>>>
>>>>>> Doug suggested the bitmask mechanism:
>>>>>> https://www.mail-archive.com/linux-
>>>>>> rdma@xxxxxxxxxxxxxxx/msg23765.html
>>>>>> which could be the plan for future reforming, we prefer that to
>>>>>> be
>>>> another
>>>>>> series which focus on semantic and performance.
>>>>>>
>>>>>> This patch-set is somewhat 'bloated' now and it may be a good
>>>>>> timing
>>>> for
>>>>>> staging, I'd like to suggest we focus on improving existed
>>>>>> helpers and
>>>> push
>>>>>> all the further reforms into next series ;-)
>>>>>>
>>>>>> Proposals:
>>>>>> Sean:
>>>>>> https://www.mail-archive.com/linux-
>>>>>> rdma@xxxxxxxxxxxxxxx/msg23339.html
>>>>>> Doug:
>>>>>> https://www.mail-archive.com/linux-
>>>>>> rdma@xxxxxxxxxxxxxxx/msg23418.html
>>>>>> https://www.mail-archive.com/linux-
>>>>>> rdma@xxxxxxxxxxxxxxx/msg23765.html
>>>>>> Jason:
>>>>>> https://www.mail-archive.com/linux-
>>>>>> rdma@xxxxxxxxxxxxxxx/msg23425.html
>>>>>>
>>>>>> Michael Wang (27):
>>>>>> IB/Verbs: Implement new callback query_transport()
>>>>>> IB/Verbs: Implement raw management helpers
>>>>>> IB/Verbs: Reform IB-core mad/agent/user_mad
>>>>>> IB/Verbs: Reform IB-core cm
>>>>>> IB/Verbs: Reform IB-core sa_query
>>>>>> IB/Verbs: Reform IB-core multicast
>>>>>> IB/Verbs: Reform IB-ulp ipoib
>>>>>> IB/Verbs: Reform IB-ulp xprtrdma
>>>>>> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
>>>>>> IB/Verbs: Reform cm related part in IB-core cma/ucm
>>>>>> IB/Verbs: Reform route related part in IB-core cma
>>>>>> IB/Verbs: Reform mcast related part in IB-core cma
>>>>>> IB/Verbs: Reserve legacy transport type in 'dev_addr'
>>>>>> IB/Verbs: Reform cma_acquire_dev()
>>>>>> IB/Verbs: Reform rest part in IB-core cma
>>>>>> IB/Verbs: Use management helper cap_ib_mad()
>>>>>> IB/Verbs: Use management helper cap_ib_smi()
>>>>>> IB/Verbs: Use management helper cap_ib_cm()
>>>>>> IB/Verbs: Use management helper cap_iw_cm()
>>>>>> IB/Verbs: Use management helper cap_ib_sa()
>>>>>> IB/Verbs: Use management helper cap_ib_mcast()
>>>>>> IB/Verbs: Use management helper cap_ipoib()
>>>>>> IB/Verbs: Use management helper cap_read_multi_sge()
>>>>>> IB/Verbs: Use management helper cap_af_ib()
>>>>>> IB/Verbs: Use management helper cap_eth_ah()
>>>>>> IB/Verbs: Clean up rdma_ib_or_iboe()
>>>>>> IB/Verbs: Cleanup rdma_node_get_transport()
>>>>>>
>>>>>> ---
>>>>>> drivers/infiniband/core/agent.c | 4
>>>>>> drivers/infiniband/core/cm.c | 26 +-
>>>>>> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
>>>>>> drivers/infiniband/core/device.c | 1
>>>>>> drivers/infiniband/core/mad.c | 51 ++--
>>>>>> drivers/infiniband/core/multicast.c | 18 -
>>>>>> drivers/infiniband/core/sa_query.c | 41 +--
>>>>>> drivers/infiniband/core/sysfs.c | 8
>>>>>> drivers/infiniband/core/ucm.c | 5
>>>>>> drivers/infiniband/core/ucma.c | 27 --
>>>>>> drivers/infiniband/core/user_mad.c | 32 +-
>>>>>> drivers/infiniband/core/uverbs_cmd.c | 6
>>>>>> drivers/infiniband/core/verbs.c | 33 --
>>>>>> drivers/infiniband/hw/amso1100/c2_provider.c | 7
>>>>>> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
>>>>>> drivers/infiniband/hw/cxgb4/provider.c | 7
>>>>>> drivers/infiniband/hw/ehca/ehca_hca.c | 6
>>>>>> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
>>>>>> drivers/infiniband/hw/ehca/ehca_main.c | 1
>>>>>> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
>>>>>> drivers/infiniband/hw/mlx4/main.c | 10
>>>>>> drivers/infiniband/hw/mlx5/main.c | 7
>>>>>> drivers/infiniband/hw/mthca/mthca_provider.c | 7
>>>>>> drivers/infiniband/hw/nes/nes_verbs.c | 6
>>>>>> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
>>>>>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
>>>>>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
>>>>>> drivers/infiniband/hw/qib/qib_verbs.c | 7
>>>>>> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
>>>>>> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
>>>>>> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
>>>>>> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
>>>>>> include/rdma/ib_verbs.h | 204 +++++++++++++++-
>>>>>> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
>>>>>> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
>>>>>> 35 files changed, 584 insertions(+), 368 deletions(-)
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
>>>>>> in the body of a message to majordomo@xxxxxxxxxxxxxxx More
>>>>>> majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/