Re: [PATCH v4 0/6] Add Auxiliary driver support

From: Ajit Khaparde
Date: Mon Nov 28 2022 - 21:02:01 EST


On Tue, Nov 22, 2022 at 10:59 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> On Tue, Nov 22, 2022 at 07:02:45AM -0800, Ajit Khaparde wrote:
> > On Wed, Nov 16, 2022 at 5:22 AM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> > >
> > ::snip::
> > > > > > All PCI management logic and interfaces need to be inside the eth part
> > > > > > of your driver and only that part should implement SR-IOV config. Once
> > > > > > the user enables SR-IOV, the PCI driver should create auxiliary devices for
> > > > > > each VF. These devices will have RDMA capabilities and that will trigger the
> > > > > > RDMA driver to bind to them.
> > > > > I agree and once the PF creates the auxiliary devices for the VFs, the RoCE
> > > > > VFs indeed get probed and created. But the twist in the bnxt_en/bnxt_re
> > > > > design is that the RoCE driver is responsible for making adjustments to
> > > > > the RoCE resources.
> > >
> > > > You can still do these adjustments by checking the type of function that
> > > > called the RDMA .probe. PCI core exposes some functions to help distinguish
> > > > between PFs and VFs.
> > >
> > > >
> > > > > So once the VFs are created and the bnxt_en driver enables SR-IOV and
> > > > > adjusts the NIC resources for the VFs, and such, it tries to call into
> > > > > the bnxt_re driver for the same purpose.
> > >
> > > If I read code correctly, all these resources are for one PCI function.
> > >
> > > Something like this:
> > >
> > > > bnxt_re_probe()
> > > > {
> > > >         ...
> > > >         if (is_virtfn(p))
> > > >                 bnxt_re_sriov_config(p);
> > > >         ...
> > > > }
> > I understand what you are suggesting.
> > But what I want is a way to do this in the context of the PF
> > preferably before the VFs are probed.
>
> I don't understand the last sentence. You call this sriov_config in the
> bnxt_re driver without any protection from VFs being probed.

Let me elaborate -
When a user sets num_vfs to a non-zero number, the PCI driver hook
sriov_configure calls bnxt_sriov_configure(). Once pci_enable_sriov()
succeeds, bnxt_ulp_sriov_cfg() is called from within the
bnxt_sriov_configure() path. All this happens in bnxt_en.
bnxt_ulp_sriov_cfg() ultimately calls into the bnxt_re driver.
Since bnxt_sriov_configure() is called only for PFs, bnxt_ulp_sriov_cfg()
is called for PFs only.

Once bnxt_ulp_sriov_cfg() calls into bnxt_re via the ulp_ops,
bnxt_re adjusts the QPs, SRQs, CQs, MRs, GIDs and such.
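
To make the ordering concrete, here is a rough user-space sketch of that
flow. It is not the actual bnxt_en/bnxt_re code: every name ending in
_sketch is made up for illustration, and the pci_enable_sriov() step is
only represented by a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the ULP callback table; only the flow matters. */
struct ulp_ops_sketch {
	void (*sriov_config)(void *handle, int num_vfs);
};

/* Hypothetical PF state held by the Ethernet side. */
struct pf_sketch {
	const struct ulp_ops_sketch *ulp_ops;	/* registered by the RoCE side */
	void *ulp_handle;			/* opaque RoCE-side context */
	int active_vfs;
};

/* RoCE side: record the VF count it was asked to size resources for. */
static void re_sriov_config_sketch(void *handle, int num_vfs)
{
	*(int *)handle = num_vfs;
}

/* Ethernet side: runs in PF context only, mirroring the order above. */
static int en_sriov_configure_sketch(struct pf_sketch *pf, int num_vfs)
{
	assert(pf != NULL);

	/* Stand-in for pci_enable_sriov() having succeeded. */
	pf->active_vfs = num_vfs;

	/* Only then notify the RoCE side, if it has registered callbacks. */
	if (pf->ulp_ops && pf->ulp_ops->sriov_config)
		pf->ulp_ops->sriov_config(pf->ulp_handle, num_vfs);

	return num_vfs;
}
```

The point of the sketch is that the RoCE callback receives the VF count the
user actually requested, and that it runs in the PF's context.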

>
> > So we are trying to call the
> > bnxt_re_sriov_config in the context of handling the PF's
> > sriov_configure implementation. Having the ulp_ops is allowing us to
> > avoid resource wastage and assumptions in the bnxt_re driver.
>
> To which resource wastage are you referring?
Essentially the PF driver reserves a set of the above resources for the PF,
and divides the remaining resources among the VFs.
If the calculation is based on sriov_totalvfs instead of sriov_numvfs,
a VF can be provisioned fewer resources than it could otherwise get.
That is because a user may create only a subset of the total VFs
allowed in the PCI SR-IOV capability register.
I was referring to the resource wastage in that deployment scenario.
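
A rough sketch of the arithmetic, with made-up names and an even split
assumed purely for illustration (this is not the actual bnxt_re
calculation):

```c
#include <assert.h>

/* Hypothetical pool of one RoCE resource type (e.g. QPs). */
struct res_pool {
	unsigned int total;		/* resources the device offers */
	unsigned int pf_reserved;	/* portion kept for the PF */
};

/* Divide what is left after the PF reservation evenly among vf_count VFs. */
static unsigned int per_vf_share(const struct res_pool *pool,
				 unsigned int vf_count)
{
	assert(pool->pf_reserved <= pool->total);

	if (vf_count == 0)
		return 0;

	return (pool->total - pool->pf_reserved) / vf_count;
}
```

With illustrative numbers total = 65536 and pf_reserved = 8192, dividing by
a sriov_totalvfs of 64 gives each VF 896, while dividing by a sriov_numvfs
of 4 gives each VF 14336. In the first case most of the pool sits unused
when only 4 VFs are created.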

Thanks
Ajit

>
> It makes no difference whether the same limits are applied in the bnxt_en
> driver when the RDMA bnxt device is created, or in bnxt_re, which will be
> called once the RDMA device is created.
>
> Thanks
>
> >
> > ::snip::
>
>
