RE: [PATCH] PCI: layerscape: Add the SRIOV support in host side
From: Xiaowei Bao
Date: Wed Dec 04 2019 - 21:56:57 EST
> -----Original Message-----
> From: Robin Murphy <robin.murphy@xxxxxxx>
> Sent: 2019å12æ4æ 19:59
> To: Xiaowei Bao <xiaowei.bao@xxxxxxx>; Marc Zyngier <maz@xxxxxxxxxx>
> Cc: Roy Zang <roy.zang@xxxxxxx>; lorenzo.pieralisi@xxxxxxx;
> devicetree@xxxxxxxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; Z.q. Hou
> <zhiqiang.hou@xxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; M.h. Lian
> <minghuan.lian@xxxxxxx>; robh+dt@xxxxxxxxxx;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; bhelgaas@xxxxxxxxxx;
> andrew.murray@xxxxxxx; frowand.list@xxxxxxxxx; Mingkai Hu
> <mingkai.hu@xxxxxxx>
> Subject: Re: [PATCH] PCI: layerscape: Add the SRIOV support in host side
>
> On 2019-12-04 4:34 am, Xiaowei Bao wrote:
> >
> >
> >> -----Original Message-----
> >> From: Robin Murphy <robin.murphy@xxxxxxx>
> >> Sent: 2019å12æ3æ 23:20
> >> To: Marc Zyngier <maz@xxxxxxxxxx>; Xiaowei Bao
> <xiaowei.bao@xxxxxxx>
> >> Cc: Roy Zang <roy.zang@xxxxxxx>; lorenzo.pieralisi@xxxxxxx;
> >> devicetree@xxxxxxxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; Z.q. Hou
> >> <zhiqiang.hou@xxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; M.h. Lian
> >> <minghuan.lian@xxxxxxx>; robh+dt@xxxxxxxxxx;
> >> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; bhelgaas@xxxxxxxxxx;
> >> andrew.murray@xxxxxxx; frowand.list@xxxxxxxxx; Mingkai Hu
> >> <mingkai.hu@xxxxxxx>
> >> Subject: Re: [PATCH] PCI: layerscape: Add the SRIOV support in host
> >> side
> >>
> >> On 03/12/2019 11:51 am, Marc Zyngier wrote:
> >>> On 2019-12-03 01:42, Xiaowei Bao wrote:
> >>>>> -----Original Message-----
> >>>>> From: Marc Zyngier <maz@xxxxxxxxxxxxxxx>
> >>>>> Sent: 2019å12æ2æ 20:48
> >>>>> To: Xiaowei Bao <xiaowei.bao@xxxxxxx>
> >>>>> Cc: robh+dt@xxxxxxxxxx; frowand.list@xxxxxxxxx; M.h. Lian
> >>>>> <minghuan.lian@xxxxxxx>; Mingkai Hu <mingkai.hu@xxxxxxx>; Roy
> >> Zang
> >>>>> <roy.zang@xxxxxxx>; lorenzo.pieralisi@xxxxxxx;
> >>>>> andrew.murray@xxxxxxx; bhelgaas@xxxxxxxxxx;
> >>>>> devicetree@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >>>>> linux-pci@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> >>>>> Z.q. Hou <zhiqiang.hou@xxxxxxx>
> >>>>> Subject: Re: [PATCH] PCI: layerscape: Add the SRIOV support in
> >>>>> host side
> >>>>>
> >>>>> On 2019-12-02 10:45, Xiaowei Bao wrote:
> >>>>>> GIC get the map relations of devid and stream id from the msi-map
> >>>>>> property of DTS, our platform add this property in u-boot base on
> >>>>>> the PCIe device in the bus, but if enable the vf device in
> >>>>>> kernel, the vf device msi-map will not set, so the vf device
> >>>>>> can't work, this patch purpose is that manage the stream id and
> >>>>>> device id map relations dynamically in kernel, and make the new
> >>>>>> PCIe device work in
> >> kernel.
> >>>>>>
> >>>>>> Signed-off-by: Xiaowei Bao <xiaowei.bao@xxxxxxx>
> >>>>>> ---
> >>>>>> Â drivers/of/irq.cÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |Â 9
> +++
> >>>>>> Â drivers/pci/controller/dwc/pci-layerscape.c | 94
> >>>>>> +++++++++++++++++++++++++++++
> >>>>>> Â drivers/pci/probe.cÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |Â 6
> ++
> >>>>>> Â drivers/pci/remove.cÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |Â 6
> ++
> >>>>>> Â 4 files changed, 115 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/of/irq.c b/drivers/of/irq.c index
> >>>>>> a296eaf..791e609 100644
> >>>>>> --- a/drivers/of/irq.c
> >>>>>> +++ b/drivers/of/irq.c
> >>>>>> @@ -576,6 +576,11 @@ void __init of_irq_init(const struct
> >>>>>> of_device_id
> >>>>>> *matches)
> >>>>>> ÂÂÂÂÂ }
> >>>>>> Â }
> >>>>>>
> >>>>>> +u32 __weak ls_pcie_streamid_fix(struct device *dev, u32 rid) {
> >>>>>> +ÂÂÂ return rid;
> >>>>>> +}
> >>>>>> +
> >>>>>> Â static u32 __of_msi_map_rid(struct device *dev, struct
> >>>>>> device_node **np,
> >>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ u32 rid_in)
> >>>>>> Â {
> >>>>>> @@ -590,6 +595,10 @@ static u32 __of_msi_map_rid(struct device
> >>>>>> *dev, struct device_node **np,
> >>>>>> ÂÂÂÂÂÂÂÂÂ if (!of_map_rid(parent_dev->of_node, rid_in,
> >>>>>> "msi-map",
> >>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ "msi-map-mask", np, &rid_out))
> >>>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂ break;
> >>>>>> +
> >>>>>> +ÂÂÂ if (rid_out == rid_in)
> >>>>>> +ÂÂÂÂÂÂÂ rid_out = ls_pcie_streamid_fix(parent_dev, rid_in);
> >>>>>
> >>>>> Over my dead body. Get your firmware to properly program the LUT
> >>>>> so that it presents the ITS with a reasonable topology. There is
> >>>>> absolutely no way this kind of change makes it into the kernel.
> >>>>
> >>>> Sorry for this, I know it is not reasonable, but I have no other
> >>>> way, as I know, ARM get the mapping of stream ID to request ID from
> >>>> the msi-map property of DTS, if add a new device which need the
> >>>> stream ID and try to get it from the msi-map of DTS, it will failed
> >>>> and not work, yes? So could you give me a better advice to fix this
> >>>> issue, I would really appreciate any comments or suggestions, thanks a
> lot.
> >>>
> >>> Why can't firmware expose an msi-map/msi-map-mask that has a large
> >>> enough range to ensure mapping of VFs? What are the limitations of
> >>> the LUT that would prevent this from being configured before the
> >>> kernel boots?
> >
> > Thanks for your comments, yes, this is the root cause, we only have 16
> > stream IDs for PCIe domain, this is the hardware limitation, if there
> > have enough stream IDs, we can expose an msi-map/msi-map-mask for all
> > PCIe devices in system, unfortunately, the stream IDs is not enough, I
> > think other ARM vendor have same issue that they don't have enough
> stream IDs.
>
> Some SMMUv2 configurations may have an uncomfortably limited number of
> context banks, but they almost always have more than enough stream ID bits.
> Your ICID allocation policy is most certainly an issue unique to Layerscape
> platforms.
I am not sure which version of SMMU used in our Layerscape platform, I think the
SMMU version is 1 in our early board. So there is not enough stream ID for PCIe or
other peripherals.
>
> Furthermore, that argument doesn't make a whole lot of sense anyway - if
> you don't have enough stream IDs for all possible VFs at boot time, then you
> still won't have enough later, so pretending to support SR-IOV, only for things
> to start subtly going wrong if the user has too many endpoints active at once,
> isn't going to cut it.
Thanks Robin, yes, agree your point, I think I have to drop this patch, the purpose
of submitting this patch to opensource is that to know whether have a best way
to fix this issue, but due to the limitation of hardware, it looks like have no better
way, in a word, thanks a lot.
>
> >> Furthermore, note that this attempt isn't doing anything for the SMMU
> >> Stream IDs, so the moment anyone tries to assign those VFs they're
> >> still going to go bang anyway. Any firmware-based fixup for ID
> >> mappings, config space addresses, etc. needs to be SR-IOV-aware and
> >> account for all *possible* BDFs.
> >>
> >> On LS2085 at least, IIRC you can configure a single LUT entry to just
> >> translate the Bus:Device identifier and pass some or all of the
> >> Function bits straight through as the LSBs of the Stream ID, so I
> >> don't believe the relatively limited number of LUT registers should
> >> be too much of an issue. For example, last time I hacked on that I
> apparently had it set up statically like this:
> >>
> >> &pcie3 {
> >> /* Squash 8:5:3 BDF down to 2:2:3 */
> >> msi-map-mask = <0x031f>;
> >> msi-map = <0x000 &its 0x00 0x20>,
> >> <0x100 &its 0x20 0x20>,
> >> <0x200 &its 0x40 0x20>,
> >> <0x300 &its 0x60 0x20>;
> >> };
> >
> > Thanks Robin, this is a effective way, but we only have total 16
> > stream IDs for PCIe domain, and only assign 4 stream IDs for each PCIe
> > controller if the board have 4 PCIe controllers, this is the root
> > cause, I submitted this patch to dynamically manage these stream IDs,
> > so that it looks like each PCIe controller has 16 stream IDs. I can
> > dynamically allocate and release these stream IDs based on the PCIe
> devices in the current system. If use your method, we support up to 4 PCIe
> devices(2 PFs and 2 VFs), it will not achieve our purpose.
>
> Sure, that was just an example to illustrate that you don't need a separate
> msi-map entry (and corresponding LUT entry) for each individual PCI RID -
> that dates from before U-Boot had ICID support, so I had hacks all over
> various kernel drivers to set them arbitrarily when I was playing with the
> SMMU.
>
> Realistically, at this point your options are a) reserve more ICIDs for PCIe and
> allocate them in a way that accounts for the present endpoints'
> SR-IOV capabilities, or b) don't expose SR-IOV functionality at all on the root
> complex if it can't be guaranteed to work properly.
Agree, if enable more than 16 PCIe devices, the stream IDs for PCIe are not enough,
the root cause is the limitation of the hardware, I think I have to drop it, thanks a lot.
>
> Robin.