Re: [PATCH 2/2] PCI: mediatek: Add controller support for MT7629

From: Jianjun Wang
Date: Fri Jun 28 2019 - 02:38:26 EST


On Tue, 2019-02-19 at 23:03 +0800, Lorenzo Pieralisi wrote:
> On Tue, Feb 19, 2019 at 03:01:39PM +0800, Jianjun Wang wrote:
> > On Wed, 2019-01-23 at 15:40 +0000, Lorenzo Pieralisi wrote:
> > > On Mon, Dec 24, 2018 at 07:40:28PM +0800, Jianjun Wang wrote:
> > > > On Thu, 2018-12-20 at 12:20 -0600, Bjorn Helgaas wrote:
> > > > > On Tue, Dec 18, 2018 at 05:19:24PM +0800, Jianjun Wang wrote:
> > > > > > On Mon, 2018-12-17 at 15:46 +0000, Lorenzo Pieralisi wrote:
> > > > > > > On Mon, Dec 17, 2018 at 08:32:47AM -0600, Bjorn Helgaas wrote:
> > > > > > > > On Mon, Dec 17, 2018 at 04:19:39PM +0800, Jianjun Wang wrote:
> > > > > > > > > On Thu, 2018-12-13 at 08:55 -0600, Bjorn Helgaas wrote:
> > > > > > > > > > On Thu, Dec 06, 2018 at 09:09:13AM +0800, Jianjun Wang wrote:
> > > > > > > > > > > The read value of BAR0 is 0xffff_ffff, its size will be
> > > > > > > > > > > calculated as 4GB on arm64 but as a bogus alignment value
> > > > > > > > > > > on arm32, so the PCIe device and the devices behind this
> > > > > > > > > > > bridge will not be enabled. Fix its BAR0 resource size to
> > > > > > > > > > > guarantee the PCIe devices will be enabled correctly.
> > > > > > > > > >
> > > > > > > > > > So this is a hardware erratum? Per spec, a memory BAR has
> > > > > > > > > > bit 0 hardwired to 0, and an IO BAR has bit 1 hardwired to
> > > > > > > > > > 0.
> > > > > > > > >
> > > > > > > > > > Yes, it only works properly on 64-bit platforms.
> > > > > > > >
> > > > > > > > I don't understand. BARs are supposed to work the same
> > > > > > > > regardless of whether it's a 32- or 64-bit platform. If this is
> > > > > > > > a workaround for a hardware defect, please just say that
> > > > > > > > explicitly.
> > > > > > >
> > > > > > > I do not understand this either. First thing to do is to describe
> > > > > > > the problem properly so that we can actually find a solution to
> > > > > > > it.
> > > > > >
> > > > > > This BAR0 is a 64-bit memory BAR. The HW default value for this BAR
> > > > > > is 0xffff_ffff_0000_0000, and it cannot be changed except by a
> > > > > > config write operation.
> > > > >
> > > > > If you literally get 0xffff_ffff_0000_0000 when reading the BAR, that
> > > > > is out of spec because the low-order 4 bits of a 64-bit memory BAR
> > > > > cannot all be zero.
> > > > >
> > > > > A 64-bit BAR consumes two DWORDS in config space. For a 64-bit BAR0,
> > > > > the DWORD at 0x10 contains the low-order bits, and the DWORD at 0x14
> > > > > contains the upper 32 bits. Bits 0-3 of the low-order DWORD (the
> > > > > one at 0x10) are read-only, and in this case should contain the value
> > > > > 0b1100 (0xc). That means the range is prefetchable (bit 3 == 1) and
> > > > > the BAR is 64 bits (bits 2:1 == 10).
> > > >
> > > > Sorry, I confused the HW default value with the read value of the BAR
> > > > size. The hardware default value is 0xffff_ffff_0000_000c; it's a
> > > > 64-bit BAR with a prefetchable range.
> > > >
> > > > When we start decoding the BAR, the read value of BAR0 at 0x10 is
> > > > 0x0c and the value at 0x14 is 0xffff_ffff, so the read value of the
> > > > BAR size is 0xffff_ffff_0000_0000, which is decoded to 0xffff_ffff
> > > > and set as the end value of the BAR0 resource in the pci_dev.
> > > > >
> > > > > > The calculated BAR size will be 0 on a 32-bit platform, since
> > > > > > phys_addr_t is a 32-bit value there.
> > > > >
> > > > > Either (1) this is a hardware defect that feeds incorrect data to the
> > > > > BAR size calculation, or (2) there's a problem in the BAR size
> > > > > calculation code. We need to figure out which one and work around or
> > > > > fix it correctly.
> > > >
> > > > The BAR size calculation in the code (res->end - res->start + 1) is
> > > > fine. I think it's a hardware defect because we can neither change
> > > > the hardware default value nor simply disable the BAR, which we
> > > > don't use.
> > >
> > > Apologies for the delay in getting back to this.
> > >
> > > This looks like a kernel defect, not a HW defect.
> > >
> > > I need some time to make up my mind on what the right fix for this
> > > is, but it is most certainly not this patch.
> > >
> > > Lorenzo
> >
> > Hi Lorenzo,
> >
> > Is there any better idea about this patch?
>
> Hi,
>
> I did not have time to investigate the issue in core code that triggers
> this defect, but this patch is not the solution to the problem; it is a
> plaster that papers over it, so I won't merge it.
>
> I would appreciate some help. If you could have a look at the core code
> that triggers the failure, we can analyze what should be done to make it
> work. I do not think it is a defect in your IP.
>
> Lorenzo

Hi Lorenzo,

This BAR size issue has been fixed by commit
01b37f851ca150554496fd6e79c6d9a67992a2c0
("PCI: Make pci_size() return real BAR size").
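
For reference, the arithmetic discussed in this thread can be sketched in
plain C. This is only an illustration under stated assumptions, not the
kernel's pci_size(): bar_size_from_sizing() and span32() are hypothetical
names, and uint32_t stands in for a 32-bit phys_addr_t.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's pci_size()): given the value read
 * back from a 64-bit memory BAR during sizing, return the BAR size in bytes.
 */
static uint64_t bar_size_from_sizing(uint64_t sizing)
{
	uint64_t bits = sizing & ~0xfULL;	/* drop the four read-only flag bits */

	return bits & ~(bits - 1);		/* lowest remaining set bit is the size */
}

/* Hypothetical helper: (end - start + 1) evaluated in a 32-bit type. */
static uint32_t span32(uint32_t start, uint32_t end)
{
	return end - start + 1;			/* wraps modulo 2^32 */
}
```

With the values quoted above (0x0c at 0x10 and 0xffff_ffff at 0x14),
bar_size_from_sizing(0xffffffff0000000cULL) gives 0x100000000 (4GB), while
span32(0, 0xffffffff) wraps around to 0, which matches the 64-bit and
32-bit results described earlier in the thread.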

So the fixup method is no longer needed; I will remove it in the next
version.

Thanks.
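
P.S. The low-DWORD layout Bjorn described earlier in the thread (bit 0 for
the space type, bits 2:1 for the memory type, bit 3 for prefetchable) can be
decoded roughly as below. This is a minimal sketch; struct bar_flags and
decode_bar_low() are hypothetical names used for illustration only, not
kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the read-only flag bits of a BAR. */
struct bar_flags {
	bool is_io;		/* bit 0 == 1: I/O BAR, 0: memory BAR */
	bool is_64bit;		/* memory BAR with bits 2:1 == 10 */
	bool prefetchable;	/* memory BAR with bit 3 == 1 */
};

/* Decode the flag bits of the DWORD at the BAR's first offset (0x10 for BAR0). */
static struct bar_flags decode_bar_low(uint32_t low)
{
	struct bar_flags f;

	f.is_io = (low & 0x1) != 0;
	f.is_64bit = !f.is_io && ((low >> 1) & 0x3) == 0x2;
	f.prefetchable = !f.is_io && (low & 0x8) != 0;
	return f;
}
```

For the 0x0c value read at 0x10 here, this reports a prefetchable 64-bit
memory BAR, as Bjorn expected.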