Re: [PATCH v2 00/21] Refine memblock API

From: Lucas Stach
Date: Fri Oct 04 2019 - 09:23:26 EST


On Friday, 04.10.2019 at 10:27 +0100, Russell King - ARM
Linux admin wrote:
> On Thu, Oct 03, 2019 at 02:30:10PM +0300, Mike Rapoport wrote:
> > On Thu, Oct 03, 2019 at 09:49:14AM +0100, Russell King - ARM Linux admin wrote:
> > > On Thu, Oct 03, 2019 at 08:34:52AM +0300, Mike Rapoport wrote:
> > > > (trimmed the CC)
> > > >
> > > > On Wed, Oct 02, 2019 at 06:14:11AM -0500, Adam Ford wrote:
> > > > > > On Wed, Oct 2, 2019 at 2:36 AM Mike Rapoport <rppt@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > Before the patch:
> > > > >
> > > > > # cat /sys/kernel/debug/memblock/memory
> > > > > 0: 0x10000000..0x8fffffff
> > > > > # cat /sys/kernel/debug/memblock/reserved
> > > > > 0: 0x10004000..0x10007fff
> > > > > 34: 0x2fffff88..0x3fffffff
> > > > >
> > > > >
> > > > > After the patch:
> > > > > # cat /sys/kernel/debug/memblock/memory
> > > > > 0: 0x10000000..0x8fffffff
> > > > > # cat /sys/kernel/debug/memblock/reserved
> > > > > 0: 0x10004000..0x10007fff
> > > > > 36: 0x80000000..0x8fffffff
> > > >
> > > > I'm still not convinced that the memblock refactoring didn't
> > > > uncover an issue in the etnaviv driver.
> > > >
> > > > Why does moving the CMA area from 0x80000000 to 0x30000000 make it
> > > > fail?
> > >
> > > I think you have that the wrong way round.
> >
> > I'm relying on Adam's reports of working and non-working versions.
> > According to those, etnaviv works when the CMA area is at 0x80000000
> > and does not work when it is at 0x30000000.
> >
> > He also sent logs a few days ago [1], and they confirm that.
> >
> > [1]
> > https://lore.kernel.org/linux-mm/CAHCN7xJEvS2Si=M+BYtz+kY0M4NxmqDjiX9Nwq6_3GGBh3yg=w@xxxxxxxxxxxxxx/
>
> Sorry, yes, you're right. Still, I reported this same regression
> a while back, and it has never gone away.
>
> > > > BTW, the code that complained about "command buffer outside
> > > > valid memory
> > > > window" has been removed by the commit 17e4660ae3d7
> > > > ("drm/etnaviv:
> > > > implement per-process address spaces on MMUv2").
> > > >
> > > > Could be that recent changes to MMU management of etnaviv
> > > > resolve the
> > > > issue?
> > >
> > > The iMX6 does not have MMUv2 hardware, it has MMUv1. MMUv1
> > > hardware requires command buffers within the first 2GiB of
> > > physical RAM.
> >
> > I've mentioned that patch because it removed the check for cmdbuf
> > address
> > for MMUv1:
> >
> > @@ -785,15 +768,7 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
> > PAGE_SIZE);
> > if (ret) {
> > dev_err(gpu->dev, "could not create command buffer\n");
> > - goto unmap_suballoc;
> > - }
> > -
> > - if (!(gpu->identity.minor_features1 & chipMinorFeatures1_MMU_VERSION) &&
> > - etnaviv_cmdbuf_get_va(&gpu->buffer, &gpu->cmdbuf_mapping) > 0x80000000) {
> > - ret = -EINVAL;
> > - dev_err(gpu->dev,
> > - "command buffer outside valid memory window\n");
> > - goto free_buffer;
> > + goto fail;
> > }
> >
> > /* Setup event management */
> >
> >
> > I really don't know how etnaviv works, so I hoped that people who
> > understand it would help.
>
> From what I can see, removing that check is a completely insane thing
> to do, and I note that these changes are _not_ described in the
> commit
> message. The problem was known about _before_ (June 22) the patch
> was
> created (July 5).
>
> Lucas, please can you explain why removing the above check, which is
> well known to correctly trigger on various platforms to prevent
> incorrect GPU behaviour, is safe?

It isn't. It's a pretty big oversight in this commit to remove this
check. It can't be done at the same spot in the code anymore, as we
no longer have a mapping context at that point, but it should have
been moved into etnaviv_iommu_context_init(). I'll send a patch to fix
this up.
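
For reference, the logic of the removed check can be sketched roughly as
below. This is a standalone user-space illustration only, not the driver
code and not the eventual fix: check_cmdbuf_window() and V1_CMDBUF_LIMIT
are made-up names, the has_mmu_v2 flag stands in for the
chipMinorFeatures1_MMU_VERSION feature bit, and the address would come
from etnaviv_cmdbuf_get_va() in the real driver.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Limit taken from the quoted diff; the macro name is hypothetical. */
#define V1_CMDBUF_LIMIT 0x80000000u

/*
 * Hypothetical helper mirroring the removed condition: on hardware
 * without MMUv2, reject a command buffer address above the limit.
 * Returns 0 if the address is acceptable, -EINVAL otherwise.
 */
static int check_cmdbuf_window(bool has_mmu_v2, uint32_t cmdbuf_va)
{
	if (!has_mmu_v2 && cmdbuf_va > V1_CMDBUF_LIMIT)
		return -EINVAL;

	return 0;
}
```

With this sketch, an MMUv1 device passes for an address such as
0x10000000 and gets -EINVAL for anything above 0x80000000, while an
MMUv2 device is never rejected by this test.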

Regards,
Lucas