Re: [RFT PATCH v3 12/27] of/address: Add infrastructure to declare MMIO as non-posted

From: Arnd Bergmann
Date: Fri Mar 05 2021 - 11:44:36 EST


On Fri, Mar 5, 2021 at 5:08 PM Andy Shevchenko
<andy.shevchenko@xxxxxxxxx> wrote:
>
> On Fri, Mar 5, 2021 at 5:55 PM Hector Martin <marcan@xxxxxxxxx> wrote:
> > On 06/03/2021 00.13, Andy Shevchenko wrote:
>
> ...
>
> > >> - return ioremap(res.start, resource_size(&res));
> > >> + if (res.flags & IORESOURCE_MEM_NONPOSTED)
> > >> + return ioremap_np(res.start, resource_size(&res));
> > >> + else
> > >> + return ioremap(res.start, resource_size(&res));
> > >
> > > This doesn't sound right. Why _np is so exceptional? Why don't we have
> > > other flavours (it also rings a bell to my previous comment that the
> > > flag in ioresource is not in the right place)?
> >
> > This is different from other variants, because until now *drivers* have
> > made the choice of what ioremap mode to use based on device requirements
> > (which means ioremap() 99% of the time, and then framebuffers and other
> > memory-ish things such use something else). Now we have a *SoC fabric*
> > that is calling the shots on what ioremap mode we have to use - and
> > *every* non-PCIe driver needs to use ioremap_np() on these SoCs, or they
> > break. So it seems a lot cleaner to make the choice for drivers here to
> > upgrade ioremap() to ioremap_np() for SoCs that need it.
>
> Yes, that is a good idea. Once we discussed x86 and _uc cases and
> actually on x86 it makes a lot of sense to have ioremap() ==
> ioremap_uc(). Can't be this a similar case here?

The difference is that ioremap() should be ioremap_uc() on /all/
architectures, it's just that x86 and ia64 for a long time were the
exception and defined ioremap() as 'do whatever the mtrr says'.

> Arnd, what do you think of actually providing an ioremap() as some
> kind of "best for the architecture the code is running on"?

Linus has been pretty clear about wanting all the default functions
to have similar behavior across architectures and be at least as
strict as x86. In case of ioremap() that usually means that writes
are posted, because they are that way on PCI buses on x86.

There are definitely some advantages of making all writes non-posted
by default on Arm because that would be a simpler model, but there
are some important downsides:

- non-posted writes can be much slower than posted ones, depending
on the specific hardware

- it would change the behavior of all Arm platforms, with no easy
way to validate it

- setting ioremap() on PCI buses non-posted only makes them
only slower but not more reliable, because the non-posted flag
on the bus is discarded by the PCI host bridge.

> Otherwise if the same driver happens to be needed on different
> architectures, oops, ifdeffery or simple conditionals over the code is
> really not the best way to solve it.

The behavior of devm_platform_ioremap_resource() is now to
do the right thing automatically, and I think that is good enough. For
all I can tell, we can use that in all drivers without conditional
compilation.

> > If we don't do something like this here or in otherwise common code,
> > we'd have to have an open-coded "if apple then ioremap_np, else ioremap"
> > in every driver that runs on-die devices on these SoCs, even ones that
> > are otherwise standard and need few or no Apple-specific quirks.
>
> Exactly! But what about architectures where _uc is that one? So, why
> does your patch only take part of _np case?
> (Hint we have x86 Device Tree based platforms)

Usually, the driver knows the requirements, and they are independent
of the platform. If a driver wants _wc or _wt mappings, it will want that
on all machines, and should be able to deal with platforms that implement
that through a stricter mapping.

The two drivers that need to override ioremap() to ioremap_uc()
in order to override the mtrr already have that logic. You are right
that these are a bit like the _np() case in that the device needs
something stricter than the default mapping, but the difference is
that for x86 ioremap_uc() this is needed to work around broken
firmware, while for the apple ioremap_np() case we trust the firmware
to tell us what to do.

> Yep, and why not to make ioremap() == ioremap_nc() on architecture
> that requires it?
> Can it be detected at run time?

I think doing this requires auditing a lot of legacy driver code,
especially drivers/video/fbdev. Once all drivers that need write-combining
mappings explicitly ask for them, the default ioremap can be changed
over to act like ioremap_nc() on the remaining two architectures that
don't do that yet.

There has been a lot of work toward that goal, but the problem
is knowing exactly which drivers need it.

Arnd