Re: Overlapping ioremap() calls, set_memory_*() semantics

From: Toshi Kani
Date: Tue Mar 08 2016 - 18:37:22 EST

On Tue, 2016-03-08 at 13:16 +0100, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@xxxxxxx> wrote:
> > > So where is the problem? The memtype implementation and hence most
> > > ioremap() users are supposed to be safe. set_memory_*() APIs are
> > > supposed
> > > to be safe as well, as they too go via the memtype API.
> >
> > Let me try to summarize...
> >
> > The original issue Luis brought up was that drivers written to work
> > with MTRR may create a single ioremap range covering multiple cache
> > attributes since MTRR can overwrite cache attribute of a certain range.
> > ÂConverting such drivers with PAT-based ioremap interfaces, i.e.
> > ioremap_wc() and ioremap_nocache(), requires a separate ioremap map for
> > each cache attribute, which can be challenging as it may result in
> > overlapping ioremap ranges (in his term) with different cache
> > attributes.
> >
> > So, Luis asked about 'sematics of overlapping ioremap()' calls. ÂHence,
> > I responded that aliasing mapping itself is supported, but alias with
> > different cache attribute is not. ÂWe have checks in place to detect
> > such condition. ÂOverlapping ioremap calls with a different cache
> > attribute either fails or gets redirected to the existing cache
> > attribute on x86.
> Ok, fair enough!
> So to go back to the original suggestion from Luis, I've quoted it, but
> with a s/overlapping/aliased substitution:
> > I had suggested long ago then that one possible resolution was for us
> > to add an API that *enables* aliased ioremap() calls, and only use it
> > on select locations in the kernel. This means we only have to convert a
> > few users to that call to white list such semantics, and by default
> > we'd disable aliased calls. To kick things off -- is this strategy
> > agreeable for all other architectures?
> I'd say that since the overwhelming majority of ioremap() calls are not
> aliased, ever, thus making it 'harder' to accidentally alias is probably
> a good idea.

Did you mean 'aliased' or 'aliased with different cache attribute'? ÂThe
former check might be too strict.

> The memtype infrastructure of phyisical memory ranges in that case acts
> as a security measure, to avoid unintended (not just physically
> incompatible) aliasing. I suspect it would be helpful during driver
> development as well.

The memtype infrastructure does not track caller interfaces. ÂSo, it will
check against to any map, i.e. kernel & user map. ÂI assume a kernel map is
created before user map, though.

> What extra API are you thinking about to enable aliasing in the few cases
> where it's justified?

I'll defer this for Luis...

> the other problem listed:
> > As another problem case, set_memor_*() will not fail on MMIO even
> > though set_memor_*() is designed only for RAM.
> So what does this mean exactly? Having WB caching on MMIO area is not
> good, but UC, WC and WB sure is still sensible in some cases, right?

I responded to Luis in other email that:
| Drivers use ioremap family with a right cache type when mapping MMIO
| ranges, ex. ioremap_wc().ÂÂThey do not need to change the type to MMIO.
| RAM is different since it's already mapped with WB at boot-time.
| set_memory_*() allows us to change the type from WB, and put it back to
| WB.

> > [...] If the above strategy on avoiding aliasing is agreeable, could
> > the next step, or an orthogonal step be to error out on set_memory_*()
> > on IO memory?
> Well, do we have drivers that nevertheless change caching attributes on
> MMIO areas?

Not sure. ÂWe will need to check all callers of set_memory_xx() if we
change it to fail on MMIO.

> Basically if ioremap_uc() and ioremap_wc() is allowed on MMIO areas, then
> I see no reason in principle why it should be invalid to change the area
> from UC to WC after it has been ioremap()ed.

The current implementation does not support MMIO.
Â- It does not track cache attribute correctly for MMIO since it uses
Â- It only supports attribute transition of {WB -> NewType -> WB} for RAM.
ÂRAM is tracked differently that WB is treated as "no map". ÂSo, this
transition does not cause a conflict on RAM. ÂThis will causes a conflict
on MMIO when it is tracked correctly. ÂÂ