Re: Creating 16 MB super-sections for MMIO

From: Mason
Date: Wed Dec 03 2014 - 12:48:06 EST

On 03/12/2014 18:06, Arnd Bergmann wrote:

> Mason wrote:
>
>> As far as I could tell, Linux does not create a super-section in the
>> case outlined above. Perhaps I misread the source code?

> I believe you are right, and I also agree that in theory implementing
> what you say (both 64k and 16M mappings) can only help, but it's not
> obvious if this makes a measurable difference in the end.

It will be an interesting thought experiment to come up with
a relevant benchmark. TODO.
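
For reference, the mapping sizes under discussion can be sketched as follows. This is only an illustration of the alignment rule, not the kernel's code: the helper names `largest_granule` and `entries_needed` are hypothetical, and the sketch assumes the ARMv7 short-descriptor granules (4 KB small page, 64 KB large page, 1 MB section, 16 MB supersection) and a region mapped with a single granule throughout.

```c
#include <stdint.h>

#define SZ_4K   0x00001000u   /* small page */
#define SZ_64K  0x00010000u   /* large page */
#define SZ_1M   0x00100000u   /* section */
#define SZ_16M  0x01000000u   /* supersection */

/*
 * Hypothetical helper: return the largest granule that can map the whole
 * region, i.e. the largest one for which the virtual address, the physical
 * address and the size are all aligned. Returns 0 if even 4 KB does not fit.
 */
static uint32_t largest_granule(uint32_t virt, uint32_t phys, uint32_t size)
{
    static const uint32_t granules[] = { SZ_16M, SZ_1M, SZ_64K, SZ_4K };

    for (unsigned i = 0; i < sizeof(granules) / sizeof(granules[0]); i++) {
        uint32_t g = granules[i];

        if (size >= g && ((virt | phys | size) & (g - 1)) == 0)
            return g;
    }
    return 0;
}

/* Number of distinct TLB entries needed to cover `size` bytes. */
static uint32_t entries_needed(uint32_t size, uint32_t granule)
{
    return size / granule;
}
```

Under these assumptions, a 16 MB MMIO window that is 16 MB-aligned on both sides needs one entry as a supersection, 16 as sections, 256 as large pages, and 4096 as small pages.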

> MMIO register accesses are usually slow for other reasons, and
> they tend to be rare,

Reading e.g. the system tick counter on this SoC takes ~65 ns
(so ~65 cycles from the CPU's PoV) which is roughly twice as
fast as accessing uncached RAM.

I don't think we can say that MMIO register accesses are slow
when they are faster than RAM, right?

> so it's possible that you won't be able
> to ever tell a difference because the MMIO TLB often gets evicted
> by user mappings between accesses to different 1MB sections,
> and the timing difference between a TLB-hot and cold MMIO access
> might not be that great (depending on the latency of a particular

I don't know if other SoCs are built differently, but on this one,
most drivers are hammering the same 16MB memory region where the
MMIO registers live. I don't think the entry would ever get evicted
if there's some kind of LRU-policy in action.

[Seems it might be worthwhile to investigate TLB entry lockdown
(on Cortex A9) after all.]
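
The eviction argument above can be illustrated with a toy model. This is a sketch under loud assumptions, not a model of the Cortex-A9: it assumes a fully associative TLB with a hypothetical 64 entries and strict LRU replacement, and all names (`struct tlb`, `tlb_access`, `mmio_misses`) are made up for the example. It counts how often a single MMIO entry misses when a given number of distinct user pages are touched between consecutive MMIO accesses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 64  /* hypothetical size, not taken from any datasheet */

struct tlb {
    uint32_t tag[TLB_ENTRIES];        /* address divided by the granule size */
    uint64_t last_used[TLB_ENTRIES];  /* time of last hit or fill; 0 = empty */
    uint64_t now;
};

/* Return true on a hit; on a miss, fill an empty or least recently used slot. */
static bool tlb_access(struct tlb *t, uint32_t tag)
{
    size_t victim = 0;

    t->now++;
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (t->last_used[i] != 0 && t->tag[i] == tag) {
            t->last_used[i] = t->now;
            return true;
        }
        if (t->last_used[i] < t->last_used[victim])
            victim = i;
    }
    t->tag[victim] = tag;
    t->last_used[victim] = t->now;
    return false;
}

/*
 * Count misses on one MMIO entry over `rounds` accesses, with `gap`
 * never-before-seen user pages touched after each MMIO access.
 */
static unsigned mmio_misses(unsigned gap, unsigned rounds)
{
    struct tlb t = {0};
    const uint32_t mmio_tag = 0xF00u;  /* hypothetical MMIO mapping */
    uint32_t user_tag = 0x10000u;      /* user pages, distinct from mmio_tag */
    unsigned misses = 0;

    for (unsigned r = 0; r < rounds; r++) {
        if (!tlb_access(&t, mmio_tag))
            misses++;
        for (unsigned g = 0; g < gap; g++)
            tlb_access(&t, user_tag++);
    }
    return misses;
}
```

In this model the MMIO entry misses only once (the cold fill) as long as fewer than `TLB_ENTRIES` other distinct entries are touched between two MMIO accesses, and misses every time once that threshold is reached. Whether real hardware behaves this way depends on its actual associativity and replacement policy.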

> I don't think there would be any objections to doing superpage
> or supersection mappings for early page tables if you can show
> any benefit whatsoever, but it may be hard to come up with a
> scenario where it's actually measurable.

I'll have to think about it.
