Re: [RFC PATCH 0/3] arm64: support FEAT_BBM level 2 and large block mapping when rodata=full
From: Will Deacon
Date: Wed Dec 11 2024 - 17:30:45 EST
Hey,
On Tue, Dec 10, 2024 at 11:33:16AM -0800, Yang Shi wrote:
> On 12/10/24 3:31 AM, Will Deacon wrote:
> > On Mon, Nov 18, 2024 at 10:16:07AM -0800, Yang Shi wrote:
> > > When rodata=full, the kernel linear mapping is mapped by PTE due to Arm's
> > > break-before-make rule.
> > >
> > > This results in a couple of problems:
> > > - performance degradation
> > > - more TLB pressure
> > > - memory waste for kernel page table
> > >
> > > There are some workarounds to mitigate the problems, for example, using
> > > rodata=on, but this compromises security.
> > >
> > > With FEAT_BBM level 2 support, splitting a large block mapping into
> > > smaller ones no longer requires making the page table entry invalid.
> > > This allows the kernel to split large block mappings on the fly.
> > I think you can still get TLB conflict aborts in this case, so this
> > doesn't work. Hopefully the architecture can strengthen this in the
> > future to give you what you need.
>
> Thanks for responding. This is a little bit surprising. I thought FEAT_BBM
> level 2 could handle the TLB conflict gracefully. At least its description
> made me assume so. And Catalin also mentioned FEAT_BBM level 2 can be used
> to split the vmemmap page table in the HVO patch discussion
> (https://lore.kernel.org/all/Zo68DP6siXfb6ZBR@xxxxxxx/).
>
> It sounds a little bit contradictory if TLB conflicts can still happen
> with FEAT_BBM level 2. It makes the benefit of FEAT_BBM level 2 much smaller
> than expected.
You can read the Arm ARM just as badly as I can :)
| I_HYQMB
|
| If any level is supported and the TLB entries are not invalidated after
| the writes that modified the translation table entries are completed,
| then a TLB conflict abort can be generated because in a TLB there might
| be multiple translation table entries that all translate the same IA.
Note *any level*.
Furthermore:
| R_FWRMB
|
| If all of the following apply, then a TLB conflict abort is reported
| to EL2:
| * Level 1 or level 2 is supported.
| * Stage 2 translations are enabled in the current translation regime.
| * A TLB conflict abort is generated due to changing the block size or
| Contiguous bit.
I think this series is trying to handle some of this:
https://lore.kernel.org/r/20241211154611.40395-1-miko.lenczewski@xxxxxxx
> Is it out of the question to handle the TLB conflict aborts? IIUC we should just
> need to flush the TLB and then resume, and it doesn't require holding any locks
> either.
See my reply here:
https://lore.kernel.org/r/20241211210243.GA17155@willie-the-truck
> I also chatted with our architects and was told that TLB conflict aborts don't
> happen on AmpereOne. Maybe this is why I didn't see the problem when I
> tested the patches.
I'm actually open to having an MIDR-based lookup for this if it's your own
micro-architecture.
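For illustration only, an MIDR-based lookup could look roughly like the
userspace sketch below. It matches on the Implementer and PartNum fields of
MIDR_EL1 and ignores variant and revision. The helper and table names are
hypothetical, and the AmpereOne implementer/part values are assumptions that
should be checked against asm/cputype.h; a real patch would more likely use
the kernel's existing MIDR range helpers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * MIDR_EL1 layout: Implementer [31:24], Variant [23:20],
 * Architecture [19:16], PartNum [15:4], Revision [3:0].
 */
#define MIDR_IMPLEMENTER(midr)	(((midr) >> 24) & 0xffu)
#define MIDR_PARTNUM(midr)	(((midr) >> 4) & 0xfffu)

/* Assumed values for AmpereOne; verify against asm/cputype.h. */
#define IMP_AMPERE		0xc0u
#define PART_AMPERE1		0xac3u

/* Hypothetical allow-list entry: one CPU model per row. */
struct bbml2_cpu {
	uint32_t implementer;
	uint32_t partnum;
};

/*
 * Hypothetical list of CPUs whose vendor states that changing block
 * size under FEAT_BBM level 2 never raises a TLB conflict abort.
 */
static const struct bbml2_cpu bbml2_no_conflict_abort[] = {
	{ IMP_AMPERE, PART_AMPERE1 },
};

/* Return true if the CPU identified by @midr is on the allow-list. */
static bool cpu_bbml2_no_conflict_abort(uint32_t midr)
{
	size_t n = sizeof(bbml2_no_conflict_abort) /
		   sizeof(bbml2_no_conflict_abort[0]);

	for (size_t i = 0; i < n; i++) {
		if (MIDR_IMPLEMENTER(midr) == bbml2_no_conflict_abort[i].implementer &&
		    MIDR_PARTNUM(midr) == bbml2_no_conflict_abort[i].partnum)
			return true;
	}
	return false;
}
```

Every CPU in the system would need to pass such a check before the linear map
is built with block mappings, since a single core that can raise the abort is
enough to break things.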
Will