Re: [RFC PATCH v1 2/5] arm64: Add BBM Level 2 cpu feature

From: Yang Shi
Date: Fri Dec 13 2024 - 17:35:22 EST




On 12/13/24 8:17 AM, Mikołaj Lenczewski wrote:
+static int do_conflict_abort(unsigned long far, unsigned long esr,
+			     struct pt_regs *regs)
+{
+	if (!system_supports_bbml2())
+		return do_bad(far, esr, regs);
+
+	/* if we receive a TLB conflict abort, we know that there are multiple
+	 * TLB entries that translate the same address range. the minimum set
+	 * of invalidations to clear these entries is implementation defined.
+	 * the maximum set is defined as either tlbi(vmalls12e1) or tlbi(alle1).
+	 *
+	 * if el2 is enabled and stage 2 translation enabled, this may be
+	 * raised as a stage 2 abort. if el2 is enabled but stage 2 translation
+	 * disabled, or if el2 is disabled, it will be raised as a stage 1
+	 * abort.
+	 *
+	 * local_flush_tlb_all() does a tlbi(vmalle1), which is enough to
+	 * handle a stage 1 abort.
+	 */
+
+	local_flush_tlb_all();
+
+	return 0;
+}
Can we actually guarantee that we make it this far without taking another
abort? Given that I'm yet to see one of these things in the wild, I'm
fairly opposed to pretending that we can handle them. We'd be much better
off only violating BBM on CPUs that are known to handle the conflict
gracefully. Judging by your later patch, this is practically keyed off
the MIDR _anyway_...

Will

Hi Mikołaj,

Thanks for reviewing. Apologies for the delay in responding, and for
spam (replied instead of group-replied).

There should not be an option to take another fault while performing the
handler, as long as the mappings covering the fault handler table or any
code in this path are not screwed with. This is discussed further in the
resent patch series [1].

Will led me to this thread when we discussed my series (https://lore.kernel.org/lkml/20241211223034.GA17836@willie-the-truck/#t). My series tries to use large mappings for the kernel linear mapping when rodata=full. The common part of both series is BBM lv2 support. My series depends on BBM lv2 to change the page table block size when changing permissions for the kernel linear mapping.

Handling the TLB conflict should be OK for your use case, since the conflict should only happen in the user address space. But it may not be fine for my use case. At the very least, the TLB conflict handler needs to depend on CONFIG_VMAP_STACK; otherwise the kernel stack pointer points into the kernel linear address space. I'm not sure whether there are other corner cases that can trigger a recursive abort in this path, maybe the bad stack handler? So Will suggested an MIDR-based lookup. IIUC, he suggested only enabling BBM lv2 support on CPUs that can handle a TLB conflict gracefully. This should make our lives easier. The downside is that fewer CPUs may actually be able to advertise BBM lv2 support.
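To make the MIDR-based idea concrete, here is a rough standalone sketch of what such a lookup could look like. It is a user-space model, not the kernel's cpufeature code: the struct, the helper names (make_midr, cpu_may_use_bbml2, system_may_use_bbml2) and the allow-list entry (part number 0xabc, r0p0 to r1p2) are all hypothetical placeholders. Only the MIDR_EL1 field positions are architectural. It also assumes the feature has to be system-wide, so every CPU must match the list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural MIDR_EL1 fields: implementer [31:24], variant [23:20],
 * part number [15:4], revision [3:0]. */
#define MIDR_IMPLEMENTER(m)	(((m) >> 24) & 0xffu)
#define MIDR_VARIANT(m)		(((m) >> 20) & 0xfu)
#define MIDR_PARTNUM(m)		(((m) >> 4) & 0xfffu)
#define MIDR_REVISION(m)	((m) & 0xfu)

/* Hypothetical allow-list entry: one CPU model plus an inclusive rXpY range,
 * with each bound encoded as (variant << 4) | revision. */
struct bbml2_midr_range {
	uint8_t implementer;
	uint16_t partnum;
	uint8_t min_var_rev;
	uint8_t max_var_rev;
};

/* Placeholder data only: 0xabc is not a real part number. */
static const struct bbml2_midr_range bbml2_allow_list[] = {
	{ 0x41, 0xabc, 0x00, 0x12 },	/* implementer 0x41 (Arm), r0p0..r1p2 */
};

/* Build a MIDR value for illustration; architecture field is fixed to 0xf. */
static uint32_t make_midr(uint32_t impl, uint32_t var, uint32_t part,
			  uint32_t rev)
{
	return (impl << 24) | (var << 20) | (0xfu << 16) | (part << 4) | rev;
}

/* True if this CPU's MIDR matches an allow-list entry and revision range. */
static bool cpu_may_use_bbml2(uint32_t midr)
{
	uint32_t var_rev = (MIDR_VARIANT(midr) << 4) | MIDR_REVISION(midr);
	size_t i;

	for (i = 0; i < sizeof(bbml2_allow_list) / sizeof(bbml2_allow_list[0]); i++) {
		const struct bbml2_midr_range *r = &bbml2_allow_list[i];

		if (MIDR_IMPLEMENTER(midr) == r->implementer &&
		    MIDR_PARTNUM(midr) == r->partnum &&
		    var_rev >= r->min_var_rev && var_rev <= r->max_var_rev)
			return true;
	}
	return false;
}

/* Assumption: the feature is only usable if every CPU is on the list. */
static bool system_may_use_bbml2(const uint32_t *midrs, size_t ncpus)
{
	size_t i;

	for (i = 0; i < ncpus; i++)
		if (!cpu_may_use_bbml2(midrs[i]))
			return false;
	return true;
}
```

With a check like this, a single CPU outside the list is enough to keep the feature off for the whole system, which is where the "fewer CPUs can advertise it" downside comes from.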

Thanks,
Yang

The MIDR revisions will be fixed. I was confused as to which revisions
were affected on an earlier version of the series, and had missed
updating them. The kconfig workarounds should be correct in this regard.

[1]:https://lore.kernel.org/all/084c5ada-51af-4c1a-b50a-4401e62ddbd6@xxxxxxx/