Thanks for your explanation,

On Tue, Jun 21, 2022 at 02:24:01PM +0800, Kefeng Wang wrote:
> On 2022/6/21 13:33, Baoquan He wrote:
> > On 06/13/22 at 04:09pm, Zhen Lei wrote:
> > > If the crashkernel has both high memory above DMA zones and low memory
> > > in DMA zones, kexec always loads the content such as Image and dtb to the
> > > high memory instead of the low memory. This means that only high memory
> > > requires write protection based on page-level mapping. The allocation of
> > > high memory does not depend on the DMA boundary. So we can reserve the
> > > high memory first even if the crashkernel reservation is deferred.
> > >
> > > This means that the block mapping can still be performed on other kernel
> > > linear address spaces, the TLB miss rate can be reduced and the system
> > > performance will be improved.
> >
> > Ugh, this looks a little ugly, honestly.
> >
> > If it's certain that arm64 can't split large page mappings of the linear
> > region, this patch is one way to optimize the linear mapping. Given that
> > a kdump setting is necessary on arm64 servers, the booting speed is truly
> > impacted heavily.
>
> Is there some conclusion or discussion that arm64 can't split large page
> mapping?
>
> Could the crashkernel reservation (and Kfence pool) be split dynamically?
>
> I found Mark's reply to "arm64: remove page granularity limitation from
> KFENCE"[1],
>
> "We also avoid live changes from block<->table mappings, since the
> architecture gives us very weak guarantees there and generally requires
> a Break-Before-Make sequence (though IIRC this was tightened up
> somewhat, so maybe going one way is supposed to work). Unless it's
> really necessary, I'd rather not split these block mappings while
> they're live."

The problem with splitting is that you can end up with two entries in
the TLB for the same VA->PA mapping (e.g. one for a 4KB page and another
for a 2MB block). In the lucky case, the CPU will trigger a TLB conflict
abort (but it can be worse, like loss of coherency).
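
To make the overlap concrete, here is a minimal user-space sketch (not
kernel code) of the address arithmetic, assuming a 4KB granule where a
level-2 block covers 2MB. The helper names (block_base, page_base,
pte_index, ranges_overlap) and the SZ_4K/SZ_2M constants are
illustrative and defined locally. It only shows that a 2MB block
corresponds to 512 4KB pages and that a page entry and the block entry
containing it cover the same VA.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K 0x1000ULL                  /* 4KB page (assumed granule) */
#define SZ_2M 0x200000ULL                /* 2MB level-2 block for that granule */
#define PTES_PER_BLOCK (SZ_2M / SZ_4K)   /* 512 page entries per block */

/* Base VA of the 2MB block that contains va. */
static inline uint64_t block_base(uint64_t va)
{
	return va & ~(SZ_2M - 1);
}

/* Base VA of the 4KB page that contains va. */
static inline uint64_t page_base(uint64_t va)
{
	return va & ~(SZ_4K - 1);
}

/* Index (0..511) of va's page within its 2MB block. */
static inline unsigned int pte_index(uint64_t va)
{
	return (unsigned int)((va / SZ_4K) % PTES_PER_BLOCK);
}

/* True if [a, a + alen) and [b, b + blen) share at least one address. */
static inline int ranges_overlap(uint64_t a, uint64_t alen,
				 uint64_t b, uint64_t blen)
{
	/* The sketch assumes neither range wraps around the address space. */
	assert(a + alen > a && b + blen > b);
	return a < b + blen && b < a + alen;
}
```

For example, for va = 0xffff000000345678 the block base is
0xffff000000200000 and the page is entry 0x145 of the 512 in that block,
so a cached block entry and a cached page entry would both match it.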
Prior to FEAT_BBM (added in ARMv8.4), such a scenario was not allowed at
all: the software would have to unmap the range, TLBI, remap. With
FEAT_BBM (level 2), we can do this without tearing the mapping down but
we still need to handle the potential TLB conflict abort. The handler
only needs a TLBI but if it touches the memory range being changed it
risks faulting again. With vmap stacks and the kernel image mapped in
the vmalloc space, we have a small window where this could be handled
but we probably can't go into the C part of the exception handling
(tracing etc. may access a kmalloc'ed object for example).
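
The difference between changing the entry in place and the unmap, TLBI,
remap sequence can be illustrated with a toy model. This is only a
sketch, not how the hardware or the kernel is implemented: struct
toy_mmu, walk_and_fill() and the other names are hypothetical, there is
a single 2MB region, and the model simply assumes that a walk may cache
a new entry while a stale one is still present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL
#define TLB_SLOTS 8

enum map_state { MAP_INVALID, MAP_BLOCK, MAP_TABLE };

/* Hypothetical model: one 2MB region plus a tiny TLB. */
struct toy_mmu {
	enum map_state state;	/* how the region is currently mapped */
	uint64_t region;	/* 2MB-aligned base VA of the region */
	struct { uint64_t va, size; } tlb[TLB_SLOTS];
	size_t used;
};

static void mmu_init(struct toy_mmu *m, uint64_t region)
{
	assert((region & (SZ_2M - 1)) == 0);
	m->state = MAP_BLOCK;
	m->region = region;
	m->used = 0;
}

/* Model a table walk that caches whatever the table says right now. */
static void walk_and_fill(struct toy_mmu *m, uint64_t va)
{
	uint64_t size;

	/* Invalid entries are not cached; ignore VAs outside the region. */
	if (m->state == MAP_INVALID || m->used == TLB_SLOTS)
		return;
	if (va < m->region || va - m->region >= SZ_2M)
		return;

	size = (m->state == MAP_BLOCK) ? SZ_2M : SZ_4K;
	m->tlb[m->used].va = va & ~(size - 1);
	m->tlb[m->used].size = size;
	m->used++;
}

/* Model a TLB invalidation of everything. */
static void tlbi_all(struct toy_mmu *m)
{
	m->used = 0;
}

/* Number of cached entries matching va; more than one is a conflict. */
static size_t lookup_hits(const struct toy_mmu *m, uint64_t va)
{
	size_t i, hits = 0;

	for (i = 0; i < m->used; i++)
		if (va >= m->tlb[i].va && va - m->tlb[i].va < m->tlb[i].size)
			hits++;
	return hits;
}

/* Block -> table while live, with no invalidation in between. */
static void split_in_place(struct toy_mmu *m)
{
	m->state = MAP_TABLE;
}

/* Block -> table using unmap, TLBI, remap. */
static void split_bbm(struct toy_mmu *m)
{
	m->state = MAP_INVALID;	/* break */
	tlbi_all(m);		/* drop the stale block entry */
	m->state = MAP_TABLE;	/* make */
}
```

In this model, walking the same VA before and after split_in_place()
leaves two matching entries (one 2MB, one 4KB), while doing the same
around split_bbm() leaves only the 4KB entry. The cost is that the
region is unmapped in between, which is exactly why it cannot be done
naively on a live linear map.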
Another option is to do a stop_machine() (if multi-processor at that
point), disable the MMUs, modify the page tables and re-enable the MMUs,
but it's also complicated.