Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping
From: Atish Patra
Date: Tue Jun 16 2020 - 17:52:19 EST
On Sun, Jun 14, 2020 at 10:35 PM Alex Ghiti <alex@xxxxxxxx> wrote:
>
> Hi Atish,
>
> Le 6/12/20 Ã 1:43 PM, Atish Patra a Ãcrit :
> > On Fri, Jun 12, 2020 at 6:17 AM Alex Ghiti <alex@xxxxxxxx> wrote:
> >> Le 6/12/20 Ã 8:59 AM, Alex Ghiti a Ãcrit :
> >>> Hi Atish,
> >>>
> >>> Le 6/11/20 Ã 1:29 PM, Atish Patra a Ãcrit :
> >>>> On Wed, Jun 10, 2020 at 11:51 PM Alex Ghiti<alex@xxxxxxxx> wrote:
> >>>>> Hi Atish,
> >>>>>
> >>>>> Le 6/10/20 Ã 2:32 PM, Atish Patra a Ãcrit :
> >>>>>> On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti<alex@xxxxxxxx> wrote:
> >>>>>>> This small patchset intends to use PUD/PGDIR entries for linear
> >>>>>>> mapping
> >>>>>>> in order to better utilize TLB.
> >>>>>>>
> >>>>>>> At the moment, only PMD entries can be used since on common platforms
> >>>>>>> (qemu/unleashed), the kernel is loaded at DRAM + 2MB which
> >>>>>>> dealigns virtual
> >>>>>>> and physical addresses and then prevents the use of PUD/PGDIR
> >>>>>>> entries.
> >>>>>>> So the kernel must be able to get those 2MB for PAGE_OFFSET to map
> >>>>>>> the
> >>>>>>> beginning of the DRAM: this is achieved in patch 1.
> >>>>>>>
> >>>>>> I don't have in depth knowledge of how mm code works so this question
> >>>>>> may be a completely
> >>>>>> stupid one :). Just for my understanding,
> >>>>>> As per my understanding, kernel will map those 2MB of memory but
> >>>>>> never use it.
> >>>>>> How does the kernel ensure that it doesn't allocate any memory from
> >>>>>> those 2MB
> >>>>>> memory if it is not marked as reserved?
> >>>>> Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot
> >>>>> stage to mark this region
> >>>>> as reserved if there is something there (like opensbi). Otherwise, the
> >>>>> kernel will indeed try to
> >>>>> allocate memory from there :)
> >>>>>
> >>>> In that case, this patch mandates that the firmware region has to be
> >>>> mark "reserved"
> >>>> the device tree so that the Linux kernel doesn't try to allocate
> >>>> memory from there.
> >>>> OpenSBI is already doing it from v0.7. Thus, any user using latest
> >>>> OpenSBI can leverage
> >>>> this patch for a better TLB utilization.
> >>>
> >>> Note that *currently* OpenSBI v0.7 still adds the "no-map" property
> >>> which prevents such optimization.
> >>>
> > Thanks for the clarification. When I said latest, I meant including
> > your patch in the mailing list.
> >
> >>>> However, legacy previous boot stages(BBL) do not reserve this area via
> >>>> DT which may
> >>>> result in an unexpected crash. I am not sure how many developers still
> >>>> use BBL though.
> >>>>
> >>>> Few general suggestions to tackle this problem:
> >>>> 1. This mandatory requirement should be added to the booting document
> >>>> so that any other
> >>>> SBI implementation is also aware of it.
> >>>> 2. You may have to move the patch1 to a separate config so that any
> >>>> users of legacy boot stages
> >>>> can disable this feature.
> >>>
> >>> IMHO, the region occupied by runtime services should be marked as
> >>> reserved in the device-tree. So it seems redundant to add this as a
> >>> requirement, I would rather consider its absence as a bug.
> >>>
> > I agree. I was just suggesting to document this bug :).
>
> Oh ok then, we meant the same thing :)
> >
> >>> Even if I understand that this might break some system, I don't like
> >>> the idea of a new config to support old "buggy" bootloaders: when will
> >>> we be able to remove it ? We'll never know when people will stop using
> >>> those bootloaders, so it will stay here forever...Where can I find the
> > Personally, I am fine with that. However, there were few concerns in the past.
> > I am leaving it to Palmer to decide.
> >
> > @Palmer Dabbelt : Any thoughts ?
> >
> >>> boot document you are talking about ? Can we simply state here that
> >>> this kernel version will not be compatible with those bootloaders
> >>> (we'll draw an exhaustive list here) ?
> > Yes.
> >
> >> Ok, I have just found Documentation/riscv/boot-image-header.rst: could
> >> we imagine doing something like incrementing the version and use that as
> >> a hint in the kernel not to map the 2MB offset ? That's still legacy,
> >> but at least it does not require to recompile a kernel as the check
> >> would be done at runtime.
> >>
> > I was suggesting to add a risc-v specific booting document and
> > document this "bug".
> > Documentation/riscv/boot-image-header.rst can be linked from that document or
> > the boot hader content can be included in that. No changes in code is necessary.
> >
> > Eventually, this booting document will also include other additional
> > booting constraints for RISC-V
> > such as minimum extension required to boot Linux, csr state upon
> > entering S-mode, mmu state.
>
>
> Ok I will prepare a boot document that links to the existing documents and
> add all of that, I will need you for the last constraints that I don't
> know about.
>
Thanks. I will send patches on top of your boot document patch.
This was long due. Thanks for coming up with the initial version :).
> Thanks Atish,
>
> Alex
>
> >>> Alex
> >>>
> >>>
> >>>>> Alex
> >>>>>
> >>>>>
> >>>>>>> But furthermore, at the moment, the firmware (opensbi) explicitly
> >>>>>>> asks the
> >>>>>>> kernel not to map the region it occupies, which is on those common
> >>>>>>> platforms at the very beginning of the DRAM and then it also dealigns
> >>>>>>> virtual and physical addresses. I proposed a patch here:
> >>>>>>>
> >>>>>>> https://github.com/riscv/opensbi/pull/167
> >>>>>>>
> >>>>>>> that removes this 'constraint' but *not* all the time as it offers
> >>>>>>> some
> >>>>>>> kind of protection in case PMP is not available. So sometimes, we may
> >>>>>>> have a part of the memory below the kernel that is removed creating a
> >>>>>>> misalignment between virtual and physical addresses. So for
> >>>>>>> performance
> >>>>>>> reasons, we must at least make sure that PMD entries can be used:
> >>>>>>> that
> >>>>>>> is guaranteed by patch 1 too.
> >>>>>>>
> >>>>>>> Finally the second patch simply improves best_map_size so that
> >>>>>>> whenever
> >>>>>>> possible, PUD/PGDIR entries are used.
> >>>>>>>
> >>>>>>> Below is the kernel page table without this patch on a 6G platform:
> >>>>>>>
> >>>>>>> ---[ Linear mapping ]---
> >>>>>>> 0xffffc00000000000-0xffffc00176e00000 0x0000000080200000 5998M
> >>>>>>> PMD D A . . . W R V
> >>>>>>>
> >>>>>>> And with this patchset + opensbi patch:
> >>>>>>>
> >>>>>>> ---[ Linear mapping ]---
> >>>>>>> 0xffffc00000000000-0xffffc00140000000 0x0000000080000000
> >>>>>>> 5G PUD D A . . . W R V
> >>>>>>> 0xffffc00140000000-0xffffc00177000000 0x00000001c0000000 880M
> >>>>>>> PMD D A . . . W R V
> >>>>>>>
> >>>>>>> Alexandre Ghiti (2):
> >>>>>>> riscv: Get memory below load_pa while ensuring linear mapping
> >>>>>>> is PMD
> >>>>>>> aligned
> >>>>>>> riscv: Use PUD/PGDIR entries for linear mapping when possible
> >>>>>>>
> >>>>>>> arch/riscv/include/asm/page.h | 8 ++++
> >>>>>>> arch/riscv/mm/init.c | 69
> >>>>>>> +++++++++++++++++++++++++++++------
> >>>>>>> 2 files changed, 65 insertions(+), 12 deletions(-)
> >>>>>>>
> >>>>>>> --
> >>>>>>> 2.20.1
> >>>>>>>
> >>>>>>>
> >
> >
--
Regards,
Atish