Re: [PATCH] arch/riscv: enable XIP

From: Vitaly Wool
Date: Wed Dec 02 2020 - 14:44:55 EST


On Wed, Dec 2, 2020 at 7:06 PM Nicolas Pitre <nico@xxxxxxxxxxx> wrote:
>
> On Wed, 2 Dec 2020, Vitaly Wool wrote:
>
> > Introduce XIP (eXecute In Place) support for RISC-V platforms.
> > It allows code to be executed directly from non-volatile storage
> > directly addressable by the CPU, such as QSPI NOR flash which can
> > be found on many RISC-V platforms. This makes way for significant
> > optimization of RAM footprint. The XIP kernel is not compressed
> > since it has to run directly from flash, so it will occupy more
> > space on the non-volatile storage to The physical flash address
> > used to link the kernel object files and for storing it has to
> > be known at compile time and is represented by a Kconfig option.
> >
> > XIP on RISC-V will currently only work on MMU-enabled kernels.
> >
> > Signed-off-by: Vitaly Wool <vitaly.wool@xxxxxxxxxxxx>
>
> That's nice!
>
> Suggestion for a future enhancement:
> To save on ROM storage, and given that the .data segment has to be
> copied to RAM anyway, you could store .data compressed and decompress it
> to RAM instead. See commit ca8b5d97d6bf for inspiration. In fact, many
> parts there could be shared.

Thanks! That's in my TODO list.

> More comments below.
>
> > +#define __XIP_FIXUP(addr) \
> > + (((long)(addr) >= CONFIG_XIP_PHYS_ADDR && \
> > + (long)(addr) <= CONFIG_XIP_PHYS_ADDR + SZ_16M) ? \
> > + (long)(addr) - CONFIG_XIP_PHYS_ADDR + CONFIG_XIP_PHYS_RAM_BASE - XIP_OFFSET : \
> > + (long)(addr))
>
> Here you should cast to unsigned long instead.

Right, or (just thought of it) uintptr_t for that matter. Does that sound right?

> > +#ifdef CONFIG_XIP_KERNEL
> > + la a0, _trampoline_pg_dir
> > + lw t0, _xip_fixup
> > + add a0, a0, t0
> [...]
> > +_xip_fixup:
> > + .dword CONFIG_XIP_PHYS_RAM_BASE - CONFIG_XIP_PHYS_ADDR - XIP_OFFSET
> > +#endif
>
> Here _xip_fixup is a dword but you're loading it as a word.
> This won't work for both rv32 and rv64.

Well, at this point I believe it does, as long as we use little
endian. 64-bit version has been verified.
I do not argue though that it isn't nice and should be fixed.

> > +SECTIONS
> > +{
> > + /* Beginning of code and text segment */
> > + . = XIP_VIRT_ADDR(CONFIG_XIP_PHYS_ADDR);
> > + _xiprom = .;
> > + _start = .;
> > + HEAD_TEXT_SECTION
> > + INIT_TEXT_SECTION(PAGE_SIZE)
> > + /* we have to discard exit text and such at runtime, not link time */
> > + .exit.text :
> > + {
> > + EXIT_TEXT
> > + }
> > +
> > + .text : {
> > + _text = .;
> > + _stext = .;
> > + TEXT_TEXT
> > + SCHED_TEXT
> > + CPUIDLE_TEXT
> > + LOCK_TEXT
> > + KPROBES_TEXT
> > + ENTRY_TEXT
> > + IRQENTRY_TEXT
> > + SOFTIRQENTRY_TEXT
> > + *(.fixup)
> > + _etext = .;
> > + }
> > + RO_DATA(L1_CACHE_BYTES)
> > + .srodata : {
> > + *(.srodata*)
> > + }
> > + .init.rodata : {
> > + INIT_SETUP(16)
> > + INIT_CALLS
> > + CON_INITCALL
> > + INIT_RAM_FS
> > + }
> > + _exiprom = ALIGN(PAGE_SIZE); /* End of XIP ROM area */
>
> Why do you align this to a page size?

TBH I just cut the corners here and below, did not have to worry about
partial pages and such.


> > +
> > +
> > +/*
> > + * From this point, stuff is considered writable and will be copied to RAM
> > + */
> > + __data_loc = ALIGN(PAGE_SIZE); /* location in file */
>
> Same question here?
>
> > + . = PAGE_OFFSET; /* location in memory */
> > +
> > + _sdata = .; /* Start of data section */
> > + _data = .;
> > + RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
> > + _edata = .;
> > + __start_ro_after_init = .;
> > + .data.ro_after_init : AT(ADDR(.data.ro_after_init) - LOAD_OFFSET) {
> > + *(.data..ro_after_init)
> > + }
> > + __end_ro_after_init = .;
> > +
> > + . = ALIGN(PAGE_SIZE);
>
> And again here?
>
> > +#ifdef CONFIG_XIP_KERNEL
> > +/* called from head.S with MMU off */
> > +asmlinkage void __init __copy_data(void)
> > +{
> > + void *from = (void *)(&_sdata);
> > + void *end = (void *)(&_end);
> > + void *to = (void *)CONFIG_XIP_PHYS_RAM_BASE;
> > + size_t sz = (size_t)(end - from);
> > +
> > + memcpy(to, from, sz);
> > +}
> > +#endif
>
> Where is the stack located when this executes? The stack for the init
> task is typically found within the .data area. At least on ARM it is.
> You don't want to overwrite your stack here.

sp is set to within the .data area later, we rely on the u-boot sp
setting which is outside of the destination area for the image parts
that are copied. I agree that this makes the implementation fragile
and will explicitly set sp in the next path version.

Best regards,
Vitaly