Re: [PATCH] asm-generic: replace ________addr with __UNIQUE_ID(addr)

From: David Laight

Date: Sun Apr 26 2026 - 13:34:28 EST


On Sun, 26 Apr 2026 04:09:25 -0700
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Sun, 26 Apr 2026 11:49:38 +0100 David Laight <david.laight.linux@xxxxxxxxx> wrote:
>
> > On Sat, 25 Apr 2026 15:12:40 -0700
> > Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Sat, 25 Apr 2026 23:01:34 +0100 David Laight <david.laight.linux@xxxxxxxxx> wrote:
> > >
> > > > > > The real problem with this define is that both idx and phys are
> > > > > > expanded twice.
> > > > >
> > > > > The real problem with this define is that it's a define. Why oh why do
> > > > > we keep doing this to ourselves?
> > > >
> > > > Sometimes #defines generate better code because they are expanded earlier,
> > > > and sometimes you want type-agnostic 'functions'.
> > > > But neither is true here.
> > > >
> > > > But I think I'd go for 'always_inline'.
> > > > Sometimes the compilers make silly decisions.
> > >
> > > Gee, if `static inline' misbehaves then we have big problems!
> > >
> > > What's special about the fixmap code anyway? It's not exactly
> > > fastpath. Perhaps this stuff can simply be uninlined.
> >
> > Some of the inlines are trivial - just adding an extra parameter.
> > But this set would be simpler if the last function
> > (__native_set_fixmap() for x86) returned (address & ~PAGE_MASK)
>
> I don't understand this?

I found the following:

#define set_fixmap_offset(idx, phys) \
__set_fixmap_offset(idx, phys, FIXMAP_PAGE_NORMAL)

#define __set_fixmap_offset(idx, phys, flags) \
({ \
unsigned long ________addr; \
__set_fixmap(idx, phys, flags); \
________addr = fix_to_virt(idx) + ((phys) & (PAGE_SIZE - 1)); \
________addr; \
})

static inline void __set_fixmap(enum fixed_addresses idx,
phys_addr_t phys, pgprot_t flags)
{
native_set_fixmap(idx, phys, flags);
}

void native_set_fixmap(unsigned /* enum fixed_addresses */ idx,
phys_addr_t phys, pgprot_t flags)
{
/* Sanitize 'prot' against any unsupported bits: */
pgprot_val(flags) &= __default_kernel_pte_mask;

__native_set_fixmap(idx, pfn_pte(phys >> PAGE_SHIFT, flags));
}

void __native_set_fixmap(enum fixed_addresses idx, pte_t pte)
{
unsigned long address = __fix_to_virt(idx);

#ifdef CONFIG_X86_64
/*
* Ensure that the static initial page tables are covering the
* fixmap completely.
*/
BUILD_BUG_ON(__end_of_permanent_fixed_addresses >
(FIXMAP_PMD_NUM * PTRS_PER_PTE));
#endif

if (idx >= __end_of_fixed_addresses) {
BUG();
return;
}
set_pte_vaddr(address, pte);
fixmaps_set++;
}

Doesn't seem to hard to arrange to return 'address + (phys & ~PAGE_SHIFT)'
from __set_fixmap().
(Other architectures will vary, but I suspect they are similar.)


...
> > > We inline too much stuff.
> >
> > True - don't look at what strlcpy() can generate.
> > The inline code should just get the constants from the compiler and
> > then call the appropriate function.
>
> I can't actually find an in-kernel strlcpy()?

That is because I meant strscpy() :-)

In particular https://elixir.bootlin.com/linux/v7.0/source/include/linux/fortify-string.h#L275
Particularly once it gets as far as calling strnlen().

>
> > As does the compiler.
> > pixpaper_panel_hw_init() repeatedly calls two static functions that
> > contain sleeps. They all get inlined bloating the code size and
> > exploding the stack when clang separately allocates all the buffers
> > in the called functions.
>
> Poor thing. I guess compiler developers are more focused on userspace
> code and aren't especially concerned about kernel's particular
> requirements. And that's understandable - it's down to us to persuade
> them to add options to permit us to tune the compiler behavior.

That would be wrong for for userspace as well.
Inlining and loop unrolling decisions don't seem to allow
for the cost of reading code from memory into the I-cache.
(Loop unrolling is so 1980s)

My suspicion is that very few writes of user code ever look at
the generated code.
So the compiler developers go their own way optimising things
that improve some benchmarks but make more normal programs
run a bit slower.
Of course, modern cpu are fast enough that it doesn't matter
what is generated for most code.
(end rant)

David