Re: [PATCH v5 1/7] s390/percpu: Infrastructure for more efficient this_cpu operations

From: David Laight

Date: Tue Jun 02 2026 - 10:49:13 EST


On Tue, 2 Jun 2026 15:54:36 +0200
Heiko Carstens <hca@xxxxxxxxxxxxx> wrote:

> On Tue, Jun 02, 2026 at 02:32:09PM +0100, David Laight wrote:
> > On Mon, 1 Jun 2026 17:08:13 +0200
> > Heiko Carstens <hca@xxxxxxxxxxxxx> wrote:
> > > It is: the check makes sure this is an AG instruction, which adds the
> > > percpu offset from lowcore - by checking that the displacement is
> > > correct, as well as that the base register is zero.
> > >
> > > There could be a different AG instruction within the inline assembly,
> > > for whatever reason.
> >
> > Do you actually even need to check the instruction?
> >
> > This sequence can only work for simple per-cpu accesses, so I don't
> > see a reason to let the specified register point anywhere other than the
> > base of the per-cpu data.
> >
> > That means the process switch code can just load the register with the
> > base of the per-cpu data for the new cpu.
> > If that happens before the 'AG' is executed it won't matter.
> >
> > The only reason would be to support non-offsettable memory accesses.
> > But it looks like the 'laag %r5,%r2,0(%r4)' in the example has an
> > offset (of zero).
> > Probably only stops you doing a direct access of an array.
> >
> > That would mean that needs_fixup goes in the bin and percpu_exit() becomes:
> > ...
> > reg = regs->percpu_register;
> > if (likely(!reg))
> > return;
> > lc->percpu_register = reg;
> > regs->gprs[reg] = lc->percpu_offset;
> > }
> >
> > I guess I'm missing something?
>
> The percpu register (in the above example %r4) first contains the base address
> of a percpu variable. To get the actual percpu address of the variable the
> percpu_offset of the corresponding cpu has to be added to that address, which
> is what the AG instruction is doing.

I knew my brain was fading.
I'm sure it should be possible to get the linker to put the offset of
the variable from the base of the per-cpu data area into the laag instruction.
(It looks like it has a 20-bit offset field.)

Although I've no idea how per-cpu data works in loadable modules.

That might mean you need lc->percpu_address as well as lc->percpu_offset.
(If it isn't there already.)
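
To make the arithmetic concrete, here is a rough user-space sketch, not
kernel code. It models the two ways of forming the address: adding the cpu's
offset to the variable's address (what the AG does), and using the base of
the cpu's area plus a per-variable displacement (what I'm suggesting).
All the names (percpu_section, template_section, cpu_copy, ptr_via_add,
ptr_via_displacement, fits_disp20) are made up for illustration, and the
signed 20-bit range check is my assumption about the displacement field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NR_CPUS = 2 };

/* Hypothetical layout of the per-cpu section. */
struct percpu_section {
	long counter;
	long other;
};

/* The linked "template" copy, plus one real copy per cpu. */
static struct percpu_section template_section;
static struct percpu_section cpu_copy[NR_CPUS];

/* Models lc->percpu_offset: delta from the template to this cpu's copy. */
uintptr_t percpu_offset(int cpu)
{
	return (uintptr_t)&cpu_copy[cpu] - (uintptr_t)&template_section;
}

/* AG-style: the register holds the variable's address, add the cpu offset. */
long *ptr_via_add(long *var, int cpu)
{
	return (long *)((uintptr_t)var + percpu_offset(cpu));
}

/* Assumption: the displacement is a signed 20-bit field. */
int fits_disp20(ptrdiff_t disp)
{
	return disp >= -(1L << 19) && disp < (1L << 19);
}

/*
 * Displacement-style: the register holds the base of this cpu's area
 * (a hypothetical lc->percpu_address) and the variable's offset within
 * the section is encoded in the instruction.
 */
long *ptr_via_displacement(size_t var_off, int cpu)
{
	assert(fits_disp20((ptrdiff_t)var_off));
	return (long *)((uintptr_t)&cpu_copy[cpu] + var_off);
}
```

Both helpers give the same pointer for the same variable and cpu. The
second one only needs a value that is the same for every variable on a given
cpu, which is why the switch code could reload the register unconditionally.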

-- David

>
> What you propose would make a CPU's percpu_offset the address of any percpu
> variable, which most likely would result in a crash.