Re: [RESEND][PATCH] powerpc/powernv : Save/Restore SPRG3 on entry/exit from stop.
From: Michael Neuling
Date: Fri Jul 20 2018 - 03:09:19 EST
On Fri, 2018-07-20 at 16:29 +1000, Michael Ellerman wrote:
> Michael Neuling <mikey@xxxxxxxxxxx> writes:
> > On Fri, 2018-07-20 at 12:32 +1000, Michael Ellerman wrote:
> > > Michael Neuling <mikey@xxxxxxxxxxx> writes:
> > > > On Wed, 2018-07-18 at 13:42 +0530, Gautham R Shenoy wrote:
> > > > > On Wed, Jul 18, 2018 at 09:24:19AM +1000, Michael Neuling wrote:
> > > > > >
> > > > > > > DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
> > > > > > > diff --git a/arch/powerpc/kernel/idle_book3s.S
> > > > > > > b/arch/powerpc/kernel/idle_book3s.S
> > > > > > > index d85d551..5069d42 100644
> > > > > > > --- a/arch/powerpc/kernel/idle_book3s.S
> > > > > > > +++ b/arch/powerpc/kernel/idle_book3s.S
> > > > > > > @@ -120,6 +120,9 @@ power9_save_additional_sprs:
> > > > > > > mfspr r4, SPRN_MMCR2
> > > > > > > std r3, STOP_MMCR1(r13)
> > > > > > > std r4, STOP_MMCR2(r13)
> > > > > > > +
> > > > > > > + mfspr r3, SPRN_SPRG3
> > > > > > > + std r3, STOP_SPRG3(r13)
> > > > > >
> > > > > > We don't need to save it. Just restore it from paca->sprg_vdso
> > > > > > which
> > > > > > should
> > > > > > never change.
> > > > >
> > > > > Ok. I will respin a patch to restore SPRG3 from paca->sprg_vdso.
> > > > >
> > > > > >
> > > > > > How can we do better at catching these missing SPRGs?
> > > > >
> > > > > We can go through the list of SPRs from the POWER9 User Manual and
> > > > > document explicitly why we don't have to save/restore certain SPRs
> > > > > during the execution of the stop instruction. Does this sound ok ?
> > > > >
> > > > > (Ref: Table 4-8, Section 4.7.3.4 from the POWER9 User Manual
> > > > > accessible from
> > > > > https://openpowerfoundation.org/?resource_lib=power9-processor-users-m
> > > > > anua
> > > > > l)
> > > >
> > > > I was thinking of a boot time test case built into linux. linux has some
> > > > boot
> > > > time test cases which you can enable via CONFIG options.
> > > >
> > > > Firstly you could see if an SPR exists using the same trick xmon does in
> > > > dump_one_spr(). Then once you have a list of usable SPRs, you could
> > > > write
> > > > all
> > > > the known ones (I assume you'd have to leave out some, like the PSSCR),
> > > > then
> > > > set
> > >
> > > Write what value?
> > >
> > > Ideally you want to write a random bit pattern to reduce the chance
> > > that only some bits are being restored.
> >
> > The xmon dump_one_spr() trick tries to work around that by writing one
> > random
> > value and then a different one to see if it really is a nop.
> >
> > > But you can't do that because writing a value to an SPRs has an effect.
> >
> > Sure that's a concern but xmon seems to get away with it.
>
> I don't think it writes, but maybe I'm reading the code wrong.
You're right, sorry. It's the write the GPR that becomes a NOP when the SPR is
not there. I misremembered how it worked.
Maybe that won't work stop since we'd need to be able change the SPR value to
ensure we don't hit the reset value after a stop state.
We'd be able to detect SPRs that that change from it's reset value but not those
that are already at their reset value.
> Writing a random value to the MSR could be fun :)
Fortunately the MSR is not an SPR :-P
> >
> > Yeah, I'm not convinced it'll work either but it would be a nice piece of
> > test
> > infrastructure to have if it does work.
>
> Yeah I guess I'd rather we worked on 1) and 2) below first :)
ok
> > We'd still need to marry up the SPR numbers we get from the test to what's
> > actually being restored in Linux.
> >
> > > But there's a much simpler solution, we should 1) have a selftest for
> > > getcpu() and 2) we should be running the glibc (I think?) test suite
> > > that found this in the first place. It's frankly embarrassing that we
> > > didn't find this.
> >
> > Yeah, we should do that also, but how do we catch the next SPR we are
> > missing.
> > I'd like some systematic way of doing that rather than wack-a-mole.
>
> Whack-a-mole ðððð
I preferred waking them :-)
> We could also improve things by documenting how each SPR is handled, eg.
> is it saved/restored across idle, syscall, KVM etc. And possibly that
> could even become code that defines how SPRs are handled, rather than it
> all being done ad-hoc.
Yeah. It's complicated by linux calling opal_slw_set_reg() to change what's
saved. This was part of the reason I'd hoped doing a linux test case would help
as we could do it after those calls.
Mikey