Re: [PATCH 4/5] powernv:idle: Move initialization of sibling pacas to pnv_alloc_idle_core_states

From: Nicholas Piggin
Date: Sat Jul 08 2017 - 05:00:50 EST


On Fri, 7 Jul 2017 20:34:16 +0530
Gautham R Shenoy <ego@xxxxxxxxxxxxxxxxxx> wrote:

> On Fri, Jul 07, 2017 at 01:16:09AM +1000, Nicholas Piggin wrote:
> > On Wed, 5 Jul 2017 22:08:15 +0530
> > "Gautham R. Shenoy" <ego@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > From: "Gautham R. Shenoy" <ego@xxxxxxxxxxxxxxxxxx>
> > >
> > > On POWER9 DD1, in order to get around a hardware issue, we store in
> > > every CPU thread's paca the paca pointers of all its siblings.
> > >
> > > Move this code into pnv_alloc_idle_core_states() soon after the space
> > > for saving the sibling pacas is allocated.
> > >
> > > Signed-off-by: Gautham R. Shenoy <ego@xxxxxxxxxxxxxxxxxx>
> >
> > > - if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
> > > - int cpu;
> > > -
> > > - pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n");
> > > - for_each_possible_cpu(cpu) {
> > > - int base_cpu = cpu_first_thread_sibling(cpu);
> > > - int idx = cpu_thread_in_core(cpu);
> > > - int i;
> > > -
> >
> > You could move the thread_sibling_pacas allocation to here?
> >
> > Speaking of which... core_idle_state and thread_sibling_pacas are
> > allocated with kmalloc_node... What happens if we take an SLB miss
> > in the idle wakeup code on these guys? Nothing good I think. Perhaps
> > we should put them into the pacas or somewhere in bolted memory.
>
> Yes, though the SLB miss hasn't yet been encountered in practise so
> far!

Considering it's a node-affine allocation, it may actually be possible
to hit in practice on very large memory systems in practice.

> While one can define thread_sibling_pacas in PACA, it doesn't make
> sense to allocate space for core_idle_state in PACA since the
> allocated value of the secondary threads will never be used.

Well, same for core_idle_state, although that's smaller.

> What is the right way to ensure that these allocations fall in the
> bolted range ?

I'm not sure, I guess the memblock allocator is not up anymore at this
point. I think we'd have to move it earlier. You could allocate another
array of them along side the paca allocation.

Thanks,
Nick