Re: [bisected] "sched: Allow per-cpu kernel threads to run on online && !active" causes warning
From: Michael Holzheu
Date: Wed Aug 17 2016 - 05:21:05 EST
On Wed, 17 Aug 2016 00:19:53 +0200,
Heiko Carstens <heiko.carstens@xxxxxxxxxx> wrote:
> On Tue, Aug 16, 2016 at 11:42:05AM -0400, Tejun Heo wrote:
> > Hello, Peter.
> >
> > On Tue, Aug 16, 2016 at 05:29:49PM +0200, Peter Zijlstra wrote:
> > > On Tue, Aug 16, 2016 at 11:20:27AM -0400, Tejun Heo wrote:
> > > > As long as the mapping doesn't change after the first onlining
> > > > of the CPU, the workqueue side shouldn't be too difficult to
> > > > fix up. I'll look into it. For memory allocations, as long as
> > > > the cpu <-> node mapping is established before any memory
> > > > allocation for the cpu takes place, it should be fine too, I
> > > > think.
> > >
> > > Don't we allocate per-cpu memory for 'cpu_possible_map' on boot?
> > > There's a whole bunch of per-cpu memory users that do things
> > > like:
> > >
> > >
> > > 	for_each_possible_cpu(cpu) {
> > > 		struct foo *foo = per_cpu_ptr(&per_cpu_var, cpu);
> > >
> > > 		/* muck with foo */
> > > 	}
> > >
> > >
> > > Which requires a cpu->node map for all possible cpus at boot time.
> >
> > Ah, right. If cpu -> node mapping is dynamic, there isn't much that
> > we can do about allocating per-cpu memory on the wrong node. And it
> > is problematic that percpu allocations can race against an onlining
> > CPU switching its node association.
> >
> > One way to keep the mapping stable would be reserving per-node
> > possible CPU slots so that the CPU number assigned to a new CPU is
> > on the right node. It'd be a simple solution but would get really
> > expensive with increasing number of nodes.
> >
> > Heiko, do you have any ideas?
>
> I think the easiest solution would be to simply assign all cpus, for
> which we do not have any topology information, to an arbitrary node;
> e.g. round robin.
>
> After all the case that cpus are added later is rare and the s390
> fake numa implementation does not know about the memory topology. All
> it is doing is distributing the memory to several nodes in order to
> avoid a single huge node. So that should be sort of ok.
>
> Unless somebody has a better idea?
>
> Michael, Martin?
If it is really required that cpu_to_node() can be called for
all possible cpus, this sounds like a reasonable workaround to me.
Michael
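
For illustration only, the round-robin idea discussed above could be
sketched in plain user-space C roughly as follows. This is not the s390
fake NUMA code: the NODE_UNKNOWN marker, the cpu_to_node_map array and
the function name are all hypothetical and exist only for this sketch.
It assumes the map is filled in once, before any per-cpu allocation
takes place, so that the node of a possible CPU never changes afterwards.

```c
#include <assert.h>
#include <stddef.h>

#define NODE_UNKNOWN (-1)	/* hypothetical marker: no topology info */

/*
 * Give every possible CPU without a node an arbitrary one, cycling
 * through the nodes in order. Entries that already carry a node are
 * left untouched, so an existing cpu -> node association stays stable.
 */
static void assign_unknown_cpus_round_robin(int *cpu_to_node_map,
					    size_t nr_cpus, int nr_nodes)
{
	int next_node = 0;
	size_t cpu;

	assert(nr_nodes > 0);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu_to_node_map[cpu] != NODE_UNKNOWN)
			continue;	/* topology already known */
		cpu_to_node_map[cpu] = next_node;
		next_node = (next_node + 1) % nr_nodes;
	}
}
```

With six possible CPUs, three nodes, and only CPUs 0 and 1 known (on
nodes 0 and 1), the remaining four CPUs would land on nodes 0, 1, 2, 0.
The placement is arbitrary with respect to real memory locality, which
matches the point made above that fake NUMA does not know the memory
topology anyway.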