Re: [GIT PULL] bitmap changes for v6.0-rc1

From: Yury Norov
Date: Mon Oct 10 2022 - 23:10:09 EST


On Mon, Oct 10, 2022 at 02:18:46PM -0700, Linus Torvalds wrote:
> On Mon, Oct 10, 2022 at 2:12 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Sun, Oct 2, 2022 at 7:03 PM Yury Norov <yury.norov@xxxxxxxxx> wrote:
> > >
> > > lib/cpumask: add FORCE_NR_CPUS config option
> >
> > Side note: I think we should make this option depend on EXPERT.

OK. I'll send a patch.

> Oh, and if a constant 'nr_cpu_ids' is such an important thing from a
> code generation standpoint, maybe we'd be better off saying 'for small
> NR_CPUS, just make nr_cpu_ids be that small constant".

From what I see, the effect of 'FORCE' on size is roughly 2.5x larger
for bigger NR_CPUS. For an x86_64 allmodconfig build (w/o MAXSMP) on top
of today's master:

NR_CPUS      Diff        Absolute values
                       FORCE=n     FORCE=y
   8192   -222429    373299266   373076837
    256   -221799    363679392   363457593
      4    -88922    363387049   363298127

> Rather than confusing people with a question that is a bit strange.
> Afaik, all our kernel code has to work with "nr_cpu_id's doesn't match
> the number of CPU's we actually have", because of cpu hotplug (and
> suspend/resume) issues anyway.

nr_cpu_ids is the number of possible CPUs. Once set, it never changes.
What the system has at any given moment is the set of online CPUs.
FORCE_NR_CPUS doesn't break hotplug because it only controls when we
set nr_cpu_ids: at boot time or at compile time.
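
To illustrate the boot-time vs compile-time distinction, here is a
minimal userspace sketch. It is not the kernel's code: every demo_*
name and the DEMO_FORCE_NR_CPUS switch are hypothetical stand-ins, and
NR_CPUS is fixed at 64 just for the example. The point is that the loop
bound is either a constant the compiler can fold or a variable set once
at init, while the online set can change independently of it.

```c
#include <assert.h>

#define NR_CPUS 64	/* compile-time upper bound, chosen for this example */

/* Hypothetical switch standing in for CONFIG_FORCE_NR_CPUS. */
#ifdef DEMO_FORCE_NR_CPUS
/* Compile-time constant: loop bounds and range checks can be folded. */
#define demo_nr_cpu_ids ((unsigned int)NR_CPUS)
#else
/* Boot-time value: written once during early init, read-only afterwards. */
static unsigned int demo_nr_cpu_ids = NR_CPUS;

static void demo_set_nr_cpu_ids(unsigned int possible)
{
	assert(possible >= 1 && possible <= NR_CPUS);
	demo_nr_cpu_ids = possible;
}
#endif

/*
 * Count online CPUs. The online flags may change at any time (hotplug),
 * but the upper bound of the loop stays the same in both variants.
 */
static unsigned int demo_count_online(const unsigned char *online)
{
	unsigned int cpu, n = 0;

	for (cpu = 0; cpu < demo_nr_cpu_ids; cpu++)
		n += online[cpu] ? 1 : 0;
	return n;
}
```

Flipping entries in the online array between calls models hotplug;
neither variant needs the bound to change for that to work.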

> So I _think_ all the nr_cpu_id's games are mainly because we don't
> want to allocate huge arrays when you can't actually have that many
> CPU's - but for a small NR_CPUS situation that just isn't an issue
> anyway.
>
> Yes? No?

You're talking about the .data section. FORCE_NR_CPUS is about .text
and code generation.

Compile-time allocations for cpumasks, per-cpu variables etc. shouldn't
change when we build the kernel with FORCE_NR_CPUS on and off against
the same NR_CPUS, right?

I originally expected that 'FORCE' would improve code generation for
NR_CPUS < BITS_PER_LONG because it allows the small_const_nbits()
optimization in cpumasks. But with big NR_CPUS it works even better.
I don't know exactly why. nr_cpu_ids is used widely beyond cpumasks,
after all.
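
For reference, a rough userspace sketch of the kind of fast path that
small_const_nbits() enables. The macro follows the shape of the kernel
one as I remember it; the demo_* helpers are hypothetical and only
mimic how a bitmap AND can collapse to a single word operation when the
bit count is a small compile-time constant. It relies on the GCC/Clang
__builtin_constant_p() builtin; without optimization the generic path
is taken, and the result is the same either way.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* True only for a compile-time constant bit count that fits in one word. */
#define small_const_nbits(nbits) \
	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)

/* Mask selecting the valid bits of the last word of an nbits-wide bitmap. */
#define LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

/* Generic path: walk every full word, then the partial last word. */
static int demo_bitmap_and_slow(unsigned long *dst, const unsigned long *a,
				const unsigned long *b, unsigned int nbits)
{
	unsigned int k, full = nbits / BITS_PER_LONG;
	unsigned long result = 0;

	for (k = 0; k < full; k++)
		result |= (dst[k] = a[k] & b[k]);
	if (nbits % BITS_PER_LONG)
		result |= (dst[k] = a[k] & b[k] & LAST_WORD_MASK(nbits));
	return result != 0;
}

/* Returns 1 if the intersection is non-empty, 0 otherwise. */
static inline int demo_bitmap_and(unsigned long *dst, const unsigned long *a,
				  const unsigned long *b, unsigned int nbits)
{
	assert(nbits > 0);
	/* Fast path: one AND and one mask, no loop, no function call. */
	if (small_const_nbits(nbits))
		return (*dst = *a & *b & LAST_WORD_MASK(nbits)) != 0;
	return demo_bitmap_and_slow(dst, a, b, nbits);
}
```

With nr_cpu_ids as a run-time variable, the bit count passed to such
helpers is never a constant, so the generic path is always emitted.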

Thanks,
Yury