Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask

From: Nick Piggin
Date: Mon Sep 08 2008 - 06:31:14 EST


On Sunday 07 September 2008 04:12, Mike Travis wrote:
> Ingo Molnar wrote:
> > * Mike Travis <travis@xxxxxxx> wrote:
> >> * Cleanup cpumask_t usages in smp_call_function_mask function chain
> >> to prevent stack overflow problem when NR_CPUS=4096.
> >>
> >> * Reduce the number of passed cpumask_t variables in the following
> >> call chain for x86_64:
> >>
> >> smp_call_function_mask -->
> >> arch_send_call_function_ipi->
> >> smp_ops.send_call_func_ipi -->
> >> genapic->send_IPI_mask
> >>
> >> Since smp_call_function_mask() is an EXPORTED function, we
> >> cannot change its calling interface for a patch to 2.6.27.
> >>
> >> The smp_ops.send_call_func_ipi interface is internal only and
> >> has two arch provided functions:
> >>
> >> arch/x86/kernel/smp.c: .send_call_func_ipi = native_send_call_func_ipi
> >> arch/x86/xen/smp.c: .send_call_func_ipi = xen_smp_send_call_function_ipi
> >> arch/x86/mach-voyager/voyager_smp.c: (uses native_send_call_func_ipi)
> >>
> >> Therefore modifying the internal interface to use a cpumask_t
> >> pointer is straightforward.
> >>
> >> The changes to genapic are much more extensive and are affected by
> >> the recent additions of the x2apic modes, so they will be done for
> >> 2.6.28 only.
> >>
> >> Based on 2.6.27-rc5-git6.
> >>
> >> Applies to linux-2.6.tip/master (with FUZZ).
> >
> > applied to tip/cpus4096, thanks Mike.
>
> Thanks Ingo! Could you send me the git id for the merge?
>
> > I'm still wondering whether we should get rid of non-reference based
> > cpumask_t altogether ...
>
> I've got a whole slew of "get-ready-to-remove-cpumask_t's" coming soon.
> There are two phases, one completely within the x86 arch and the 2nd hits
> the generic smp_call_function_mask ABI (won't be doable as a back-ported
> patch to 2.6.27.)
>
> > Did you have a chance to look at the ftrace/stacktrace tracer in latest
> > tip/master, which will show the maximum stack footprint that can occur?
>
> Hmm, no. I'm using a default config right now as I can boot that pretty
> easily. I'll turn on the ftrace thing and check it out.
>
> > Also, i've applied the patch below as well to restore MAXSMP in a muted
> > form - with big warning signs added as well.
>
> The main thing is to allow the distros to set it manually for their QA
> testing of 2.6.27. I'm sure I'll get back bugs because of just that.
>
> (Is there a way to have them know to assign bugzillas to me if NR_CPUS=4k
> is the root of the problem? This is an extremely serious issue for SGI
> and I'd like to avoid any delays in me finding out about problems.)

Unless I'm mistaken, you want to run production systems with 4096 CPUs at
some point. In that case you should really consider increasing NR_CPUS
_further_ than that in QA efforts, so that we can be a bit more confident
of running production kernels with 4096.

Is that being tried? Setting it to 8192 or even higher during QA seems
like a good idea to me.
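
To put rough numbers on why a larger NR_CPUS stresses the stack, here is a
minimal userspace sketch, not kernel code. It assumes a mask that stores one
bit per possible CPU in an array of unsigned long, which is how I understand
cpumask_t to be laid out. The helper names cpumask_bytes() and
chain_stack_bytes() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits held by one unsigned long on the build machine. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: bytes needed for a bitmap with one bit per CPU,
 * rounded up to a whole number of unsigned longs.
 */
static size_t cpumask_bytes(size_t nr_cpus)
{
    size_t words = (nr_cpus + BITS_PER_ULONG - 1) / BITS_PER_ULONG;

    return words * sizeof(unsigned long);
}

/*
 * Hypothetical helper: stack bytes used when each of `depth` nested
 * calls keeps `copies` by-value masks in its frame. Passing a pointer
 * instead costs only sizeof(void *) per call.
 */
static size_t chain_stack_bytes(size_t nr_cpus, size_t depth, size_t copies)
{
    return cpumask_bytes(nr_cpus) * depth * copies;
}

/* Worked numbers for the configurations discussed in this thread. */
static void check_examples(void)
{
    assert(cpumask_bytes(4096) == 512);   /* 4096 bits / 8 */
    assert(cpumask_bytes(8192) == 1024);  /* doubles with NR_CPUS */
    /* Four-level chain, one by-value mask per level, NR_CPUS=4096. */
    assert(chain_stack_bytes(4096, 4, 1) == 2048);
}
```

Under those assumptions, doubling NR_CPUS to 8192 doubles the cost of every
by-value mask, so any remaining on-stack cpumask_t should show up sooner in
QA than it would at 4096.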