Re: [-mm] warning during suspend [was: suspend race -mm regression]

From: Suresh Siddha
Date: Thu Sep 10 2009 - 20:01:11 EST


On Thu, 2009-09-10 at 13:57 -0700, Andrew Morton wrote:
> On Sat, 05 Sep 2009 22:41:37 +0800
> Xiao Guangrong <ericxiao.gr@xxxxxxxxx> wrote:
>
> > Jiri Slaby wrote:
> > > On 09/05/2009 12:36 AM, Jiri Slaby wrote:
> > >> On 09/05/2009 12:30 AM, Jiri Slaby wrote:
> > >>> WARNING: at kernel/smp.c:124 __generic_smp_call_function_interrupt+0xfd/0x110()
> > >>> Hardware name: To Be Filled By O.E.M.
> > >>> Modules linked in: nfs lockd auth_rpcgss sunrpc ath5k ath
> > >>> Pid: 3423, comm: pm-suspend Not tainted 2.6.31-rc8-mm1_64 #762
> > >>> Call Trace:
> > >>> [<ffffffff8103fc48>] warn_slowpath_common+0x78/0xb0
> > >>> [<ffffffff8103fc8f>] warn_slowpath_null+0xf/0x20
> > >>> [<ffffffff8106950d>] __generic_smp_call_function_interrupt+0xfd/0x110
> > >>> [<ffffffff8106956a>] hotplug_cfd+0x4a/0xa0
> > >>> [<ffffffff81434e47>] notifier_call_chain+0x47/0x90
> > >>> [<ffffffff8105b311>] raw_notifier_call_chain+0x11/0x20
> > >>> [<ffffffff8141ece0>] _cpu_down+0x150/0x2d0
> > >> It's the CPU_DEAD notifier:
> > >> ffffffff8141ecd0: 48 83 ce 07 or $0x7,%rsi
> > >> ffffffff8141ecd4: 48 c7 c7 08 ff 5d 81 mov $0xffffffff815dff08,%rdi
> > >> ffffffff8141ecdb: e8 20 c6 c3 ff callq ffffffff8105b300 <raw_notifier_call_chain>
> > >> ffffffff8141ece0: 3d 02 80 00 00 cmp $0x8002,%eax
> > >
> > > And it's due to:
> > > generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch
> > >
> >
> > I think there is a collision between my patch and the patch below:

Xiao, I am not sure that the race you are trying to fix here actually
exists. Doesn't the stop machine that we do as part of cpu down address
and avoid the race that you mention? Have you seen any real crashes or
hangs, or is it theoretical?

And even if the race exists (which I doubt), calling the interrupt
handler from the cpu down path looks like a hack.

Can you please elaborate on why we need this patch? Then we can think of
a cleaner solution if needed.

> >
> > Commit-ID: 269c861baa2fe7c114c3bc7831292758d29eb336
> > Gitweb: http://git.kernel.org/tip/269c861baa2fe7c114c3bc7831292758d29eb336
> > Author: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> > AuthorDate: Wed, 19 Aug 2009 18:05:35 -0700
> > Committer: H. Peter Anvin <hpa@xxxxxxxxx>
> > CommitDate: Fri, 21 Aug 2009 16:25:43 -0700
> >
> > generic-ipi: Allow cpus not yet online to call smp_call_function with irqs disabled
> >
> > My patch was merged into the -mm tree, but this patch was based on the -tip tree
> > later, which is why we have this problem.
> >
> > Suresh, what is your opinion?
> >
>
> Suresh appears to be hiding.

Not any more. I am back from vacation :(

thanks,
suresh
