Re: [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine

From: Rafael J. Wysocki
Date: Mon Nov 10 2008 - 17:50:58 EST


On Monday, 10 of November 2008, Rafael J. Wysocki wrote:
> On Monday, 10 of November 2008, Heiko Carstens wrote:
> > On Sun, Nov 09, 2008 at 06:59:16PM +0100, Rafael J. Wysocki wrote:
> > > This message has been generated automatically as a part of a report
> > > of recent regressions.
> > >
> > > The following bug entry is on the current list of known regressions
> > > from 2.6.27. Please verify if it still should be listed and let me know
> > > (either way).
> > >
> > >
> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11989
> > > Subject : Suspend failure on NForce4-based boards due to changes in stop_machine
> > > Submitter : Rafael J. Wysocki <rjw@xxxxxxx>
> > > Date : 2008-11-03 0:28 (7 days old)
> > > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc
> > > References : http://marc.info/?l=linux-kernel&m=122567187604356&w=4
> >
> > Hi Rafael,
>
> Hi,
>
> > could you provide more information on this, please?
> >
> > What is your kernel configuration?
>
> Available at: http://www.sisk.pl/kernel/debug/mainline/2.6.28-rc3/kitty-config
>
> > Do you have any binary only modules (nvidia?) loaded?
>
> No, I don't.
>
> > Is it possible to recreate the bug by e.g. just doing something like
> >
> > echo 0 > /sys/devices/system/cpu/cpu1/online
>
> I haven't checked (yet), I'll do that later today and let you know.
>
> > (or any other online cpu)? Or does it trigger any lockdep warnings?

It cannot be reproduced by offlining CPU1, and it doesn't trigger any
warnings from lockdep.

However, it is reproducible by doing

# echo core > /sys/power/pm_test

and repeating

# echo disk > /sys/power/state

a couple of times, in which case the last two lines printed to the console
before a (solid) hang are:

SMP alternatives: switching to SMP code
Booting processor 1 APIC 0x1 ip 0x6000

So, it evidently fails while re-enabling the non-boot CPU, and not while
disabling it, as I thought before.

With commit c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc reverted, the issue is
no longer reproducible.
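
For anyone who wants to script the reproduction steps above, a minimal sketch
follows. The function name repro_hang, the default of 10 iterations and the
optional directory argument are assumptions made for illustration and are not
part of the original report. It has to be run as root on the affected machine.

```shell
#!/bin/sh
# Sketch only: repro_hang and its arguments are illustrative names,
# not part of the original report.

# usage: repro_hang [iterations] [pm_dir]
repro_hang() {
    iterations="${1:-10}"
    pm_dir="${2:-/sys/power}"

    # Select the "core" test level before starting the cycles.
    echo core > "$pm_dir/pm_test" || return 1

    i=1
    while [ "$i" -le "$iterations" ]; do
        echo "cycle $i of $iterations"
        # On an affected kernel one of these writes is expected to hang.
        echo disk > "$pm_dir/state" || return 1
        i=$((i + 1))
    done
}
```

Calling "repro_hang 20" repeats the cycle up to 20 times. If the hang occurs,
the last "cycle" line printed shows how many attempts it took.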

Thanks,
Rafael