Re: invalidate caches before going into suspend

From: Mark Langsdorf
Date: Wed Aug 13 2008 - 13:05:12 EST


On Wednesday 13 August 2008, Ingo Molnar wrote:
>
> * Mark Langsdorf <mark.langsdorf@xxxxxxx> wrote:
>
> > When a CPU core is shut down, all of its caches need to be flushed to
> > prevent stale data from causing errors if the core is resumed. Current
> > Linux suspend code performs an assignment after the flush, which can
> > add dirty data back to the cache. On some AMD platforms, additional
> > speculative reads have caused crashes on resume because of this dirty
> > data.
> >
> > Relocate the cache flush to be the very last thing done before
> > halting.
>
> nice catch! Applied to x86/urgent.
>
> I'm really curious: how did you find this bug? Did you see a CPU come up
> as !CPU_DEAD?

AMD's diagnostic code for new CPUs was hanging when coming out of suspend,
so I presume it was hitting a bug check for not !CPU_DEAD. I got the
debug lab reports second-hand. They traced the root cause to dirty data
being preserved in the cache and suggested relocating the wbinvd().
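
To illustrate the ordering problem, here is a toy userspace model. It is
not kernel code, and every name in it (toy_cache, toy_store, toy_flush,
toy_play_dead) is made up for this sketch. It only tracks which cache lines
are dirty, and assumes the store to cpu_state dirties exactly one line:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LINES 4

/* Toy write-back cache: only tracks which lines are dirty. */
struct toy_cache {
	bool dirty[TOY_LINES];
};

/* A store marks the line it touches as dirty. */
void toy_store(struct toy_cache *c, size_t line)
{
	assert(line < TOY_LINES);
	c->dirty[line] = true;
}

/* Stand-in for wbinvd(): after it, no line is dirty. */
void toy_flush(struct toy_cache *c)
{
	for (size_t i = 0; i < TOY_LINES; i++)
		c->dirty[i] = false;
}

/* Count the lines that are still dirty. */
size_t toy_dirty_count(const struct toy_cache *c)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_LINES; i++)
		n += c->dirty[i] ? 1 : 0;
	return n;
}

/*
 * Model the tail of play_dead(). Line 0 stands for the cache line that
 * holds cpu_state (an assumption of this sketch). Returns the number of
 * dirty lines left at the point where the core would halt.
 */
size_t toy_play_dead(bool flush_last)
{
	struct toy_cache c = { { false } };

	toy_store(&c, 1);		/* earlier work dirtied some line */
	if (!flush_last)
		toy_flush(&c);		/* old order: flush, then ack */
	toy_store(&c, 0);		/* cpu_state = CPU_DEAD */
	if (flush_last)
		toy_flush(&c);		/* patched order: ack, then flush */
	return toy_dirty_count(&c);
}
```

With the old ordering, toy_play_dead(false) leaves one dirty line behind at
halt time; with the flush moved last, toy_play_dead(true) leaves none.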

> please send a patch for the 32-bit side too, it has the same bug.
>
> also, we might be safer if the wbinvd(), the CLI and the halt were in a
> single assembly sequence:

> to make sure the compiler doesn't ever insert something into this
> codepath? [ And note the double cli which would be further
> robustification - in theory we could get a spurious interrupt straight
> after the wbinvd. ] Hm?

I don't think it's necessary, but I can submit a delta patch later if you
feel strongly about it.


Signed-off-by: Mark Langsdorf <mark.langsdorf@xxxxxxx>

diff -r 1e74a821dd00 arch/x86/kernel/process_32.c
--- a/arch/x86/kernel/process_32.c Tue Aug 12 12:04:12 2008 -0500
+++ b/arch/x86/kernel/process_32.c Wed Aug 13 06:40:00 2008 -0500
@@ -95,11 +95,11 @@ static inline void play_dead(void)
 {
 	/* This must be done before dead CPU ack */
 	cpu_exit_clear();
-	wbinvd();
 	mb();
 	/* Ack it */
 	__get_cpu_var(cpu_state) = CPU_DEAD;
 
+	wbinvd();
 	/*
 	 * With physical CPU hotplug, we should halt the cpu
 	 */



