Re: [PATCH] i386: do a global tlb flush in S4 resume

From: Rafael J. Wysocki
Date: Thu Mar 04 2010 - 14:47:15 EST


On Thursday 04 March 2010, Shaohua Li wrote:
> On Thu, Mar 04, 2010 at 10:30:02AM +0800, H. Peter Anvin wrote:
> > On 03/03/2010 05:23 PM, Shaohua Li wrote:
> > > Colin reported a strange oops in the S4 resume code path (see below). The test
> > > system has an i5/i7 CPU. The kernel doesn't enable PAE, so 4M pages are used.
> > > The oops always happens at virtual address 0xc03ff000, which is mapped to the
> > > last 4k of the first 4M of memory. Doing a global TLB flush fixes the issue.
> > >
> > > EIP: 0060:[<c0493a01>] EFLAGS: 00010086 CPU: 0
> > > EIP is at copy_loop+0xe/0x15
> > > EAX: 36aeb000 EBX: 00000000 ECX: 00000400 EDX: f55ad46c
> > > ESI: 0f800000 EDI: c03ff000 EBP: f67fbec4 ESP: f67fbea8
> > > DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> > > ...
> > > ...
> > > CR2: 00000000c03ff000
> > >
> > > Tested-by: Colin Ian King <colin.king@xxxxxxxxxxxxx>
> > > Signed-off-by: Shaohua Li <shaohua.li@xxxxxxxxx>
> > > ---
> > > arch/x86/power/hibernate_asm_32.S | 11 +++++++++++
> > > 1 files changed, 11 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/arch/x86/power/hibernate_asm_32.S b/arch/x86/power/hibernate_asm_32.S
> > > index b641388..9e4ef64 100644
> > > --- a/arch/x86/power/hibernate_asm_32.S
> > > +++ b/arch/x86/power/hibernate_asm_32.S
> > > @@ -27,10 +27,21 @@ ENTRY(swsusp_arch_suspend)
> > > ret
> > >
> > > ENTRY(restore_image)
> > > + movl mmu_cr4_features, %ecx
> > > movl resume_pg_dir, %eax
> > > subl $__PAGE_OFFSET, %eax
> > > movl %eax, %cr3
> > >
> > > + jecxz 1f # cr4 Pentium and higher, skip if zero
> > > + movl %ecx, %edx
> > > + andl $~(X86_CR4_PGE), %edx
> > > + movl %edx, %cr4; # turn off PGE
> > > +1:
> > > + movl %cr3, %eax; # flush TLB
> > > + movl %eax, %cr3
> > > + jecxz 1f # cr4 Pentium and higher, skip if zero
> > > + movl %ecx, %cr4; # turn PGE back on
> > > +1:
> > > movl restore_pblist, %edx
> > > .p2align 4,,7
> > >
> >
> > Since we're about to do another global page flush a bit further down in
> > the same code, why not just leave PGE off until then?
> Sure, here is the updated patch.
>
>
> i386: do a global tlb flush in S4 resume
>
> Colin reported a strange oops in the S4 resume code path (see below). The test
> system has an i5/i7 CPU. The kernel doesn't enable PAE, so 4M pages are used.
> The oops always happens at virtual address 0xc03ff000, which is mapped to the
> last 4k of the first 4M of memory. Doing a global TLB flush fixes the issue.
>
> EIP: 0060:[<c0493a01>] EFLAGS: 00010086 CPU: 0
> EIP is at copy_loop+0xe/0x15
> EAX: 36aeb000 EBX: 00000000 ECX: 00000400 EDX: f55ad46c
> ESI: 0f800000 EDI: c03ff000 EBP: f67fbec4 ESP: f67fbea8
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> ...
> ...
> CR2: 00000000c03ff000
>
> Tested-by: Colin Ian King <colin.king@xxxxxxxxxxxxx>
> Signed-off-by: Shaohua Li <shaohua.li@xxxxxxxxx>
>
> diff --git a/arch/x86/power/hibernate_asm_32.S b/arch/x86/power/hibernate_asm_32.S
> index b641388..cd5e878 100644
> --- a/arch/x86/power/hibernate_asm_32.S
> +++ b/arch/x86/power/hibernate_asm_32.S
> @@ -27,10 +27,17 @@ ENTRY(swsusp_arch_suspend)
> ret
>
> ENTRY(restore_image)
> + movl mmu_cr4_features, %ecx
> movl resume_pg_dir, %eax
> subl $__PAGE_OFFSET, %eax
> movl %eax, %cr3
>
> + jecxz 1f # cr4 Pentium and higher, skip if zero
> + andl $~(X86_CR4_PGE), %ecx
> + movl %ecx, %cr4; # turn off PGE
> + movl %cr3, %eax; # flush TLB
> + movl %eax, %cr3
> +1:
> movl restore_pblist, %edx
> .p2align 4,,7

In that case, please also remove the code that turns PGE off further down in
the same code path.

Rafael
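
A note on why the plain %cr3 reload in the original code was not enough: the
following is only a toy model in C, not kernel code. The names cpu_model,
tlb_insert, tlb_has, write_cr3 and write_cr4_pge are made up for illustration.
It assumes the architectural behaviour that a write to CR3 keeps TLB entries
marked global while CR4.PGE is set, and that changing CR4.PGE invalidates all
entries, global ones included.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SLOTS 8   /* arbitrary size for this toy model */

struct tlb_entry {
    bool valid;
    bool global;      /* models the G bit taken from the page-table entry */
    uint32_t vaddr;
    uint32_t paddr;
};

struct cpu_model {
    bool pge;         /* models CR4.PGE */
    struct tlb_entry tlb[TLB_SLOTS];
};

/* Cache a translation in the first free slot; false if the TLB is full. */
static bool tlb_insert(struct cpu_model *cpu, uint32_t vaddr,
                       uint32_t paddr, bool global)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (!cpu->tlb[i].valid) {
            cpu->tlb[i].valid = true;
            cpu->tlb[i].global = global;
            cpu->tlb[i].vaddr = vaddr;
            cpu->tlb[i].paddr = paddr;
            return true;
        }
    }
    return false;
}

/* True if a translation for vaddr is still cached. */
static bool tlb_has(const struct cpu_model *cpu, uint32_t vaddr)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (cpu->tlb[i].valid && cpu->tlb[i].vaddr == vaddr)
            return true;
    }
    return false;
}

/* Models "movl %eax, %cr3": global entries survive only while PGE is set. */
static void write_cr3(struct cpu_model *cpu)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        bool keep = cpu->pge && cpu->tlb[i].global;
        if (!keep)
            cpu->tlb[i].valid = false;
    }
}

/* Models a CR4 write: changing PGE drops every entry, global or not. */
static void write_cr4_pge(struct cpu_model *cpu, bool pge)
{
    if (cpu->pge != pge) {
        for (size_t i = 0; i < TLB_SLOTS; i++)
            cpu->tlb[i].valid = false;
    }
    cpu->pge = pge;
}

/* Walk through the two cases discussed in the thread. */
static void tlb_model_selfcheck(void)
{
    struct cpu_model cpu = { .pge = true };

    /* A global kernel mapping and a non-global one (addresses illustrative). */
    assert(tlb_insert(&cpu, 0xc03ff000u, 0x003ff000u, true));
    assert(tlb_insert(&cpu, 0x08048000u, 0x10000000u, false));

    /* Switching page tables with PGE on leaves the global entry stale. */
    write_cr3(&cpu);
    assert(tlb_has(&cpu, 0xc03ff000u));
    assert(!tlb_has(&cpu, 0x08048000u));

    /* Clearing PGE first, as the patch does, removes it as well. */
    write_cr4_pge(&cpu, false);
    write_cr3(&cpu);
    assert(!tlb_has(&cpu, 0xc03ff000u));
}
```

In this model the stale global entry only disappears once PGE is cleared,
which matches the observation that a global flush makes the oops go away.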