Re: Hibernate resume bug around 3.18-rc2 - Full PAT support

From: Juergen Gross
Date: Thu Nov 19 2015 - 00:39:36 EST


On 18/11/15 22:43, Vassilis Virvilis wrote:
> Hi,
>
> I have been hit by a hibernate/resume bug. Other people may have too:
> The following links are consistent with my observations
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1490494
> https://bugs.archlinux.org/task/44807
>
> Some observations:
> 1) The first few rapid hibernation / resume cycles do not fail.
>
> 2) A loaded computer (eclipse + chromium + firefox/iceweasel +
> thunderbird/icedove + Konsole) helps to reproduce the lockup during resume.
>
> 3) Long hibernation times (overnight) help to reproduce the lockup
> during resume.
>
> 4) For the bad commits (where the lockup during resume takes place) -
> the image loading during resume is significantly faster. It is fast and
> then it locks.
>
> How I hit the problem and what I have done:
>
> I am running debian unstable
>
> Debian went from 3.16 to 3.19 - hence the problem raised its ugly head.
> I upgraded diligently up to 4.2.6 - The problem persists

Could you please try the most recent 4.3 kernel? There has been some
work related to this topic after 4.2 (large page PAT handling done by
Toshi Kani and MTRR/PAT handling by Luis Rodriguez).

Another interesting piece of information would be the exact hardware
you are using. Maybe we can see some similarities between yours and the
other two cases you referenced above.
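
Part of that hardware information could be collected with a small script
like the one below. This is only a sketch, assuming an x86 Linux system
where /proc/cpuinfo has a "flags" line; the helper names cpu_flags and
report are made up for illustration, and the choice of flags (pat, mtrr,
pse) is an assumption about what is relevant here, not a complete
hardware description.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report(cpuinfo_text, wanted=("pat", "mtrr", "pse")):
    """Map each flag of interest to True/False depending on its presence."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        # Not on Linux, or /proc is unavailable: report everything as absent.
        text = ""
    print(report(text))
```

The output of this together with the CPU model name and the graphics
hardware would make it easier to compare the affected machines.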

> I added no_console_suspend initcall_debug to the kernel command line -
> see attached image of the lockup.
>
> I added drm.debug=0xe but it didn't produce anything interesting (ok, I
> know, who am I to judge?) and the runs did not have it so I took it out
> again.
>
> I reproduced with hibernating and resuming back to KDE and/or back to
> text console.
>
> I switched to the VGA console and the resume problem persists.
>
> I started kernel bisection from 3.16 to 3.19 following
> https://wiki.debian.org/DebianKernel/GitBisect
>
> One month and 25 kernels later see below for the bisect log

Wow! Thanks for doing this work!


Juergen

>
> I hit some untestable kernels that weren't booting. They were hanging at
> "Loading ramdisk..." before any actual kernel message.
>
> Looks like the first bad / untestable commit is from Juergen Gross /
> Thomas Gleixner Merge branch 'x86-mm-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip [full PAT support]
>
> Full disclaimer: I may have fucked up the bisection. Finding bad commits
> was semi easy - finding good commits needs a run time of 2-3 days.
>
> I would really appreciate some help and directions to nail this down.
>
>
> Regards
>
> Vassilis Virvilis
>
>
>
> bill@localhost:~/Downloads/linux$ git bisect log
> git bisect start
> # good: [19583ca584d6f574384e17fe7613dfaeadcdc4a6] Linux 3.16
> git bisect good 19583ca584d6f574384e17fe7613dfaeadcdc4a6
> # bad: [bfa76d49576599a4b9f9b7a71f23d73d6dcff735] Linux 3.19
> git bisect bad bfa76d49576599a4b9f9b7a71f23d73d6dcff735
> # good: [754c780953397dd5ee5191b7b3ca67e09088ce7a] Merge branch
> 'for-v3.18' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
> git bisect good 754c780953397dd5ee5191b7b3ca67e09088ce7a
> # bad: [7ef58b32f571bffb7763c6252ad7527562081f34] Merge tag
> 'devicetree-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux
> git bisect bad 7ef58b32f571bffb7763c6252ad7527562081f34
> # good: [53429290a054b30e4683297409fc4627b2592315] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
> git bisect good 53429290a054b30e4683297409fc4627b2592315
> # good: [3a647c1d7ab08145cee4b650f5e797d168846c51] Merge tag
> 'drivers-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
> git bisect good 3a647c1d7ab08145cee4b650f5e797d168846c51
> # bad: [1366f5d3129f2abde606214de7afc3dd61781fa3] Merge branch
> 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
> git bisect bad 1366f5d3129f2abde606214de7afc3dd61781fa3
> # good: [151cd97630f87451cab412e40750d0e5f7581c98] Merge tag
> 'defconfig-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
> git bisect good 151cd97630f87451cab412e40750d0e5f7581c98
> # good: [ecb50f0afd35a51ef487e8a54b976052eb03d729] Merge branch
> 'irq-core-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good ecb50f0afd35a51ef487e8a54b976052eb03d729
> # bad: [3a5dc1fafb016560315fe45bb4ef8bde259dd1bc] Merge branch
> 'x86-microcode-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect bad 3a5dc1fafb016560315fe45bb4ef8bde259dd1bc
> # good: [b6444bd0a18eb47343e16749ce80a6ebd521f124] Merge branch
> 'x86-boot-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good b6444bd0a18eb47343e16749ce80a6ebd521f124
> # bad: [a023748d53c10850650fe86b1c4a7d421d576451] Merge branch
> 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect bad a023748d53c10850650fe86b1c4a7d421d576451
> # good: [773fed910d41e443e495a6bfa9ab1c2b7b13e012] Merge branches
> 'x86-platform-for-linus' and 'x86-uv-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good 773fed910d41e443e495a6bfa9ab1c2b7b13e012
> # good: [49a3b3cbdf1621678a39bd95a3e67c0f858539c7] x86: Use new cache
> mode type in mm/iomap_32.c
> git bisect good 49a3b3cbdf1621678a39bd95a3e67c0f858539c7
> # skip: [87ad0b713b1034b6caf559976c35ce47f6d1d1e9] x86: Clean up
> pgtable_types.h
> git bisect skip 87ad0b713b1034b6caf559976c35ce47f6d1d1e9
> # skip: [c06814d8419a74528500f85faf5fc01f67f8e7e6] x86: Use new cache
> mode type in setting page attributes
> git bisect skip c06814d8419a74528500f85faf5fc01f67f8e7e6
> # skip: [e00c8cc93c1ac01ecd5049929a50fb47b62bb041] x86: Use new cache
> mode type in memtype related functions
> git bisect skip e00c8cc93c1ac01ecd5049929a50fb47b62bb041
> # skip: [bd809af16e3ab1f8d55b3e2928c47c67e2a865d2] x86: Enable PAT to
> use cache mode translation tables
> git bisect skip bd809af16e3ab1f8d55b3e2928c47c67e2a865d2
> # skip: [f439c429c320981943f8b64b2a4049d946cb492b] x86: Support PAT bit
> in pagetable dump for lower levels
> git bisect skip f439c429c320981943f8b64b2a4049d946cb492b
> # skip: [47591df505129c9774af6cca2debf283a6e56ed7] xen: Support Xen
> pv-domains using PAT
> git bisect skip 47591df505129c9774af6cca2debf283a6e56ed7
> # skip: [b14097bd911c2554b0b5271b3a6b2d84044d1843] x86: Use new cache
> mode type in mm/ioremap.c
> git bisect skip b14097bd911c2554b0b5271b3a6b2d84044d1843
> # skip: [102e19e1955d85f31475416b1ee22980c6462cf8] x86: Remove looking
> for setting of _PAGE_PAT_LARGE in pageattr.c
> git bisect skip 102e19e1955d85f31475416b1ee22980c6462cf8
> # skip: [f5b2831d654167d77da8afbef4d2584897b12d0c] x86: Respect PAT bit
> when copying pte values between large and normal pages
> git bisect skip f5b2831d654167d77da8afbef4d2584897b12d0c
> # skip: [0dbcae884779fdf7e2239a97ac7488877f0693d9] x86: mm: Move PAT
> only functions to mm/pat.c
> git bisect skip 0dbcae884779fdf7e2239a97ac7488877f0693d9
> # skip: [2a3746984c98b17b565e6a2c2bbaaaef757db1b4] x86: Use new cache
> mode type in track_pfn_remap() and track_pfn_insert()
> git bisect skip 2a3746984c98b17b565e6a2c2bbaaaef757db1b4
> # only skipped commits left to test
> # possible first bad commit: [a023748d53c10850650fe86b1c4a7d421d576451]
> Merge branch 'x86-mm-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> # possible first bad commit: [0dbcae884779fdf7e2239a97ac7488877f0693d9]
> x86: mm: Move PAT only functions to mm/pat.c
> # possible first bad commit: [47591df505129c9774af6cca2debf283a6e56ed7]
> xen: Support Xen pv-domains using PAT
> # possible first bad commit: [bd809af16e3ab1f8d55b3e2928c47c67e2a865d2]
> x86: Enable PAT to use cache mode translation tables
> # possible first bad commit: [f5b2831d654167d77da8afbef4d2584897b12d0c]
> x86: Respect PAT bit when copying pte values between large and normal pages
> # possible first bad commit: [f439c429c320981943f8b64b2a4049d946cb492b]
> x86: Support PAT bit in pagetable dump for lower levels
> # possible first bad commit: [87ad0b713b1034b6caf559976c35ce47f6d1d1e9]
> x86: Clean up pgtable_types.h
> # possible first bad commit: [e00c8cc93c1ac01ecd5049929a50fb47b62bb041]
> x86: Use new cache mode type in memtype related functions
> # possible first bad commit: [b14097bd911c2554b0b5271b3a6b2d84044d1843]
> x86: Use new cache mode type in mm/ioremap.c
> # possible first bad commit: [c06814d8419a74528500f85faf5fc01f67f8e7e6]
> x86: Use new cache mode type in setting page attributes
> # possible first bad commit: [102e19e1955d85f31475416b1ee22980c6462cf8]
> x86: Remove looking for setting of _PAGE_PAT_LARGE in pageattr.c
> # possible first bad commit: [2a3746984c98b17b565e6a2c2bbaaaef757db1b4]
> x86: Use new cache mode type in track_pfn_remap() and track_pfn_insert()
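
To get a quick overview of a log like the one quoted above, the verdicts
can be tallied with a few lines of Python. This is a minimal sketch, not
part of the original report: the function name summarize_bisect_log is
hypothetical, and it assumes each verdict sits on a single line of the
form "git bisect good|bad|skip <40-hex-sha>", optionally prefixed by
mail quote markers.

```python
import re
from collections import Counter

# Matches verdict lines from `git bisect log`, tolerating "> " quote prefixes.
VERDICT = re.compile(r"^(?:>\s*)*git bisect (good|bad|skip) ([0-9a-f]{40})\s*$")


def summarize_bisect_log(text):
    """Return (counts per verdict, mapping of commit sha -> verdict)."""
    counts = Counter()
    verdicts = {}
    for line in text.splitlines():
        match = VERDICT.match(line.strip())
        if match:
            verdict, sha = match.groups()
            counts[verdict] += 1
            verdicts[sha] = verdict
    return counts, verdicts


# A short excerpt of the quoted log, used as sample input.
SAMPLE = """\
> git bisect start
> # good: [19583ca584d6f574384e17fe7613dfaeadcdc4a6] Linux 3.16
> git bisect good 19583ca584d6f574384e17fe7613dfaeadcdc4a6
> # bad: [bfa76d49576599a4b9f9b7a71f23d73d6dcff735] Linux 3.19
> git bisect bad bfa76d49576599a4b9f9b7a71f23d73d6dcff735
> # skip: [87ad0b713b1034b6caf559976c35ce47f6d1d1e9] x86: Clean up
> pgtable_types.h
> git bisect skip 87ad0b713b1034b6caf559976c35ce47f6d1d1e9
"""

if __name__ == "__main__":
    counts, _ = summarize_bisect_log(SAMPLE)
    print(dict(counts))
```

Run over the complete log, such a tally makes it obvious how much of the
x86-mm series was skipped rather than tested, which is why git could
only report a list of possible first bad commits.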
