Re: Hibernate resume bug around 3,18-rc2 - Full PAT support

From: Luis R. Rodriguez
Date: Tue Nov 24 2015 - 17:03:15 EST


On Tue, Nov 24, 2015 at 11:36:54AM +0200, vasvir@xxxxxxxxxxxxxxxxx wrote:
> > Let's try to speed up reproducing this.
> >
> > I have a hunch perhaps this might be related to some BIOS controlled
> > MTRRs and a mismatch which then enables the kernel to think that a type
> > of MTRR write might be OK, but in fact its not. Due to the work load
> > description of this perhaps this could be related to fan control and BIOS
> > control on them and against some other device MTRR. More on this suspicion
> > on another thread where you provide more logs.
> >
> > On a kernel that you know fails can you try replacing this work load by
> > making
> > you CPU crawl to its knees quickly, perhaps 'make -j' on Linux building
> > for 2,
> > 4, 8, 16, minutes and then hit CTRL C to continue to hibernation to see if
> > making the CPU fan trigger would accelerate the issue. If 'make -j' is
> > too nuts
> > to the point you can't even CTRL C it, try 'make -j 16' . Note that if
> > this is
> > true then that means a hot CPU could still trigger CPU fan controls on on
> > a
> > fresh boot if the previous boot was CPU intensive.
>
> OK that nailed it - with kernel 4.3 a known "bad" kernel I was able to
> reproduce it in the second hibernate/resume cycle.

Great, glad we could reduce the amount of time to reproduce to what seems
to be a few minutes now.

> Here is what I did in my own words so you can spot inconsistencies.
>
> I started a kernel compile with make -j 32. My computer was very
> responsive which is an impressive feat by the way.
> In a second tab in my Konsole (I am running KDE) I run $watch sensors. I
> watched the temperature of the cores to go from 38 to ~70 and the cpu fan
> from ~1630 to ~1900. Then the first time I hit Ctrl+C - stopped the
> compilation and hibernated from the KDE. I always hibernate from the KDE
> start menu. Previously I had made some tests where I was hibernating from
> the VT console (although sddm may was running in VT7) and I have managed
> to reproduce it - so (in my mind) it was not graphics mode specific. From
> that point I am always hibernating from KDE.

Come to think of it, the mtrr_add() and/or ioremap_wc() calls would be
triggered on driver initialization, that is on probe / boot time, so if this
issue you are running into is a clash of the BIOS's own notion of what is
set for an MTRR type and later another driver's desired MTRR desired type
(or equivalent PAT type) then the issue could be triggered just with
boot time / hibernation / resume time without much interaction at least
on the graphics front.

> The first time it worked. For the second time I thought - why to hit
> Ctrl+C let's try to hibernate with the compilation running - and it
> failed.

OK. How long did you leave the machine on idle before resuming?

Can you try on a fresh boot to bring up temperature to ~70 and while its still
compiling hibernate and see if that triggers it ? If we can reduce it to only
one hibernate that should reduce time to troubleshoot, it is also just puzzling
you'd need to hibernate twice to reproduce this issue.

> Now I don't know if it failed because it was the second cycle or
> because the load of the compilation was there or because of the
> temperature controlled fan register you mentioned.

If its fan related one test could be to hibertane on a fresh boot once fan
control is one, let it sit to cool, and then resume. Vs just resuming right
away. Ie: determine if we need fan control to be idle upon resume or not,
also how many times does fan control have to go on / off before you can
reproduce.

> Then I repeated the test with a known good kernel 3.18 (which should be
> 773fed910d41e443e495a6bfa9ab1c2b7b13e012 according to my git bisect logs -
> I have a problem there - see below) and it survived the same test
> (hibernate two times with temperature being ~70).
>
>
> > If this doesn't do it lets try forcing an MTRR capable driver, say
> > graphics is
> > the obvious target, try perhaps some 3D stuff or a screen saver prior to
> > hibernation. Note that even if you boot nomtrr the BIOS may still use
> > MTRRs,
> > and PAT use on Linux could assume MTRR is not being used on drivers but
> > the
> > BIOS may still do something behind the scenes. This is actually one reason
> > why
> > we can't exactly remove MTRR support from Linux, since the BIOS may still
> > do
> > some wacky stuff with MTRRs, one example of such I was given was CPU can
> > control might use WC MTRRs, so the kernel must be aware of this, even if
> > no
> > MTRRs are ever used on the Linux kernel at all -- this is the case now as
> > of
> > v4.3 and onwards.
> >
> > If that doesn't help speed it up , maybe try both screen saver + some 3D
> > stuff + cpu instensive stuff.
>
> I have 3D effects enabled in my KDE. Since your tip succeed to reproduce
> the problem early I didn't bother but If I should test 3D which program /
> benchmark should I run? glxgears?

As I mentioned above I can't think now of a reason why this should trigger
the issue if its mtrr related.

> > To help you speed up testing you can try reducing your build time by
> > reducing
> > the amount of crap you have to build:
> >
> > make localmodconfig
> >
> > That should only build things your kernel has loaded as modules or is
> > already
> > enabled (=y).
> >
>
> Thanks for the tip. I don't want to change that right now. I don't mind
> waiting a little bit because I a get a deb with the kernel and can retest
> a known configuration.

There is little risk in using it, you'll keep everything you had enabled as
built-in or modules. The only issue with this is if in between the commits
there was a kconfig symbol rename (driver rename), but I really don't
expect you to run into this as an issue.

> The other tip you gave if it actually works as it
> looks like working would give a great boost to the debugging cycle to
> actually make me the bottleneck.

Sure.

> > That is commit a023748d53c10850650fe86b1c4a7d421d576451
> > ("Merge branch 'x86-mm-for-linus' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")
> >
> > Git is smart enough to tell you you've hit a merge commit and that all the
> > possible commits on that merge could be the issue. This is why you bisect
> > log shows a slew of commits. The next step is to bisect through the merge
> > and then bisect through that, this will then let us identify the exact
> > commit
> > that may have caused the issue.
> >
> > There are a few ways to do this, my preferred way is to "unfold" a merge
> > commit manually.
> >
> > To help keep thing separately (without affecting other tests you might
> > have on your other git tree and to avoid having to force you to loose
> > fresh object as you continue to build test on the other tree), I'd do
> > something like this:
>
> we will go with your preferred way - no question about that.
>
> >
> > mkdir ~/tmp
> > git clone ~/linux/.git linux-dev-test
>
> ok I have them in paralled ~/path/linux ~/path/linux-dev-test
>
> >
> > cd linux-dev-test
> >
> > Notice how if you do git log and search for
> > a023748d53c10850650fe86b1c4a7d421d576451
> > you'll see that the commit listed before this is
> > 773fed910d41e443e495a6bfa9ab1c2b7b13e012
> > ("Merge branches 'x86-platform-for-linus' and 'x86-uv-for-linus' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")
> >
> > To be clear the list of commits you typically would see is just:
> >
> > a023748d53c10850650fe86b1c4a7d421d576451 - Merge branch 'x86-mm-for-linus'
> > of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > 773fed910d41e443e495a6bfa9ab1c2b7b13e012 - Merge branches
> > 'x86-platform-for-linus' and 'x86-uv-for-linus' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> >
> > We want to go down into the commits in the merge commit a023748d53c and
> > then zero out exactly which commit caused the issue. To do that on your
> > linux-dev-test directory you can do this:
>
> Thank you for the explanations. I thing I had understood that bit. git
> bisect visualized (gitk) helped me to grasp it. git log gave me a hard
> time with all these "hidden commits". Confirmation is good.
>
> >
> > git checkout -b test-merge-commit a023748d53c10850650fe86b1c4a7d421d576451
> >
> > That will create branch for testing based on the merge commit.
> > Now do this:
> >
> > git rebase -i 773fed910d41e443e495a6bfa9ab1c2b7b13e012
> >
> > Then don't pick any commit, just save and exit the editor, and then
> > git will actually "unfold" the merge commit for you -- it magically
> > will apply each commit in that merge commit linearly into your git
> > history.
> >
> > For instance the rebase should show 22 commits as follows, just
> > leave the defaults set as in bewlow and just hit (ESCT + :wq if
> > in vi):
> >
> > pick 96e70f832856 x86/mm: Avoid overlap the fixmap area on i386
> > pick 63e7b6d90c1e x86: mm: Re-use the early_ioremap fixed area
> > pick bdee237c0343 x86: mm: Use 2GB memory block size on large-memory
> > x86-64 systems
> > pick 281d4078bec3 x86: Make page cache mode a real type
> > pick c27ce0af896b x86: Use new cache mode type in include/asm/fb.h
> > pick 2d85ebf8e12e x86: Use new cache mode type in
> > drivers/video/fbdev/gbefb.c
> > pick 5006e45a6bc2 x86: Use new cache mode type in
> > drivers/video/fbdev/vermilion
> > pick 1c64216be164 x86: Use new cache mode type in arch/x86/pci
> > pick 2df58b6d3530 x86: Use new cache mode type in arch/x86/mm/init_64.c
> > pick d85f33342a0f x86: Use new cache mode type in asm/pgtable.h
> > pick 49a3b3cbdf16 x86: Use new cache mode type in mm/iomap_32.c

Everything below here should be tested given you say 49a3b3cbdf16 is good.

> > pick 2a3746984c98 x86: Use new cache mode type in track_pfn_remap() and
> > track_pfn_insert()
> > pick 102e19e1955d x86: Remove looking for setting of _PAGE_PAT_LARGE in
> > pageattr.c
> > pick c06814d8419a x86: Use new cache mode type in setting page attributes
> > pick b14097bd911c x86: Use new cache mode type in mm/ioremap.c
> > pick e00c8cc93c1a x86: Use new cache mode type in memtype related
> > functions
> > pick 87ad0b713b10 x86: Clean up pgtable_types.h
> > pick f439c429c320 x86: Support PAT bit in pagetable dump for lower levels
> > pick f5b2831d6541 x86: Respect PAT bit when copying pte values between
> > large and normal pages
> > pick bd809af16e3a x86: Enable PAT to use cache mode translation tables
> > pick 47591df50512 xen: Support Xen pv-domains using PAT
> > pick 0dbcae884779 x86: mm: Move PAT only functions to mm/pat.c
> >
>
> Ok I will do later tonight. But from my (git bisect) logs what I was
> expecting was
>
> # only skipped commits left to test
> # possible first bad commit: [a023748d53c10850650fe86b1c4a7d421d576451]
> Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> # possible first bad commit: [0dbcae884779fdf7e2239a97ac7488877f0693d9]
> x86: mm: Move PAT only functions to mm/pat.c
> # possible first bad commit: [47591df505129c9774af6cca2debf283a6e56ed7]
> xen: Support Xen pv-domains using PAT
> # possible first bad commit: [bd809af16e3ab1f8d55b3e2928c47c67e2a865d2]
> x86: Enable PAT to use cache mode translation tables
> # possible first bad commit: [f5b2831d654167d77da8afbef4d2584897b12d0c]
> x86: Respect PAT bit when copying pte values between large and normal
> pages
> # possible first bad commit: [f439c429c320981943f8b64b2a4049d946cb492b]
> x86: Support PAT bit in pagetable dump for lower levels
> # possible first bad commit: [87ad0b713b1034b6caf559976c35ce47f6d1d1e9]
> x86: Clean up pgtable_types.h
> # possible first bad commit: [e00c8cc93c1ac01ecd5049929a50fb47b62bb041]
> x86: Use new cache mode type in memtype related functions
> # possible first bad commit: [b14097bd911c2554b0b5271b3a6b2d84044d1843]
> x86: Use new cache mode type in mm/ioremap.c
> # possible first bad commit: [c06814d8419a74528500f85faf5fc01f67f8e7e6]
> x86: Use new cache mode type in setting page attributes
> # possible first bad commit: [102e19e1955d85f31475416b1ee22980c6462cf8]
> x86: Remove looking for setting of _PAGE_PAT_LARGE in pageattr.c
> # possible first bad commit: [2a3746984c98b17b565e6a2c2bbaaaef757db1b4]
> x86: Use new cache mode type in track_pfn_remap() and track_pfn_insert()

You are correct. Since you say 49a3b3cbdf16 was your first last good
commit then these are possible bad candidates.

> commit a023748d53c10850650fe86b1c4a7d421d576451 contains all the other
> commits listed below. The order is that newest is higher

Sure. All of those went int v3.19-rc1.

> Note that these commits listed above are untestable because the resulting
> kernels are not bootable. They hang in the second line of boot output in
> "Loading Ramdisk..." or something similar.
>
> my last good commit was 49a3b3cbdf1621678a39bd95a3e67c0f858539c7

That went in v3.19-rc1 as well.

> that
> means with git bisect I had already started zeroing in in
> a023748d53c10850650fe86b1c4a7d421d576451 since 49a3b3... was part of
> a0237...

This is correct. I missed that, thanks. That should reduce your bisect to
the above commits. We know that aa8f46878ab1a4a4e7b975b8fc8c398981e52986
("x86: mm: Move PAT only functions to mm/pat.c") was the last commit
part of the merge a023748d53c10850650fe86b1c4a7d421d576451, and we know
a023748d53c10850650fe86b1c4a7d421d576451 was a bad commit you can now
bisect on that tree between

x86: se new cache mode type in mm/iomap_32.c

and

x86: mm: Move PAT only functions to mm/pat.c

Note that rebasign will then change your commit IDs so the above
two named commits would appear differently on your tree so when
relaying information back to us just use the name if working
on the rebased tree.

Since you know the commit IDs now though you could also just
go back to your original tree and bisect between the two
commits now part of the same branch:

git bisect start a023748d53c10850650fe86b1c4a7d421d576451 49a3b3cbdf1621678a39bd95a3e67c0f858539c7

But for some reason git does a huge bisection here, I get 11 steps with
1814 revisions to test... we know there are only 12 revisions really
left to test though, for instance here are my commit IDs once I rebase
on the branch commit a023748d53c10850650fe86b1c4a7d421d576451 (as I said
notice how the commit IDs are now different):

5e9c2da70692 x86: Use new cache mode type in mm/iomap_32.c
e09f7c9da6b7 x86: Use new cache mode type in track_pfn_remap() and track_pfn_insert()
7077aded72a2 x86: Remove looking for setting of _PAGE_PAT_LARGE in pageattr.c
a019620d98ec x86: Use new cache mode type in setting page attributes
848159761245 x86: Use new cache mode type in mm/ioremap.c
06d664382eea x86: Use new cache mode type in memtype related functions
39ba0907d179 x86: Clean up pgtable_types.h
155c520125fe x86: Support PAT bit in pagetable dump for lower levels
f51279d0867f x86: Respect PAT bit when copying pte values between large and normal pages
ddbb181ad4ff x86: Enable PAT to use cache mode translation tables
7c67687de764 xen: Support Xen pv-domains using PAT
aa8f46878ab1 x86: mm: Move PAT only functions to mm/pat.c

So we really only need you test max 3 commits (log2 of 11 =~ 3).

So with my commit IDs I'd just do:

git bisect start aa8f46878ab1 5e9c2da70692

By rebasing on the commit prior to the merge commit this cuts down bisection it
down from 1814 revision to 5 revisions and from roughly 11 steps roughly 3
steps.

> So based on my git bisect so far my understanding is
> last good merge commit: 773fed910d41e443e495a6bfa9ab1c2b7b13e012

773fed910d41e443e495a6bfa9ab1c2b7b13e012 is the commit prior to the
merge commit, we have better information -- its just within the merge
commit so we need to trickle in there to look at it.

> last bad merge commit (next after 773fed...):
> a023748d53c10850650fe86b1c4a7d421d576451

That's the merge commit, but a merge commit is just fluff (meta data
to preserve annotations who how queued up code and then tossed it to
Linus later), we know the actual last commit that made code changes
was 0dbcae884779fdf7e2239a97ac7488877f0693d9 ("x86: mm: Move PAT
only functions to mm/pat.c") so we can use that.

> last good commit (inside a023748d53c10850650fe86b1c4a7d421d576451):
> 49a3b3cbdf1621678a39bd95a3e67c0f858539c7

That's much better, that zeroes inside the merge commit.

> all the others from 49a3b3cbdf1... to a023748d53c1... are
> untestable/unbootable kernels.

To be clear 49a3b3cbdf1 is bootable as its your last good
commit. If you are saying that after that and up to the
last commit of the merge commit (0dbcae884779) things are
not bootable that's a big issue indeed to help bisect this
further. If you can boot but it "hangs" on hibernate on
the merge commit a023748d53c1 I would suspect you should
at least be able to boot into the last commit of the merge
0dbcae884779f, can you confirm?

To be clear here are the list of commits we are reviewing:

a023748d53c1 - merge commit
0dbcae884779f - last commit of the merge commit
49a3b3cbdf162 - first good commit in the merge

Putting names on these:

a023748d53c1 - Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
0dbcae884779f - x86: mm: Move PAT only functions to mm/pat.c
49a3b3cbdf162 - x86: Use new cache mode type in mm/iomap_32.c

The actual list then of things we need you to find out what caused the issue:

git log --oneline 49a3b3cbdf162^1..0dbcae884779f
0dbcae884779 x86: mm: Move PAT only functions to mm/pat.c
47591df50512 xen: Support Xen pv-domains using PAT
bd809af16e3a x86: Enable PAT to use cache mode translation tables
f5b2831d6541 x86: Respect PAT bit when copying pte values between large and normal pages
f439c429c320 x86: Support PAT bit in pagetable dump for lower levels
87ad0b713b10 x86: Clean up pgtable_types.h
e00c8cc93c1a x86: Use new cache mode type in memtype related functions
b14097bd911c x86: Use new cache mode type in mm/ioremap.c
c06814d8419a x86: Use new cache mode type in setting page attributes
102e19e1955d x86: Remove looking for setting of _PAGE_PAT_LARGE in pageattr.c
2a3746984c98 x86: Use new cache mode type in track_pfn_remap() and track_pfn_insert()
49a3b3cbdf16 x86: Use new cache mode type in mm/iomap_32.c

Juergen is the author of all of these except 0dbcae884779 which is just a code
shift, so it sholud not affect run time for you.

> Please correct me if I am wrong - it will help me build the correct mental
> model.

Hope this helps. If there were issues with getting to boot some of the other
commits obviously some of the other commits fixed an issue as the merge
commit seems bootable -- so perhaps one of these commits is important to fix
the bootable issue you noted. Since Juergen is the author of all of the
relevant patches and he's been active on this thread I am confident we
should be able to get you a bootable kernel so you can help complete the
bisection.

Luis

> > You should see:

> > Successfully rebased and updated refs/heads/test-merge-commit.
> >
> > Now if you do git log you will see the above commits in linear
> > atomic history. You can now bisect this merge commit atomically, so do:
> >
> > git bisect 099487de0934e3d5e326666914a426af89a0868b
> > 773fed910d41e443e495a6bfa9ab1c2b7b13e012
> >
> > Note that this assumes that the commit prior to the merge commit is fine.
> > Is this true, can you confirm? (git checkout -b test-prior-merge-gtest
> > 773fed910d4,
> > build and see if it doesn't break there)
> >
> > If we know for sure 773fed910d4 did not break anything then the above
> > bisect
> > should let us zero in on the exact atomic commit ID that caused the issue.
> >
>
> Now the problem is that I tried twice to verify that
> 773fed910d41e443e495a6bfa9ab1c2b7b13e012 is indeed a good commit and I
> ended up with an unbootable kernel (hangs in "Loading Ramdisk..."). This
> is very disappointing and means that all my bisections so far are invalid.
> Very disappointing indeed but it's only a setback. I will figure it out
> and will make sure I have a valid setup for reproducible tests before I
> bother you again.
>
> Just for the record I did
> $git checkout 773fed910d41e443e495a6bfa9ab1c2b7b13e012
> $fakeroot make -j 4 CC=gcc-4.8 deb-pkg
>
> I will do as you suggest with the unfold of commits - but if my bisection
> was right (serious hints to the opposite exist) I stopped on
> unbootable/untestable kernels
>
> Thanks for the exhaustive mails with the explanations and the tips. They
> are much appreciated.
>
> Vassilis
>
>
>

--
Luis Rodriguez, SUSE LINUX GmbH
Maxfeldstrasse 5; D-90409 Nuernberg
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/