Re: Crash in MM code in v4.4.y, v4.9.y with TRANSPARENT_HUGEPAGE enabled

From: Michal Hocko
Date: Mon Aug 20 2018 - 15:12:58 EST


On Mon 20-08-18 11:03:53, Andi Kleen wrote:
> On Mon, Aug 20, 2018 at 06:29:38PM +0200, Michal Hocko wrote:
> > On Fri 17-08-18 15:27:33, Guenter Roeck wrote:
> > > Hi,
> > >
> > > the following crash is seen in v4.4.148, v4.4.149, v4.9.120, and v4.9.121
> > > with CONFIG_TRANSPARENT_HUGEPAGE=y, CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y.
> >
> > Could you try to apply fd7e315988b7 ("x86/mm: Simplify p[g4um]d_page()
> > macros"). I do not see it in stable 4.4 tree and it has been introduced
> > much later in 4.14. This one gave us quite some headache because it is
> > soooo easy to overlook.
>
> Good catch!
>
> I tested that with 4.9 and backporting the patch indeed fixes the
> syzkaller test case running in a KVM VM. Backported patch appended.
>
> Should probably go into 4.4 and 4.9.
>
> Cannot explain the 4.17 report unfortunately.

I haven't seen that one yet and likely won't get to it tomorrow either,
but I would start by looking for direct pte_val usage. We have had some
out-of-tree Xen code which was doing exactly this. It is not really easy
to find by code inspection.
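
To illustrate why direct pte_val usage is so easy to overlook, here is a
minimal userspace sketch. It is not kernel code: every toy_* name, the
mask, and the encoding are hypothetical. It assumes, purely for
illustration, that non-present entries store their PFN bits inverted, so
only the accessor knows how to decode them. The thread does not spell out
the actual mechanism.

```c
#include <stdint.h>

/* Toy model only: names and constants are illustrative, not the kernel's. */
#define TOY_PAGE_SHIFT 12
#define TOY_PRESENT    0x1ULL
#define TOY_PFN_MASK   0x000ffffffffff000ULL

typedef struct { uint64_t val; } toy_pte_t;

/* Assumed encoding: a non-present entry keeps its PFN bits inverted. */
static toy_pte_t toy_mk_pte(uint64_t pfn, int present)
{
    toy_pte_t pte;
    uint64_t v = (pfn << TOY_PAGE_SHIFT) & TOY_PFN_MASK;

    if (present)
        v |= TOY_PRESENT;
    else
        v ^= TOY_PFN_MASK;   /* invert the PFN field */
    pte.val = v;
    return pte;
}

/* Accessor: undoes the inversion before extracting the PFN. */
static uint64_t toy_pte_pfn(toy_pte_t pte)
{
    uint64_t v = pte.val;

    if (!(v & TOY_PRESENT))
        v ^= TOY_PFN_MASK;
    return (v & TOY_PFN_MASK) >> TOY_PAGE_SHIFT;
}

/* Direct use of the raw value: correct only for present entries. */
static uint64_t toy_raw_pfn(toy_pte_t pte)
{
    return (pte.val & TOY_PFN_MASK) >> TOY_PAGE_SHIFT;
}
```

Both helpers agree for present entries, so the raw variant looks fine in
review and in most testing. It only goes wrong for the non-present case.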
--
Michal Hocko
SUSE Labs