Re: mm: hung task (handle_pte_fault)

From: Naoya Horiguchi
Date: Wed Mar 28 2012 - 13:36:06 EST


On Tue, Mar 27, 2012 at 09:53:41PM -0700, Hugh Dickins wrote:
> On Wed, 28 Mar 2012, Sasha Levin wrote:
> > On Tue, Mar 27, 2012 at 1:17 AM, Andrew Morton
> > <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > The task is waiting for IO to complete against a page, and it isn't
> > > happening.
> > >
> > > There are quite a lot of things which could cause this, alas. VM,
> > > readahead, scheduler, core wait/wakeup code, IO system, interrupt
> > > system (if it happens outside KVM, I guess).
> > >
> > > So.... ugh. Hopefully someone will hit this in a situation where it
> > > can be narrowed down or bisected.
> >
> > I've only managed to reproduce it once, and was unable to get anything
> > useful out of it due to technical reasons.
> >
> > The good part is that I've managed to hit something similar (although
> > I'm not 100% sure it's the same problem as the one in the original
> > mail).
>
> I don't think this one has anything to do with the first you posted,
> but it does look like a good catch against current linux-next, where
> pagemap_pte_range() appears to do a spin_lock(&walk->mm->page_table_lock)
> which should have been removed by "thp: optimize away unnecessary page
> table locking". Some kind of mismerge perhaps: Horiguchi-san added to Cc.

Thanks for reporting.
This spin_lock() also exists in mainline, so it needs a fix there too.
I'll post a patch for the -stable tree later.

Thanks,
Naoya