Re: [linux-next: Tree for Jun 1] __khugepaged_exit rwsem_down_write_failed lockup

From: Sergey Senozhatsky
Date: Fri Jun 03 2016 - 09:38:25 EST


On (06/03/16 12:05), Michal Hocko wrote:
> > > RIP collect_mm_slot() + 0x42/0x84
> > > khugepaged
> >
> > So is this really collect_mm_slot called directly from khugepaged or is
> > some inlining going on there?

inlining I suppose.

> > > prepare_to_wait_event
> > > maybe_pmd_mkwrite
> > > kthread
> > > _raw_sin_unlock_irq
> > > ret_from_fork
> > > kthread_create_on_node
> > >
> > > collect_mm_slot() + 0x42/0x84 is
> >
> > I guess that the problem is that I have missed that __khugepaged_exit
> > doesn't clear the cached khugepaged_scan.mm_slot. Does the following on
> > top fixes that?
>
> That wouldn't be sufficient after a closer look. We need to do the same
> from khugepaged_scan_mm_slot when atomic_inc_not_zero fails. So I guess
> it would be better to stick it into collect_mm_slot.

Michal, I'll try to test during the weekend (away from the affected box
now), but in the worst case it can as late as next Thursday (gonna travel
next week).

-ss