Re: Sleeping BUG in khugepaged for i586

From: Andrew Morton
Date: Mon Jun 05 2017 - 17:44:05 EST


On Sat, 3 Jun 2017 14:24:26 -0500 Larry Finger <Larry.Finger@xxxxxxxxxxxx> wrote:

> I recently turned on locking diagnostics for a Dell Latitude D600 laptop, which
> requires a 32-bit kernel. In the log I found the following:
>
> BUG: sleeping function called from invalid context at mm/khugepaged.c:655
> in_atomic(): 1, irqs_disabled(): 0, pid: 20, name: khugepaged
> 1 lock held by khugepaged/20:
> #0: (&mm->mmap_sem){++++++}, at: [<c03d6609>] collapse_huge_page.isra.47+0x439/0x1240
> CPU: 0 PID: 20 Comm: khugepaged Tainted: G W 4.12.0-rc1-wl-12125-g952a068 #80
> Hardware name: Dell Computer Corporation Latitude D600 /03U652, BIOS A05 05/29/2003
> Call Trace:
> dump_stack+0x76/0xb2
> ___might_sleep+0x174/0x230
> collapse_huge_page.isra.47+0xacf/0x1240
> khugepaged_scan_mm_slot+0x41e/0xc00
> ? _raw_spin_lock+0x46/0x50
> khugepaged+0x277/0x4f0
> ? prepare_to_wait_event+0xe0/0xe0
> kthread+0xeb/0x120
> ? khugepaged_scan_mm_slot+0xc00/0xc00
> ? kthread_create_on_node+0x30/0x30
> ret_from_fork+0x21/0x30
>
> I have no idea when this problem was introduced. Of course, I will test any
> proposed fixes.
>

Odd. There's nothing wrong with cond_resched() while holding mmap_sem.
It looks like khugepaged forgot to do a spin_unlock somewhere and we
leaked a preempt_count.
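
To illustrate what a leaked preempt_count looks like, here is a toy
userspace model, not kernel code. Every name in it (model_spin_lock,
model_might_sleep, leaky_step, ...) is made up for illustration, and the
check is much simpler than the real ___might_sleep(). It assumes only
that taking a spinlock bumps the preempt count (as it does when the
kernel is built with preempt counting) and that a sleep check complains
when the count is nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model of the per-task preempt counter (hypothetical, userspace only). */
static int preempt_count;

/* Taking a spinlock disables preemption, so the count goes up. */
static void model_spin_lock(void)
{
	preempt_count++;
}

/* Dropping the lock re-enables preemption, so the count goes down. */
static void model_spin_unlock(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Simplified stand-in for in_atomic(): nonzero count means atomic. */
static bool model_in_atomic(void)
{
	return preempt_count != 0;
}

/*
 * Simplified stand-in for the might_sleep() check.
 * Returns true if it reported a problem.
 */
static bool model_might_sleep(const char *where)
{
	if (!model_in_atomic())
		return false;
	printf("BUG: sleeping function called from invalid context at %s\n",
	       where);
	printf("in_atomic(): %d\n", model_in_atomic());
	return true;
}

/*
 * Hypothetical buggy helper: the early return skips the unlock,
 * so the preempt count stays elevated after the function returns.
 */
static int leaky_step(bool fail)
{
	model_spin_lock();
	if (fail)
		return -1;	/* missing model_spin_unlock() */
	model_spin_unlock();
	return 0;
}
```

In this model, a later model_might_sleep() call made after
leaky_step(true) reports in_atomic(): 1 even though no spinlock shows up
in the list of held locks, which is the same shape as the report above.
An unbalanced preempt_disable() would leave the same footprint.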