Re: [ghes_copy_tofrom_phys] BUG: sleeping function called from invalid context at mm/page_alloc.c:4150
From: Mark Rutland
Date: Tue Oct 31 2017 - 08:30:00 EST
On Tue, Oct 31, 2017 at 10:38:33AM +0000, Will Deacon wrote:
> On Mon, Oct 30, 2017 at 04:14:15PM -0400, Tyler Baicar wrote:
> > On 10/30/2017 1:46 PM, Linus Torvalds wrote:
> > >On Mon, Oct 30, 2017 at 10:20 AM, Linus Torvalds
> > ><torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >>I will add a "might_sleep()" to ioremap_page_range() itself, so that
> > >>we get this warning more reliably and much earlier. Right now it has
> > >>been hidden by the fact that most of the time the page tables
> > >>may be already allocated, but even then it's broken.
> > >Done. It doesn't report anything for me, so _hopefully_ the GHES
> > >driver is the only one that does games like this. See commit
> > >b39ab98e2f47 ("Mark 'ioremap_page_range()' as possibly sleeping").
> > >
> > >So now it should hopefully warn about this bad usage of page remapping
> > >reliably, at least if you have CONFIG_DEBUG_ATOMIC_SLEEP enabled.
> > >
> > >Can somebody who has a working GHES setup (although Borislav seems to
> > >think no such thing exists) verify?
> > Hello Linus,
> >
> > I have verified that this flags the error for me every time ghes_proc() is used.
> > But I also see it flagged in ARM PMU code:
> >
> > [    7.381153] BUG: sleeping function called from invalid context at mm/slab.h:420
> > [    7.387625] in_atomic(): 0, irqs_disabled(): 128, pid: 11, name: cpuhp/0
> > [    7.394310] CPU: 0 PID: 11 Comm: cpuhp/0 Not tainted 4.14.0-rc7 #46
> > [    7.400559] Hardware name: Qualcomm Qualcomm Centriq(TM) 2400 Development Platform
> > [    7.414361] Call trace:
> > [    7.416797] [<ffff000008088b28>] dump_backtrace+0x0/0x270
> > [    7.422175] [<ffff000008088dbc>] show_stack+0x24/0x30
> > [    7.427211] [<ffff0000090d01f0>] dump_stack+0x98/0xb8
> > [    7.432246] [<ffff00000810118c>] ___might_sleep+0x104/0x128
> > [    7.437799] [<ffff000008101208>] __might_sleep+0x58/0x90
> > [    7.443097] [<ffff000008254a7c>] kmem_cache_alloc_trace+0x224/0x280
> > [    7.449347] [<ffff000008e9c938>] armpmu_alloc+0x30/0x168
> > [    7.454639] [<ffff000008e9d15c>] arm_pmu_acpi_cpu_starting+0x114/0x148
> > [    7.461151] [<ffff0000080d0f30>] cpuhp_invoke_callback+0xb8/0x760
> > [    7.467226] [<ffff0000080d1ec4>] cpuhp_thread_fun+0xa4/0x1b8
> > [    7.472872] [<ffff0000080f661c>] smpboot_thread_fn+0x174/0x250
> > [    7.478684] [<ffff0000080f18ec>] kthread+0x114/0x140
> > [    7.483632] [<ffff000008084774>] ret_from_fork+0x10/0x1c
>
> I know Mark was doing some fixes in the ACPI notifier code here, so I've
> added him to CC.
Sorry for the delay on this; I have a rather hideous fix that I'll clean
up and post shortly.
Thanks,
Mark.