Re: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:65

From: Pintu Agarwal
Date: Thu Feb 14 2019 - 04:11:42 EST


Hello Sai,

Thanks so much for your help.

On Thu, Feb 14, 2019 at 12:14 AM Sai Prakash Ranjan
<saiprakash.ranjan@xxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2/13/2019 8:10 PM, Pintu Agarwal wrote:
> > OK thanks for your suggestions. sdm845-perf_defconfig did not work for
> > me. The target did not boot.
>
> Perf defconfig works fine. You need to enable serial console with below
> config added to perf defconfig.
>
> CONFIG_SERIAL_MSM_GENI_CONSOLE=y
>
Actually, for me the kernel does not boot at all. It gets stuck in the
bootloader with "valid dtb not found".
I did not debug it further.
Anyway, we can look into this issue later.

> > However, disabling CONFIG_PANIC_ON_SCHED_BUG works, and I got a root
> > shell at least.
>
> >
> > But this seems to be a work around.
> > I still get a back trace in kernel logs from many different places.
> > So, it looks like there is some code in qualcomm specific drivers that
> > is calling a sleeping method from invalid context.
> > How to find that...
> > If this fix is already available in latest version, please let me know.
> >
>
> Seems like interrupts are disabled when down_write_killable() is called.
> It's not the drivers that is calling the sleeping method which can be
> seen from the log.
>
> [ 22.140224] [<ffffff88b8ce65a8>] ___might_sleep+0x140/0x188
> [ 22.145862] [<ffffff88b8ce6648>] __might_sleep+0x58/0x90 <---
> [ 22.151249] [<ffffff88b9d43f84>] down_write_killable+0x2c/0x80 <---
> [ 22.157155] [<ffffff88b8e53cd8>] setup_arg_pages+0xb8/0x208 <---
> [ 22.162792] [<ffffff88b8eb7534>] load_elf_binary+0x434/0x1298
> [ 22.168600] [<ffffff88b8e55674>] search_binary_handler+0xac/0x1f0
> [ 22.174763] [<ffffff88b8e560ec>] do_execveat_common.isra.15+0x504/0x6c8
> [ 22.181452] [<ffffff88b8e562f4>] do_execve+0x44/0x58
> [ 22.186481] [<ffffff88b8c84030>] run_init_process+0x38/0x48 <---
> [ 22.192122] [<ffffff88b9d3db1c>] kernel_init+0x8c/0x108
> [ 22.197411] [<ffffff88b8c83f00>] ret_from_fork+0x10/0x50
>
Yes, these are generic APIs, and I don't expect any changes here.
We don't have this issue on another SoC with a 4.9 kernel.
I also compared these APIs with mainline, and there are no major changes.
This is just one example.
The same sleep warning is triggered from other places as well.
One common factor may be that it happens during task loading or task switching.
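
For reference, here is a minimal userspace sketch of the check that, as far
as I can read it in 4.9, ___might_sleep() makes before printing this BUG.
The struct ctx_state and the function might_sleep_would_warn() are names I
made up for illustration. They are not kernel API. The oops_in_progress
test and the rate limiting are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the kernel consults (not kernel API). */
struct ctx_state {
	int  preempt_count;   /* non-zero: preemption disabled (atomic section) */
	bool irqs_disabled;   /* true: local interrupts are masked */
	bool is_idle_task;    /* true: current is the idle task */
	bool system_running;  /* models system_state == SYSTEM_RUNNING */
};

/*
 * Simplified model of the decision in ___might_sleep(): returns true when
 * "BUG: sleeping function called from invalid context" would be reported.
 * Assumption: this mirrors the 4.9 logic only approximately.
 */
static bool might_sleep_would_warn(const struct ctx_state *s, int preempt_offset)
{
	/* The check is skipped until the system is fully up. */
	if (!s->system_running)
		return false;

	/* Sleeping is fine only if nothing forbids scheduling. */
	bool context_ok = s->preempt_count == preempt_offset &&
			  !s->irqs_disabled &&
			  !s->is_idle_task;

	return !context_ok;
}
```

With this model, a caller with interrupts masked and preempt_count == 0 is
reported, which would match the case where down_write_killable() is
reached with IRQs disabled.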

> >
> > This at least proves that there is no issue in core ipipe patches, and
> > I can proceed.
>
> I doubt the *IPIPE patches*. You said you removed the configs, but all
> code are not under IPIPE configs and as I see there are lots of
> changes to interrupt code in general with ipipe.
>
We observed that this issue happens on the normal sdm845 kernel as
well (in another branch, without the ipipe/xenomai patches applied).
Another point is that we don't see this issue on another arm64 target,
such as hikey, with the same 4.9 kernel.

> So to actually confirm whether the issue is with qcom drivers or ipipe,
> please *remove ipipe patches (not just configs)* and boot.
> Also paste the full dmesg logs for these 2 cases(with and without
> ipipe).
>
Hmm. This will be a little tough.
I will try to find some time to pinpoint the exact cause and share my
findings here.
Currently, I am debugging another issue.

Thanks for your help.


Regards