Re: Oops in rpc_clnt_debugfs_register() from debugfs change

From: Greg Kroah-Hartman
Date: Tue Feb 12 2019 - 09:37:27 EST


On Tue, Feb 12, 2019 at 02:31:14PM +0000, David Howells wrote:
> I've bisected an oops that occurs in rpc_clnt_debugfs_register() trying to
> dereference a pointer with -EACCES in it. This is the offending commit, though
> I suspect the bug is in sunrpc expecting to see NULL rather than an error.
>
> ff9fb72bc07705c00795ca48631f7fffe24d2c6b is the first bad commit
> commit ff9fb72bc07705c00795ca48631f7fffe24d2c6b
> Author: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Date: Wed Jan 23 11:28:14 2019 +0100
>
> debugfs: return error values, not NULL
>
> When an error happens, debugfs should return an error pointer value, not
> NULL. This will prevent the totally theoretical error where a debugfs
> call fails due to lack of memory, returning NULL, and that dentry value
> is then passed to another debugfs call, which would end up succeeding,
> creating a file at the root of the debugfs tree, but would then be
> impossible to remove (because you can not remove the directory NULL).
>
> So, to make everyone happy, always return errors, this makes the users
> of debugfs much simpler (they do not have to ever check the return
> value), and everyone can rest easy.
> ...
>
> The attached oops occurs during boot from the gssproxy process in
> rpc_clnt_debugfs_register(). The code at this point is:
>
> 0xffffffff8195cbdd <+450>: mov 0x50(%rax),%rcx <--- oopsing
> 0xffffffff8195cbe1 <+454>: mov $0xffffffff821cc8ba,%rdx
> 0xffffffff8195cbe8 <+461>: mov $0x18,%esi
> 0xffffffff8195cbed <+466>: lea -0x30(%rbp),%rdi
> 0xffffffff8195cbf1 <+470>: callq 0xffffffff819db773 <snprintf>
>
> RAX is -EACCES.
>
> Looking in the source:
>
> len = snprintf(name, sizeof(name), "../../rpc_xprt/%s",
> xprt->debugfs->d_name.name);
>
> I think xprt->debugfs is the value in RAX.
>
> (gdb) p &((struct dentry *)0)->d_name.name
> $5 = (const unsigned char **) 0x50 <irq_stack_union+80>
>
> which matches the offset on the oopsing MOV instruction.
>
> This is with linus/master (aa0c38cf39de73bf7360a3da8f1707601261e518).

Ugh, yeah, I see the problem, sorry about that.

I wonder why the debugfs call is always failing, that's not good...

Let me dig and see if I already have a patch for this...
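
To illustrate the failure mode (this is not the actual fix, just a
minimal userspace sketch): a caller that only tests for NULL will happily
dereference an error pointer such as ERR_PTR(-EACCES). The helpers below
are simplified stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR();
struct fake_dentry, fake_create_dir() and the two check functions are
hypothetical names made up for this example and do not exist in sunrpc or
debugfs.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct dentry; only the name matters here. */
struct fake_dentry {
	const char *name;
};

/* Hypothetical creator that reports failure with an error pointer, not NULL. */
static struct fake_dentry *fake_create_dir(bool fail)
{
	static struct fake_dentry ok = { "42" };

	return fail ? ERR_PTR(-EACCES) : &ok;
}

/* NULL-only check: an error pointer slips through and would be dereferenced. */
static bool old_check_allows_deref(const struct fake_dentry *d)
{
	return d != NULL;
}

/* Check that rejects both NULL and error pointers before any dereference. */
static bool new_check_allows_deref(const struct fake_dentry *d)
{
	return d != NULL && !IS_ERR(d);
}
```

With fake_create_dir(true), old_check_allows_deref() returns true even
though the pointer holds -EACCES, which is the same shape as the
xprt->debugfs->d_name.name dereference in the report above;
new_check_allows_deref() returns false for it.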

greg k-h