Re: PROBLEM: oops spew with Linux 5.1.5 (NFS regression?)

From: Nick Bowler
Date: Mon Jun 03 2019 - 12:38:36 EST


On 2019-05-29, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> On Wed, May 29, 2019 at 1:14 PM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx>
> wrote:
>>
>> On Wed, 2019-05-29 at 11:10 -0400, Nick Bowler wrote:
>> > Hi,
>> >
>> > I upgraded to Linux 5.1.5 on one machine yesterday, and this morning
>> > I happened to notice a large number of backtraces in the log. It appears
>> > that the system oopsed 62 times over a period of about 5 minutes,
>> > producing about half a megabyte of log messages, after which the
>> > messages stopped. No idea what action (if any) triggered these.
>> >
>> > However, other than the noise in the logs there is nothing obviously
>> > broken, but I thought I should report the spews anyway. I was
>> > running 5.0.9 previously and have not seen any similar errors. The
>> > first couple spews are appended. All 64 faults look very similar
>> > to these ones, with the same faulting address and the same
>> > rpc_check_timeout function at the top of the backtrace.
>>
>> OK, I think this is the same problem that Olga was seeing (Cced), and
>> it looks like I missed the use-after-free issue when the server returns
>> a credential error when she asked.
>
> I think this is actually different than what I encountered for the
> umount case but the trigger is the same -- failing validation.
>
> I tried to reproduce Nick's oops on 5.2-rc but haven't been able to
> (but I'm not confident I produced the right trigger conditions. will
> try 5.1).

OK, I think I found something that triggers this fault. It happens
when certain local users try to stat a file or directory on an NFS
mount. Presumably these UIDs do not have appropriate permissions on
the server, but I'm not sure exactly (I do not control the server).

I can reproduce the oops with a command like this:

# su -s/bin/sh -c 'stat /path/to/nfs/file' problematic_user

which oopses every time (and SIGKILLs the stat command). (I have not yet
rebooted since the original report or tried with Trond's patch applied.
I will do that next, and also try 5.1.6.)
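
For anyone trying to confirm the SIGKILL part of the reproduction from the
shell rather than from the log, here is a minimal sketch. The helper name
run_and_report is made up for illustration. It relies on the usual shell
convention that a command killed by signal N is reported with exit status
128+N, and it assumes su passes that status through.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tool): run a command,
# discard its output, and report how it ended.
run_and_report() {
    "$@" >/dev/null 2>&1
    status=$?
    # Shell convention: status above 128 usually means "killed by signal
    # (status - 128)". A program can also exit with such a value on its
    # own, so treat this as a hint, not proof.
    if [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}
```

Usage would be along the lines of

# run_and_report su -s/bin/sh -c 'stat /path/to/nfs/file' problematic_user

where signal 9 in the output corresponds to SIGKILL.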

Cheers,
Nick