Re: Null-ptr-deref due to "sanitized pathwalk machinery (v4)"

From: Al Viro
Date: Wed Mar 25 2020 - 00:04:08 EST


On Tue, Mar 24, 2020 at 11:24:01PM -0400, Qian Cai wrote:

> > On Mar 24, 2020, at 10:13 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Tue, Mar 24, 2020 at 09:49:48PM -0400, Qian Cai wrote:
> >
> >> It does not catch anything at all with the patch,
> >
> > You mean, oops happens, but neither WARN_ON() is triggered?
> > Lovely... Just to make sure: could you slap the same couple
> > of lines just before
> > 	if (unlikely(!d_can_lookup(nd->path.dentry))) {
> > in link_path_walk(), just to check if I have misread the trace
> > you've got?
> >
> > Does that (+ other two inserts) end up with
> > 1) some of these WARN_ON() triggered when oops happens or
> > 2) oops is happening, but neither WARN_ON() triggers or
> > 3) oops not happening / becoming harder to hit?
>
> Only the one just before
> 	if (unlikely(!d_can_lookup(nd->path.dentry))) {
> in link_path_walk() will trigger.

> [ 245.767202][ T5020] pathname = /var/run/nscd/socket

Lovely. So
	* we really do get NULL nd->path.dentry there; I've not misread the
	  trace.
	* on the entry into link_path_walk() nd->path.dentry is non-NULL.
	* *ALL* components should've been LAST_NORM ones.
	* not a single symlink in sight, unless the setup is rather unusual.
	* possibly not even a single mountpoint along the way (depending
	  upon the userland used).
And in the same loop we have
		if (likely(type == LAST_NORM)) {
			struct dentry *parent = nd->path.dentry;
			nd->flags &= ~LOOKUP_JUMPED;
			if (unlikely(parent->d_flags & DCACHE_OP_HASH)) {
				struct qstr this = { { .hash_len = hash_len }, .name = name };
				err = parent->d_op->d_hash(parent, &this);
				if (err < 0)
					return err;
				hash_len = this.hash_len;
				name = this.name;
			}
		}
upstream of that thing. So NULL nd->path.dentry *there* would've oopsed.
IOW, what we are hitting is walk_component() with non-NULL nd->path.dentry
when we enter it, NULL being returned and nd->path.dentry becoming NULL
by the time we return from walk_component().

Could you post the results of
	stat / /var /var/run /var/run/nscd /var/run/nscd/socket
after the boot with working kernel? Also, is that "hit on every boot" or
stochastic? If it's the latter, I'd like to see the output of the same
thing on a successful boot of the same kernel, if possible...

Also, is the pathname always the same and if not, what other variants have
been observed?