Re: d_lookup: Unable to handle kernel paging request

From: Will Deacon
Date: Wed Jun 19 2019 - 13:09:59 EST


Hi all,

On Wed, Jun 19, 2019 at 05:28:02PM +0100, Al Viro wrote:
> [arm64 maintainers Cc'd; I'm not adding a Cc to moderated list,
> sorry]

Thanks for adding us.

> On Wed, Jun 19, 2019 at 02:42:16PM +0200, Vicente Bergas wrote:
>
> > Hi Al,
> > i have been running the distro-provided kernel the last few weeks
> > and had no issues at all.
> > https://archlinuxarm.org/packages/aarch64/linux-aarch64
> > It is from the v5.1 branch and is compiled with gcc 8.3.
> >
> > IIRC, i also tested
> > https://archlinuxarm.org/packages/aarch64/linux-aarch64-rc
> > v5.2-rc1 and v5.2-rc2 (which at that time were compiled with
> > gcc 8.2) with no issues.
> >
> > This week tested v5.2-rc4 and v5.2-rc5 from archlinuxarm but
> > there are regressions unrelated to d_lookup.
> >
> > At this point i was convinced it was a gcc 9.1 issue and had
> > nothing to do with the kernel, but anyways i gave your patch a try.
> > The tested kernel is v5.2-rc5-224-gbed3c0d84e7e and
> > it has been compiled with gcc 8.3.
> > The sentinel you put there has triggered!
> > So, it is not a gcc 9.1 issue.
> >
> > In any case, i have no idea if those addresses are arm64-specific
> > in any way.
>
> Cute... So *all* of those are in dentry_hashtable itself. IOW, we have
> these two values (1<<24 and (1<<24)|(0x88L<<40)) cropping up in
> dentry_hashtable[...].first on that config.

Unfortunately, those values don't jump out at me as something particularly
meaningful on arm64. Bloody weird though.
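
For anyone wanting to eyeball the bit patterns, here is a minimal userspace
sketch in C that builds the two values quoted above and prints them as 64-bit
words. The helper names (bad_first_low, bad_first_high, format_word,
looks_like_kernel_va) are made up for illustration and are not kernel
functions. The kernel-address check assumes a 48-bit VA configuration, where
kernel pointers have the top 16 bits set; the reporter's config may differ.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* First value reported in dentry_hashtable[...].first: 1 << 24. */
static uint64_t bad_first_low(void)
{
	return UINT64_C(1) << 24;
}

/* Second value reported: (1 << 24) | (0x88 << 40). */
static uint64_t bad_first_high(void)
{
	return (UINT64_C(1) << 24) | (UINT64_C(0x88) << 40);
}

/* Format a 64-bit word as zero-padded hex into buf; returns snprintf's result. */
static int format_word(uint64_t v, char *buf, size_t len)
{
	return snprintf(buf, len, "0x%016" PRIx64, v);
}

/*
 * ASSUMPTION: 48-bit virtual addresses, so a kernel pointer has its
 * top 16 bits all set. Other VA sizes need a different mask.
 */
static int looks_like_kernel_va(uint64_t v)
{
	return (v >> 48) == 0xffff;
}
```

Passing the two values through format_word() gives 0x0000000001000000 and
0x0000880001000000. Under the 48-bit assumption neither passes
looks_like_kernel_va(), so a chain walk that dereferences ->first would fault
rather than land on a real dentry.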

> There shouldn't be any pointers to hashtable elements other than ->d_hash.pprev
> of various dentries. And ->d_hash is not a part of anon unions in struct
> dentry, so it can't be mistaken access through the aliasing member.
>
> Of course, there's always a possibility of something stomping on random places
> in memory and shitting those values all over, with the hashtable being the
> hottest place on the loads where it happens... Hell knows...
>
> What's your config, BTW? SMP and DEBUG_SPINLOCK, specifically...

I'd also be interested in seeing the .config (the pastebin link earlier in
the thread appears to have expired). Two areas where we've had issues
recently are (1) module relocations and (2) CONFIG_OPTIMIZE_INLINING.
However, this is the first report I've seen of the sort of crash you're
reporting.

Will