Re: d_lookup: Unable to handle kernel paging request

From: Al Viro
Date: Wed May 22 2019 - 09:56:15 EST


On Wed, May 22, 2019 at 12:40:55PM +0200, Vicente Bergas wrote:
> Hi,
> since a recent update the kernel is reporting d_lookup errors.
> They appear randomly and after each error the affected file or directory
> is no longer accessible.
> The kernel is built with GCC 9.1.0 on ARM64.
> Four traces from different workloads follow.

Interesting... bisection would be useful.

> This trace is from v5.1-12511-g72cf0b07418a while untarring into a tmpfs
> filesystem:
>
> Unable to handle kernel paging request at virtual address 0000880001000018
> user pgtable: 4k pages, 48-bit VAs, pgdp = 000000007ccc6c7d
> [0000880001000018] pgd=0000000000000000

Attempt to dereference 0x0000880001000018, which is not mapped at all?

> pc : __d_lookup+0x58/0x198

... and so would objdump of the function in question.

> This trace is from v5.2.0-rc1:
> Unable to handle kernel paging request at virtual address 0000880001000018
[apparently identical oops, modulo the call chain to d_lookup(); since that's
almost certainly buggered data structures encountered during the hash lookup,
exact callchain doesn't matter all that much; procfs is the filesystem involved]

> This trace is from v5.2.0-rc1 while executing 'git pull -r' from f2fs. It
> got repeated several times:
>
> Unable to handle kernel paging request at virtual address 0000000000fffffc
> user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000092bdb9cd
> [0000000000fffffc] pgd=0000000000000000
> pc : __d_lookup_rcu+0x68/0x198

> This trace is from v5.2.0-rc1 while executing 'rm -rf' on the directory
> affected by the previous trace:
>
> Unable to handle kernel paging request at virtual address 0000000001000018

... and addresses involved are

0000880001000018
0000000000fffffc
0000000001000018

AFAICS, the only register holding a value in the vicinity of those addresses
was (in all cases so far) x19: 0000880001000000 in the first two traces,
0000000001000000 in the last two...
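
To make the "vicinity" concrete, here is a rough sketch of the arithmetic, pairing each faulting address with the x19 value quoted above. The helper name offset_from_base and the TRACES table are made up for illustration. Treating x19 as the base of the faulting access is an assumption until the disassembly confirms it.

```python
# Faulting address from each of the four oopses, paired with the x19 value
# reported for that trace. Whether x19 really is the base register of the
# faulting load is an assumption; only the disassembly can confirm it.
TRACES = [
    (0x0000880001000018, 0x0000880001000000),  # tmpfs, __d_lookup
    (0x0000880001000018, 0x0000880001000000),  # procfs, d_lookup() chain
    (0x0000000000fffffc, 0x0000000001000000),  # f2fs, __d_lookup_rcu
    (0x0000000001000018, 0x0000000001000000),  # rm -rf of the same directory
]


def offset_from_base(fault_addr, base):
    """Signed distance of the faulting address from the candidate base."""
    return fault_addr - base


if __name__ == "__main__":
    for fault, x19 in TRACES:
        # '+#x' prints a signed hex offset with the 0x prefix.
        print(f"{fault:016x} = x19 {offset_from_base(fault, x19):+#x}")
```

Three of the four faults land at x19 + 0x18 and the __d_lookup_rcu one at x19 - 4, i.e. small fixed offsets from a bogus pointer, which is what one would expect from field accesses relative to a corrupted hash chain entry.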

I'd really like to see the disassembly of the functions involved (as well as
.config in question).