find_get_entries_tag regression bisected

From: Dan Williams
Date: Fri Feb 15 2019 - 21:08:40 EST


Hi Willy,

Piotr reports the following crash can be triggered on latest mainline:

EXT4-fs (pmem5): recovery complete
EXT4-fs (pmem5): mounted filesystem with ordered data mode. Opts: dax
------------[ cut here ]------------
kernel BUG at mm/pgtable-generic.c:127!
invalid opcode: 0000 [#1] SMP PTI
CPU: 28 PID: 1193 Comm: a.out Tainted: G W OE 4.19.0-rc5+ #2907
[..]
RIP: 0010:pmdp_huge_clear_flush+0x1e/0x80
[..]
Call Trace:
dax_writeback_mapping_range+0x473/0x8a0
? print_shortest_lock_dependencies+0x40/0x40
? jbd2_journal_stop+0xef/0x470
? ext4_fill_super+0x3071/0x3110
? __lock_is_held+0x4f/0x90
? __lock_is_held+0x4f/0x90
ext4_dax_writepages+0xed/0x2f0
do_writepages+0x41/0xe0
__filemap_fdatawrite_range+0xbe/0xf0
file_write_and_wait_range+0x4c/0xa0
ext4_sync_file+0xa6/0x4f0

I bisected this regression to commit c1901cd33cf4 "page cache: Convert
find_get_entries_tag to XArray". I suspect another case of pte vs pmd
confusion.

Below is the small reproducer from Piotr. It triggers in a qemu
environment with emulated pmem, but only with ext4 as far as I can
see; I did not dig too deep. Let me know if anything jumps out at you.
Otherwise I'll take a deeper look in the coming days.


#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <string.h>
#include <assert.h>

#define MB (1ULL << 20)

int
main(int argc, char *argv[])
{
	int ret;
	int fd;
	off_t size = 2 * MB;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <new-file-on-dax-fs>\n", argv[0]);
		return 1;
	}

	char *path = argv[1];

	/* O_EXCL: the file must not already exist */
	fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0666);
	assert(fd >= 0);

	/* Size the file to 2MB */
	ret = ftruncate(fd, size);
	assert(ret == 0);

	char *addr = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	assert(addr != MAP_FAILED);

	/* Dirty the first byte of the mapping */
	memset(addr, '0', 1);

	/* Sync a 1-byte range that starts one 4K page into the mapping */
	ret = msync(addr + 4096, 1, MS_SYNC);
	assert(ret == 0);

	close(fd);

	return 0;
}
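
The offsets used by the reproducer can be worked through with a little
index arithmetic. The sketch below is only an illustration of how a
PTE-sized page index and a PMD-sized entry can disagree; it is not a
diagnosis of the bug. It assumes 4K base pages and 2MB PMD mappings (as
on x86-64), and every name in it (page_index, pmd_base_index,
index_to_address, is_pmd_aligned, the *_ASSUMED constants and the
example mapping address) is hypothetical rather than taken from the
kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: 4K base pages and 2MB PMD mappings (x86-64 defaults). */
#define PAGE_SHIFT_ASSUMED 12
#define PMD_SHIFT_ASSUMED  21
#define PAGE_SIZE_ASSUMED  ((uint64_t)1 << PAGE_SHIFT_ASSUMED)
#define PMD_SIZE_ASSUMED   ((uint64_t)1 << PMD_SHIFT_ASSUMED)
#define PAGES_PER_PMD      (PMD_SIZE_ASSUMED / PAGE_SIZE_ASSUMED)

/* Page index that a byte offset into the file falls in. */
uint64_t page_index(uint64_t file_offset)
{
	return file_offset >> PAGE_SHIFT_ASSUMED;
}

/* First page index of the 2MB-aligned block that contains 'index'. */
uint64_t pmd_base_index(uint64_t index)
{
	return index & ~(PAGES_PER_PMD - 1);
}

/* Address of page 'index' in a mapping of file offset 0 at 'map_start'. */
uint64_t index_to_address(uint64_t map_start, uint64_t index)
{
	return map_start + (index << PAGE_SHIFT_ASSUMED);
}

/* Returns 1 if 'addr' sits on a 2MB boundary, 0 otherwise. */
int is_pmd_aligned(uint64_t addr)
{
	return (addr & (PMD_SIZE_ASSUMED - 1)) == 0;
}

/* Walk through the reproducer's numbers with a made-up mapping address. */
void check_reproducer_offsets(void)
{
	uint64_t map_start = 0x7f0000000000ULL;  /* hypothetical, 2MB-aligned */
	uint64_t idx = page_index(4096);         /* start of the msync range */

	assert(idx == 1);
	assert(pmd_base_index(idx) == 0);
	/* Using the raw index gives an address that is not 2MB-aligned... */
	assert(!is_pmd_aligned(index_to_address(map_start, idx)));
	/* ...while the base index of the enclosing 2MB block does. */
	assert(is_pmd_aligned(index_to_address(map_start, pmd_base_index(idx))));
}
```

With these assumptions the msync range starts at page index 1, while a
2MB entry covering the file would start at index 0. Code that operates
on a PMD but computes its address from index 1 ends up with an address
that is not 2MB-aligned. Whether that is what happens here is a guess.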