Re: WARN_ON_ONCE(1) in iomap_dio_actor()

From: Dave Chinner
Date: Thu Aug 13 2020 - 01:44:33 EST


On Mon, Aug 10, 2020 at 10:03:03PM -0400, Qian Cai wrote:
> On Sun, Jul 26, 2020 at 04:24:12PM +0100, Christoph Hellwig wrote:
> > On Fri, Jul 24, 2020 at 02:24:32PM -0400, Qian Cai wrote:
> > > On Fri, Jun 19, 2020 at 05:17:47PM -0700, Matthew Wilcox wrote:
> > > > On Fri, Jun 19, 2020 at 05:17:50PM -0400, Qian Cai wrote:
> > > > > Running a syscall fuzzer by a normal user could trigger this,
> > > > >
> > > > > [55649.329999][T515839] WARNING: CPU: 6 PID: 515839 at fs/iomap/direct-io.c:391 iomap_dio_actor+0x29c/0x420
> > > > ...
> > > > > 371 static loff_t
> > > > > 372 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
> > > > > 373 void *data, struct iomap *iomap, struct iomap *srcmap)
> > > > > 374 {
> > > > > 375 struct iomap_dio *dio = data;
> > > > > 376
> > > > > 377 switch (iomap->type) {
> > > > > 378 case IOMAP_HOLE:
> > > > > 379 if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE))
> > > > > 380 return -EIO;
> > > > > 381 return iomap_dio_hole_actor(length, dio);
> > > > > 382 case IOMAP_UNWRITTEN:
> > > > > 383 if (!(dio->flags & IOMAP_DIO_WRITE))
> > > > > 384 return iomap_dio_hole_actor(length, dio);
> > > > > 385 return iomap_dio_bio_actor(inode, pos, length, dio, iomap);
> > > > > 386 case IOMAP_MAPPED:
> > > > > 387 return iomap_dio_bio_actor(inode, pos, length, dio, iomap);
> > > > > 388 case IOMAP_INLINE:
> > > > > 389 return iomap_dio_inline_actor(inode, pos, length, dio, iomap);
> > > > > 390 default:
> > > > > 391 WARN_ON_ONCE(1);
> > > > > 392 return -EIO;
> > > > > 393 }
> > > > > 394 }
> > > > >
> > > > > > Could that be iomap->type == IOMAP_DELALLOC ? Looking through the logs,
> > > > > it contains a few pread64() calls until this happens,
> > > >
> > > > It _shouldn't_ be able to happen. XFS writes back ranges which exist
> > > > in the page cache upon seeing an O_DIRECT I/O. So it's not supposed to
> > > > be possible for there to be an extent which is waiting for the contents
> > > > of the page cache to be written back.
> > >
> > > Okay, it is IOMAP_DELALLOC. We have,
> >
> > Can you share the fuzzer? If we end up with delalloc space here we
> > probably need to fix a bug in the cache invalidation code.
>
> Here is a simple reproducer (I believe it can also be reproduced using xfstests
> generic/503 on a plain xfs without DAX when SCRATCH_MNT == TEST_DIR),
>
> # git clone https://gitlab.com/cailca/linux-mm
> # cd linux-mm; make
> # ./random 14

Ok:

file.fd_write = safe_open("./testfile", O_RDWR|O_CREAT);
....
file.fd_read = safe_open("./testfile", O_RDWR|O_CREAT|O_DIRECT);
....
file.ptr = safe_mmap(NULL, fsize, PROT_READ|PROT_WRITE, MAP_SHARED,
		     file.fd_write, 0);

So this is all IO to the same inode....

and you loop

	while (!done) {

		do {
			rc = pread(file.fd_read, file.ptr + read,
				   fsize - read, read);
			if (rc > 0)
				read += rc;
		} while (rc > 0);

		rc = safe_fallocate(file.fd_write,
				    FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
				    0, fsize);
	}

On two threads at once?

So, essentially, you do a DIO read into a mmap()d range from the
same file, with DIO read ascending and the mmap() range descending,
then once that is done you hole punch the file and do it again?

IOWs, this is a racing page_mkwrite()/DIO read workload, and the
moment the two threads hit the same block of the file with a
DIO read and a page_mkwrite at the same time, it throws a warning.

Well, that's completely expected behaviour. DIO is not serialised
against mmap() access at all, and so if the page_mkwrite occurs
between the writeback and the iomap_apply() call in the dio path,
then it will see the delalloc block that the page_mkwrite allocated.
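
To make the ordering concrete, here is a rough userspace model of that
window. It is only a sketch: the model_* names and the three-state
block are made up for illustration and are not kernel code. It assumes
that page_mkwrite on a hole leaves a delalloc block behind, that
writeback converts delalloc into a mapped extent, and that the DIO read
mapping rejects anything that is neither a hole nor mapped, which is
what the default case in the quoted switch does.

```c
#include <errno.h>

/* Hypothetical stand-ins for the extent states; not the kernel's values. */
enum model_type { MODEL_HOLE, MODEL_DELALLOC, MODEL_MAPPED };

/* One file block in the toy model. */
struct model_block { enum model_type type; };

/* A write fault on a hole reserves space: the block becomes delalloc. */
static void model_page_mkwrite(struct model_block *b)
{
	if (b->type == MODEL_HOLE)
		b->type = MODEL_DELALLOC;
}

/* Writeback turns a delalloc reservation into a real, mapped extent. */
static void model_writeback(struct model_block *b)
{
	if (b->type == MODEL_DELALLOC)
		b->type = MODEL_MAPPED;
}

/* Read-side mapping check: holes and mapped extents are fine, anything
 * else falls through to the default case and fails with -EIO. */
static int model_dio_read_map(const struct model_block *b)
{
	switch (b->type) {
	case MODEL_HOLE:
	case MODEL_MAPPED:
		return 0;
	default:
		return -EIO;
	}
}

/* A DIO read: flush first, then map. The flag selects whether the
 * racing page_mkwrite lands before the flush or inside the window
 * between the flush and the mapping call. */
static int model_dio_read(struct model_block *b, int mkwrite_after_flush)
{
	if (!mkwrite_after_flush)
		model_page_mkwrite(b);
	model_writeback(b);
	if (mkwrite_after_flush)
		model_page_mkwrite(b);
	return model_dio_read_map(b);
}
```

With the fault before the flush the read maps a written extent and
succeeds; with the fault inside the window the mapping sees delalloc
and the model returns -EIO, which is the case that trips the warning.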

No sane application would ever do this, and the behaviour is as
expected, so I don't think there's anything to care about here.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx