Re: [RFC PATCH 1/2] mm: introduce bmap_walk()

From: Christoph Hellwig
Date: Tue Jun 20 2017 - 03:35:03 EST

On Mon, Jun 19, 2017 at 07:19:57PM +0100, Al Viro wrote:
> Speaking of iomap, what's supposed to happen when doing a write into what
> used to be a hole? Suppose we have a file with a megabyte hole in it
> and there's some process mmapping that range. Another process does
> write over the entire range. We call ->iomap_begin() and allocate
> disk blocks. Then we start copying data into those. In the meanwhile,
> the first process attempts to fetch from address in the middle of that
> hole. What should happen?

Right now the buffered iomap code expects delayed allocations.
So ->iomap_begin will only reserve blocks in memory, and not even
mark the blocks as allocated in the page / buffer_head. The fact
that a block is allocated is only propagated into the page's buffer_head
on a page-by-page basis in the actor.
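
To make the split concrete, here is a toy userspace model of it. This
is only a sketch: struct toy_file, toy_iomap_begin() and
toy_write_actor() are made-up names for illustration, not the kernel
interfaces, and one extent slot per page is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Hypothetical in-memory extent states, not the kernel's IOMAP_* types. */
enum ext_state { EXT_HOLE, EXT_DELALLOC };

struct toy_file {
	enum ext_state extent[NPAGES];	/* in-memory extent map, one slot per page */
	bool bh_mapped[NPAGES];		/* per-page buffer state */
	bool page_dirty[NPAGES];
};

/* Stand-in for ->iomap_begin(): reserve space only, touch no page state. */
static void toy_iomap_begin(struct toy_file *f, size_t first, size_t count)
{
	assert(first + count <= NPAGES);
	for (size_t i = first; i < first + count; i++)
		if (f->extent[i] == EXT_HOLE)
			f->extent[i] = EXT_DELALLOC;
}

/* Stand-in for the actor: handles one page of an already reserved range. */
static void toy_write_actor(struct toy_file *f, size_t page)
{
	assert(page < NPAGES && f->extent[page] == EXT_DELALLOC);
	f->bh_mapped[page] = true;	/* allocation becomes visible on this page only */
	f->page_dirty[page] = true;
}
```

After toy_iomap_begin(&f, 2, 4) the range is reserved but no page knows
about it; each toy_write_actor() call then updates exactly one page.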

> Should the blocks we'd allocated in ->iomap_begin() be immediately linked
> into whatever indirect blocks/btree/whatnot we are using? That would
> require zeroing all of them first - otherwise that readpage will read
> uninitialized block. Another variant would be to delay linking them
> in until ->iomap_end(), but... Suppose we get the page evicted by
> memory pressure after the writer is finished with it. If ->readpage()
> comes before ->iomap_end(), we'll need to somehow figure out that it's
> not a hole anymore, or we'll end up with an uptodate page full of zeroes
> observed by reads after successful write().

Delayed blocks are ignored by the read code, so it will read 'through'
them.
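
One way to picture that, as a toy userspace sketch only: the read side
looks at the extent state and goes to disk only for blocks that are
really mapped, so a delayed extent is handled like a hole. The names
(enum rd_state, toy_readpage()) and the tiny page size are assumptions
made for the example, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical extent states seen by the read path. */
enum rd_state { RD_HOLE, RD_DELALLOC, RD_MAPPED };

/*
 * Fill one page on a page cache miss.  Returns true if the data came
 * from disk.  Delayed extents have no disk block yet, so they are
 * treated like holes and the page is zero-filled.
 */
static bool toy_readpage(enum rd_state st, const unsigned char *disk,
			 unsigned char *page)
{
	if (st == RD_MAPPED) {
		memcpy(page, disk, TOY_PAGE_SIZE);
		return true;
	}
	memset(page, 0, TOY_PAGE_SIZE);
	return false;
}
```

Note this models only a cache miss; data already copied by the writer
sits in the page cache and would be found there before this path runs.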

> The comment you've got in linux/iomap.h would seem to suggest the second
> interpretation, but neither it nor anything in Documentation discusses the
> relations with readpage/writepage...

I'll see if I can come up with some better documentation.