Re: Question regarding concurrent accesses through block device and fs

From: Francis Moreau
Date: Mon Mar 02 2009 - 08:30:31 EST


Nick Piggin <nickpiggin@xxxxxxxxxxxx> writes:

> On Monday 02 March 2009 08:07:30 Francis Moreau wrote:
>> This is the case where I can't find when the metadata are actually
>> written back to the disk by the flushers. I looked at
>> writeback_inodes() but I failed to find it.
>>
>> Could you point out the place in the code where this happens?
>
> I guess it picks them up via their block device inodes.

Probably, but I can't find the actual place.

I looked at the place where pages are normally written back to disk (i.e.
in background_writeout()), but I can only see the writeback of data, not
metadata...

>> This sounds very weird to me but I need to learn how things work
>> before making any serious comments.
>
> Why would they? They just operate on their metadata, and the buffer
> cache is basically a transparent writeback cache to them.

Well, the fact that metadata are written back to disk at an unknown point
in time means that we don't know in which order metadata and data are
written. So data can be written before or after metadata, or the two can
be interleaved.

And this just sounds weird to me. But as I said, I'm just a noob, so I
need to think and study more in this area, and I really have to see where
the actual writes of metadata happen in the code.

> In the same way, an application doesn't really know or care when
> exactly its data is under writeback.

Except that when dealing with the metadata of the fs, we can corrupt the
whole thing, I think.
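
For comparison, here is how I understand the application side: a program
that does care when its data reaches the disk has to ask for it
explicitly. This is only a minimal userspace sketch, not kernel code;
write_durably() is a hypothetical helper name I made up, and it assumes
that plain POSIX open()/write()/fsync() are enough for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the kernel or any library): write
 * len bytes from buf to path and return only after fsync() has
 * reported success. Returns 0 on success, -1 on any error.
 */
int write_durably(const char *path, const char *buf, size_t len)
{
	size_t done = 0;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	/* write() may be short, so loop until everything is queued. */
	while (done < len) {
		ssize_t n = write(fd, buf + done, len - done);

		if (n < 0) {
			close(fd);
			return -1;
		}
		done += (size_t)n;
	}

	/*
	 * Without this call the dirty pages are written back whenever
	 * the flusher threads get to them; fsync() asks the kernel to
	 * push the file's data (and its inode) to the device now.
	 */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}

	return close(fd);
}
```

Something like write_durably("/tmp/test", "hello", 5) should then only
return 0 once the kernel reports the write-out as complete. The
filesystem code has no such explicit step for the metadata it dirties in
the buffer cache, which is the part I'm trying to understand.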

> unmap_underlying_metadata is the important exception because Linux
> pagecache otherwise doesn't have a good way to keep pagecache of
> different mappings coherent. So if a block switches from buffercache
> to file mapping, it needs to be made coherent.
>
> When switching back the other way, the truncate code actually makes
> sure of this, that there won't be blocks under writeout after
> being deallocated.
>
> Things do get more complicated with journalling file systems.
>

I think I'll just forget about them for now; things are already
complicated enough without making them more obscure ;)

thanks
--
Francis