Re: reiserfs being part of the kernel: it's not just the code

From: Stephen C. Tweedie (sct@redhat.com)
Date: Tue Jun 06 2000 - 14:54:47 EST


Hi,

On Tue, Jun 06, 2000 at 10:41:37AM -0700, Hans Reiser wrote:
>
> There is no need to delay reiserfs integration into 2.4 to accomplish a
> journaling API in 2.5.

It wasn't a journaling API we were talking about for this. The problem
is much more central to the VM than that --- basically, the VM currently
assumes that any existing page can be evicted from memory with very
little extra work. It just isn't prepared for the situation that you
have with transactions, where you can't flush any of the existing dirty
data to disk without first waiting for the transaction to proceed to a
consistent, checkpointable state.

We've had a lot of trouble in the past even just with ext2 creating too
many dirty buffers. That gets a lot worse if you have multiple
transactional filesystems in memory. It's not the journaling itself, but
the transactional requirements which are the problem --- basically the
VM cannot do _anything_ about individual pages which are pinned by a
transaction, but rather we need a way to trigger a filesystem flush,
AND to prevent more dirtying of pages by the filesystem (these are two
distinct problems); otherwise we just lock up under load on low-memory
boxes.

A reservation API which lets all transactional filesystems reserve
the right to dirty a certain number of pages in advance of actually
needing them is really needed to avoid such lockups. The reservation
call can stall if the memory limit has been reached, providing flow
control to the filesystem; and a notification list can start committing
and flushing older transactions when that happens.
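
To make the idea concrete, here is a rough userspace sketch of what such
a reservation interface could look like. Every name in it (dirty_pool,
pool_reserve, pool_register, toy_fs and so on) is made up for
illustration and is not an existing kernel interface. It is also
single-threaded: where a real implementation would put the caller to
sleep until pages are released, this sketch just returns -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch only: none of these names exist in the kernel. */

#define MAX_NOTIFIERS 8

struct dirty_pool;

/* Callback registered by a transactional filesystem. When called, it
 * should commit and flush older transactions and hand the freed pages
 * back with pool_release(). */
typedef void (*flush_fn)(struct dirty_pool *pool, void *fs_private);

struct dirty_pool {
    size_t limit;                    /* max pages reservable for dirtying */
    size_t reserved;                 /* pages currently reserved */
    size_t n_notifiers;              /* entries used in the notification list */
    flush_fn notify[MAX_NOTIFIERS];
    void *priv[MAX_NOTIFIERS];
};

static void pool_init(struct dirty_pool *p, size_t limit)
{
    p->limit = limit;
    p->reserved = 0;
    p->n_notifiers = 0;
}

/* Add a filesystem to the notification list. Returns 0, or -1 if full. */
static int pool_register(struct dirty_pool *p, flush_fn fn, void *fs_private)
{
    if (p->n_notifiers == MAX_NOTIFIERS)
        return -1;
    p->notify[p->n_notifiers] = fn;
    p->priv[p->n_notifiers] = fs_private;
    p->n_notifiers++;
    return 0;
}

/* Give pages back once a transaction has been committed and flushed. */
static void pool_release(struct dirty_pool *p, size_t pages)
{
    assert(pages <= p->reserved);
    p->reserved -= pages;
}

/* Reserve the right to dirty `pages` pages before actually dirtying them.
 * If the limit would be exceeded, walk the notification list so that
 * filesystems flush older transactions, then retry once. Returns 0 on
 * success, -1 where a real implementation would stall the caller. */
static int pool_reserve(struct dirty_pool *p, size_t pages)
{
    size_t i;

    if (pages > p->limit)
        return -1;                   /* can never be satisfied */

    for (i = 0; i < p->n_notifiers && p->reserved + pages > p->limit; i++)
        p->notify[i](p, p->priv[i]);

    if (p->reserved + pages > p->limit)
        return -1;                   /* flow control: caller must wait */

    p->reserved += pages;
    return 0;
}

/* Toy filesystem: remembers how many reserved pages its old,
 * committable transactions are still pinning. */
struct toy_fs {
    size_t pinned;
};

static void toy_flush(struct dirty_pool *p, void *fs_private)
{
    struct toy_fs *fs = fs_private;

    pool_release(p, fs->pinned);     /* pretend commit + flush succeeded */
    fs->pinned = 0;
}
```

The point of the split is that pool_reserve() addresses the "prevent
more dirtying" half of the problem, while the notification list
addresses the "trigger a filesystem flush" half. A filesystem would
call pool_reserve() before starting an operation that dirties pages,
and pool_release() once the corresponding transaction has reached disk.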

Cheers,
 Stephen
