Re: [PATCH] [Request for inclusion] Filesystem in Userspace

From: Miklos Szeredi
Date: Sun Nov 21 2004 - 02:45:09 EST



> Ok, this one works, I agree... But it will be way slower than coda's
> file-backed approach, right?

No. In both cases there will be two memory copies:

- from userspace fs to pagecache
- from pagecache to read buffer

And vica versa for write. FUSE will have some overhead of context
switches, but in the optimal case (sequential read), the readahead
code will bundle read requests, and with a 128MByte read, the cost of
a context switch is probably infinitesimal.

> > In the second case there is no deadlock, because the memory subsystem
> > doesn't wait for data to be written. If the filesystem refuses to
> > write back data in a timely manner, memory will get full and OOM
> > killer will go to work. Deadlock simply cannot happen.
>
> Hmmm, so if userspace part is swapped out and data is dirtied
> "too quickly", OOM is practically guaranteed? That is not nice.

Yeah. I imagined, that the kernel won't assign all it's free pages to
file mappings, but people on the list enlightened me to the contrary.

There seem to be two kinds of solutions:

1) make the userspace part aware of the memory allocation problems

2) limit the number of pages that can be dirtied


I don't really like 1) because that really kills the point of
implementing the filesystem in userspace.

So I would go along the lines of 2). However there is no way to know
when pages are dirtied (ther is no fault), so accounting the dirty
pages exactly is not possible. However accounting the _writable_
pages should be possible with no overhead, since there is a fault when
the page of a mapping is first touched.

Limiting these pages for FUSE, would mean that the user could do
writable mmap, but with the performance penaly of having a limited
number of pages present in memory. If the userspace FS refuses to
write back data, the filesystem user will be blocked until the
writeback is completed. No deadlock (hopefully).

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/