Re: [PATCH] [Request for inclusion] Filesystem in Userspace

From: Avi Kivity
Date: Tue Nov 30 2004 - 16:42:17 EST


Miklos Szeredi wrote:

you're describing the deadlock here: all memory is full, no process which allocates memory can make any progress.



Yes they, can: the allocation will fail, function will return -ENOMEM,
malloc will return NULL, pagefault will fail with OOM. This is
progress, though not the best sort. It is most certainly _not_ a
deadlock.



Looks like we are in a deadlock here :)

However you choose to call it, it is unacceptable IMO.

This is not a true oom situation: there can be plenty of memory in
dirty pagecache which we could reclaim if we had that tiny bit of
reserve memory.



The amount of reserved memory that would be needed depends upon the
filesystem. Some filesystems would need only very little to be able
to free some memory, some would need a lot (e.g. a bzip2 compressing
filesystem). There's no magic solution with reserving memory.


So the userspace filesystem would pass that amount to the kernel. It's not pretty, but it is workable.

And this is not unique to userspace filesystems, as Rik van Riel
pointed out earlier, network filesystems are also prone to deadlock:

http://lkml.org/lkml/2004/11/27/81



This looks like a bug to me. Maybe jiggling the thresholds would help.

I looked at ramfs, it isn't even limited. You can easily crash your
system just by filling it up with data, but no deadlock will happen.




Right. But ramfs doesn't call a userspace process which calls the kernel right back.



Doesn't matter one little whit. The end result is the same: Out Of
Memory, which is _not_ equivalent to deadlock. Please think it over.


The situation with userspace filesystems is:

some process allocates memory, blocking on kswapd as memory is full
kswapd calls userspace filesystem to free memory
userspace filesystem calls kernel, which allocates memory and blocks on kswapd
eventually all processes in the system block on kswapd

I have observed (and fixed) this on a real system.

with ramfs, once it accounts for memory, there would be no deadlock and no oom.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/