Re: [PATCH] Use of getblk differs between locations

From: Anton Altaparmakov
Date: Tue Oct 11 2005 - 03:08:38 EST


On Tue, 2005-10-11 at 03:20 +0200, Mikulas Patocka wrote:
> >> I liked what linux-2.0 did in this case --- if the kernel was out of
> >> memory, getblk just took another buffer, wrote it if it was dirty and used
> >> it. Except for writeable loopback device (where writing one buffer
> >> generates more dirty buffers), it couldn't deadlock.
> >
> > Wouldn't it be better if bread() were to return ERR_PTR(-EIO) or
> > ERR_PTR(-ENOMEM)? Big change.
>
> No. Out of memory condition can happen even under normal circumstances
> under lightly loaded system. Think of a situation when dirty file-mapped
> pages fill up the whole memory, now a burst of packets from network comes
> that fills up kernel atomic reserve, you have zero pages free --- and what
> now? --- returning ENOMEM and dropping dirty pages without writing them is
> wrong,

Oh, don't be stupid. You would just redirty the page and return. See
for example the writepage implementation in fs/ntfs/aops.c...

> deadlocking (filesystem waits until memory manager frees some pages
> and memory manager waits until filesystem writes the dirty pages) is wrong
> too.

That would never really happen. The probability of every single dirty
page on the system being tied into needing a memory allocation is very
close to zero... It is just a theoretical deadlock.

> The filesystem must make sure that it doesn't need any memory to do block
> allocation and data write. Linux-2.0 got this right, it could do getblk
> and bread even if get_free_pages constantly failed.

That is impossible to do for complex filesystems. Every filesystem
would have to pre-allocate masses of memory from all sorts of places
(you would need pages that can be plugged into the page cache and/or you
would need buffers and/or you would need bios which you are not allowed
to hold onto so you would already have a problem here, etc...) to be
able to do that and that would impact system performance greatly unless
you had ridiculous amount of ram to start with in which case you would
not need to bother with this complexity as you would never oom anyway.

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/