Re: ramdisk corruption problems - was: RE: pivot_root and initrd kern el panic woes

From: Andrea Arcangeli (andrea@suse.de)
Date: Sun Dec 30 2001 - 19:05:37 EST


On Sun, Dec 30, 2001 at 01:33:24AM -0500, Alexander Viro wrote:
>
>
> On Sat, 29 Dec 2001, Andrew Morton wrote:
>
> > Would it be necessary to preallocate the holes at mmap() time? Mad
> > hand-waving: Could we not perform the instantiation at pagefault time,
> > and give the caller SIGBUS if we cannot allocate the blocks? Or if
> > there's an IO error, or quota exceeded.
>
> Allocation at mmap() Is Not Going To Happen. Consider it vetoed.
> There are applications that use mmap() on large and very sparse
> files.

it's exactly this kind of apps that will be screwed up by silent data
corruption. the point of the holes is to optimize performance and save
space, but they shouldn't introduce the possibilty of data corruption.

Note: I'm fine to introduce another way to notify the app about -ENOSPC,
-ENOSPC on mmap is the most obvious one, but we could still allow the
current "overcommit" behaviour with a kind of sigbus mentioned by
Andrew (possibly not sigbus though, since it has just well defined
semantics for MAP_SHARED, maybe they could be extended, anyways as said
this is only a matter of API). My point is only that some API should be
added because your mmap on sparse files are unreliable at the moment.

> > Question: can someone please define BH_New? Its lifecycle seems
> > very vague. We never actually seem to *clear* it anywhere for
> > ext2, and it appears that the kernel will keep on treating a
> > clearly non-new buffer as "new" all the time. ext3 explicitly
> > clears BH_New in get_block(), if it finds the block was already
> > present in the file. I did this because we need the newness
> > info for internal purposes.
>
> It should be reset when we submit IO. Breakage related to failing
> allocation is indeed not new, but that's a long story. And no,
> "allocate on mmap()" is not a fix.

it is a fix (assuming people will understand/expect this retval :), but
yes, the current overcommit behaviour has some value so I agree some
other async API ala sigbus may be more desiderable.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Dec 31 2001 - 21:00:23 EST