truncate shows non zero data beyond the end of the inode with MAP_SHARED

From: Andrea Arcangeli
Date: Wed Sep 15 2004 - 07:34:07 EST


Hello,

I've been told we're not POSIX compliant in the way we handle MAP_SHARED
on the last page of the inode. Basically, after we map the page into
userspace, people can make the data beyond i_size non-zero, and we
should clear it in the transition from page_mapcount 1 -> 0. The bug
is that if you truncate-extend, the new data is not guaranteed to
be zero.
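
A rough userspace sketch of how one might check for this, under some
assumptions: the helper name count_nonzero_after_extend, the 100/200 byte
sizes and the 0xAA fill pattern are all made up for illustration, and
whether any non-zero bytes actually show up depends on the kernel and
filesystem in use. It dirties the tail of the last page through a
MAP_SHARED mapping, unmaps it, extends the file with ftruncate, and counts
non-zero bytes in the newly exposed range.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original post).
 * Returns the number of non-zero bytes found in the region exposed by
 * a truncate-extend, or -1 on error. A return of 0 means the kernel
 * handed back zeros as expected.
 */
long count_nonzero_after_extend(const char *path)
{
    const off_t old_size = 100;   /* i_size ends in the middle of page 0 */
    const off_t new_size = 200;   /* still inside page 0 */
    char buf[100];                /* holds bytes old_size..new_size-1 */
    long nonzero = 0;
    char *map;
    ssize_t n;
    size_t i;
    int fd;

    long page = sysconf(_SC_PAGESIZE);
    assert(page > new_size);

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, old_size) != 0)
        goto fail;

    /* Map the last (and only) page shared and scribble past i_size. */
    map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto fail;
    memset(map + old_size, 0xAA, (size_t)(page - old_size));

    /* Drop the mapping: this is the mapcount 1 -> 0 transition. */
    if (munmap(map, (size_t)page) != 0)
        goto fail;

    /* Truncate-extend, then read back the newly exposed bytes. */
    if (ftruncate(fd, new_size) != 0)
        goto fail;
    n = pread(fd, buf, sizeof buf, old_size);
    if (n != (ssize_t)sizeof buf)
        goto fail;

    for (i = 0; i < sizeof buf; i++)
        if (buf[i] != 0)
            nonzero++;

    close(fd);
    unlink(path);
    return nonzero;

fail:
    close(fd);
    unlink(path);
    return -1;
}
```

Calling count_nonzero_after_extend() on a scratch path on the filesystem
under test and printing the result is enough to see whether the extended
range comes back zeroed; any value above 0 would indicate the behaviour
described above.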

msync + power outage, and writing to the page with sys_write at the same
time it's being mapped (and in turn queueing it for pdflush writeout),
are the two worst offenders. To fix those we'd need to mark the pte
read-only, flush the TLB with a worst-case IPI broadcast, writepage, then
mark the pte read-write and flush the TLB again with another IPI
broadcast.

That is going to have a significant cost methinks. So maybe we shouldn't
fix it after all...