Re: writable swap cache explained (it's weird)

Stephen C. Tweedie (sct@redhat.com)
Thu, 30 Jul 1998 17:00:52 +0100


Hi,

On Wed, 29 Jul 1998 12:41:59 -0700 (PDT), Linus Torvalds
<torvalds@transmeta.com> said:

> On Wed, 29 Jul 1998, Bill Hawes wrote:
>>
>> The trick that xdos is using to get the shared writable pages uses
>> /proc/self/mem as a "file", so that a writable shared mapping appears to
>> be backed by a file. An strace shows the following calls:

> Oh, damn.

> I wanted to remove mmap() support from /proc/self/mem a long time ago
> exactly because it's almost impossible to maintain sane semantics for
> re-mapping it. I guess it's time to do this now (this problem is
> essentially impossible to fix without really ugly hacks - it can turn
> "private" pages into a very perverse sort of "shared" pages which is
> impossible to do any other way).

It's not particularly hard to fix: the current swap cache code can deal
with the non-writable case already (it maintains coherency between COW
pages from different processes as they move between swap and physical
memory).

The main thing standing in the way of fixing this is that the swap cache
does not really understand the concept of dirty/invalid pages (i.e. write
pending and read pending). Currently, it assumes that any shared pages
are readonly. That way, it can do the write to disk as soon as the
initial association between the physical page and the swap entry is set
up, so that all swap cache pages are guaranteed to be consistent between
memory and disk.

To support shared writable pages we'd need to defer the write-to-disk
until the last dereference of the page from the VM, and that would
require us to be able to add the page to the swap cache but mark it as
unsynced. Not much else would have to be changed.
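
A minimal sketch of that deferred write, as a user-space toy model rather
than kernel code. Every name here (toy_page, toy_add_to_swap_cache,
toy_unmap and so on) is hypothetical and invented for illustration; the
"swap slot" is just a second buffer, and a plain counter stands in for
the VM's references to the page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16            /* tiny "page", illustration only */

/* Hypothetical stand-in for a swap-cache page; not a kernel structure. */
struct toy_page {
    char data[TOY_PAGE_SIZE];       /* in-memory contents */
    char swap[TOY_PAGE_SIZE];       /* simulated on-disk swap slot */
    int  map_count;                 /* VM references still holding the page */
    bool unsynced;                  /* memory copy may differ from swap copy */
    int  disk_writes;               /* number of simulated writes to swap */
};

/* Start with a zeroed page shared by `mappers` references. */
static void toy_page_init(struct toy_page *p, int mappers)
{
    memset(p, 0, sizeof *p);
    p->map_count = mappers;
}

/* Simulated write-to-disk: copy memory to the swap slot, mark synced. */
static void toy_write_to_swap(struct toy_page *p)
{
    memcpy(p->swap, p->data, TOY_PAGE_SIZE);
    p->unsynced = false;
    p->disk_writes++;
}

/* Associate the page with its swap entry WITHOUT writing it out:
 * the page enters the cache marked unsynced instead. */
static void toy_add_to_swap_cache(struct toy_page *p)
{
    p->unsynced = true;
}

/* A store through any shared writable mapping dirties the page again. */
static void toy_store(struct toy_page *p, size_t off, char byte)
{
    p->data[off] = byte;
    p->unsynced = true;
}

/* Drop one VM reference. The write to swap is deferred until the last
 * reference goes away. Returns true if this call performed the write. */
static bool toy_unmap(struct toy_page *p)
{
    if (--p->map_count > 0)
        return false;               /* other mappers may still write */
    if (!p->unsynced)
        return false;               /* already consistent with swap */
    toy_write_to_swap(p);
    return true;
}
```

With two mappers, the first toy_unmap() writes nothing and only the
second one flushes the page, so stores made in between still reach the
swap copy. The readonly case described earlier corresponds to calling
toy_write_to_swap() at the moment of toy_add_to_swap_cache() instead.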

It's not an enormous job, but it's one which is almost certainly better
left to 2.3.

> And no, we can't just mark it unswappable, as that would open us up to
> some rather nasty security problems.

What about making it unswappable but restricting it to root processes
only? For now we can apply the same restriction that we make for
mlock().
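
The shape of such a check, sketched under the assumption that "root
only" means an effective UID of 0. The function name toy_may_pin_pages
is hypothetical, and a real kernel may express the mlock() restriction
differently (for example as a capability check).

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical policy check, not a kernel function: allow an
 * unswappable mapping only for root, refuse everyone else. */
static int toy_may_pin_pages(uid_t euid)
{
    return euid == 0 ? 0 : -EPERM;
}
```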

--Stephen
