Re: mapping user space buffer to kernel address space

From: Andrea Arcangeli (andrea@suse.de)
Date: Wed Oct 18 2000 - 08:23:17 EST


On Tue, Oct 17, 2000 at 09:42:36PM -0700, Linus Torvalds wrote:
> - get PTE entry, clear it out.
> - if PTE was dirty, add the page to the swap cache, and mark it dirty,
> but DON'T ACTUALLY START THE IO!
> - free the page.
>
> Basically, we removed the page from the virtual mapping, and it's now in
> the LRU queues, and marked dirty there.
>
> Then, we'd move the "writeout" part into the LRU queue side, and at that
> point I agree with you 100% that we probably should just delay it until
> there are no mappings available - ie we'd only write out a swap cache
> entry if the count == 1 (ie it only exists in the swap cache), because
> before that is true there are other people marking it dirty.

This change makes sense and I agree it would cover the problem. However, I
want to make clear that doing it only for the swap cache, as described, is not
nearly enough to cover the mm corruption: everything that gets written out via
a memory balancing mechanism should do the same.

That said, I think it would be possible to do it for SHM and shared mappings too.
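
As an illustration only, the scheme quoted above can be modelled in user space
like this. It is a toy sketch, not kernel code: struct toy_page, struct
toy_pte, toy_unmap_pte() and toy_lru_try_writeout() are hypothetical names
made up for the example, and the reference counting is simplified to a plain
integer.

```c
/* Toy user-space model. None of these names are real kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
    int  count;          /* reference count, standing in for page->count */
    bool dirty;          /* dirty state kept on the page itself */
    bool in_swap_cache;  /* page is reachable from the swap cache / LRU */
    int  writes;         /* how many times the model "started IO" */
};

struct toy_pte {
    struct toy_page *page;  /* NULL when the pte is clear */
    bool dirty;             /* dirty bit as seen in the pte */
};

/*
 * Unmap side: clear the pte, move the dirty state to the page and drop
 * the mapping's reference. No IO is started here.
 */
static void toy_unmap_pte(struct toy_pte *pte)
{
    struct toy_page *page = pte->page;

    assert(page != NULL && page->count > 0);
    if (pte->dirty) {
        if (!page->in_swap_cache) {
            page->in_swap_cache = true;
            page->count++;      /* the swap cache holds its own reference */
        }
        page->dirty = true;
    }
    pte->page = NULL;
    pte->dirty = false;
    page->count--;              /* "free the page": drop the mapping's ref */
}

/* LRU side: write out only when the swap cache is the sole user. */
static bool toy_lru_try_writeout(struct toy_page *page)
{
    if (!page->in_swap_cache || !page->dirty)
        return false;
    if (page->count != 1)
        return false;           /* somebody else can still dirty the page */
    page->writes++;
    page->dirty = false;
    return true;
}
```

In this model a page that still carries an extra reference (for example from
a pin) is refused by toy_lru_try_writeout() until that reference is dropped,
which is the count == 1 rule from the quoted mail.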

However, this still makes me wonder why we should unmap the pte of a page that
we can't free until unmap_kiobuf. That's not as bad as having dummy
nopage+swapout operations in the sound driver DMA page case, though, because
user kiobufs are usually temporary and exist only for the time of the DMA I/O.

> twice, but I think you see what I'm talking about on a conceptual level.

I see.

> See? THAT, in my opinion, is the clean way to handle this all.

Ok. I'm still not completely convinced that it's right to unmap a page that is
"pinned" (get_page()) at the physical layer, but I think the above is
conceptually a good idea regardless of rawio, and if done everywhere it will
save us from having to pin the page in the pte. The only remaining cost of not
pinning the page in the pte is performance, and as said that's probably not a
big deal in real life, as user kiobufs usually stay around only for the
duration of the I/O.
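
To make the pinning argument concrete, here is a rough user-space sketch of a
pin held only for the duration of an I/O. Everything in it is an assumption
made for illustration: struct sketch_page, struct sketch_kiobuf and the
sketch_*() functions are hypothetical and do not reproduce the real kiobuf
interface, and marking the pages dirty at unpin time is a modelling choice,
not a statement about what the kernel does.

```c
/* Toy user-space model. These are not the real kiobuf structures or calls. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PAGES 8

struct sketch_page {
    int  count;   /* 1 means only the swap cache references the page */
    bool dirty;
};

struct sketch_kiobuf {
    struct sketch_page *pages[SKETCH_MAX_PAGES];
    size_t nr_pages;
};

/* Pin: take one extra reference per page, like get_page() in the mail. */
static void sketch_map_kiobuf(struct sketch_kiobuf *iobuf,
                              struct sketch_page *pages, size_t n)
{
    assert(n <= SKETCH_MAX_PAGES);
    iobuf->nr_pages = n;
    for (size_t i = 0; i < n; i++) {
        pages[i].count++;
        iobuf->pages[i] = &pages[i];
    }
}

/*
 * Unpin: the I/O is over. The model assumes the device may have written
 * to the pages, so it marks them dirty before dropping the references.
 */
static void sketch_unmap_kiobuf(struct sketch_kiobuf *iobuf)
{
    for (size_t i = 0; i < iobuf->nr_pages; i++) {
        iobuf->pages[i]->dirty = true;
        iobuf->pages[i]->count--;
        iobuf->pages[i] = NULL;
    }
    iobuf->nr_pages = 0;
}

/*
 * One pass over an LRU array: write out dirty pages whose count is 1.
 * Returns how many pages were written in this pass.
 */
static size_t sketch_lru_scan(struct sketch_page *lru, size_t n)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        if (lru[i].dirty && lru[i].count == 1) {
            lru[i].dirty = false;   /* stands in for starting the IO */
            written++;
        }
    }
    return written;
}
```

While the pin is held, sketch_lru_scan() skips the pinned pages, and they
become eligible for writeout again as soon as sketch_unmap_kiobuf() has run.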

Andrea


