Re: [RFC v7 00/11] Support vrange for anonymous page
From: John Stultz
Date: Mon Apr 15 2013 - 23:33:34 EST
On 04/14/2013 12:42 AM, Minchan Kim wrote:
Hi KOSAKI,
On Thu, Apr 11, 2013 at 11:01:11AM -0400, KOSAKI Motohiro wrote:
and adding a new syscall invocation is unwelcome.
Sure. But one more system call could be cheaper than page-granularity
operations on a purged range.
I don't think the cost of vrange(VOLATILE) is related to this discussion.
Whether we send SIGBUS or just nuke the pte, the purge should be done in
vmscan, not in the vrange() syscall.
Again, please see MADV_FREE: http://lwn.net/Articles/230799/
It changes the pte and page flags on all pages of the range through
zap_pte_range. So it would make vrange(VOLATILE) expensive, and
the bigger the range, the bigger the cost.
This hadn't crossed my mind. Now try_to_discard_one() inserts vrange
to produce SIGBUS. Then we can insert pte_none() at the same cost too. Am
I missing something?
For your requirement, we need some tracking model to detect whether a page
is currently in use by the process before the VM discards it, *if* we don't
provide the paired vrange(NOVOLATILE) system call (see below). So the
tracking model would have to be set up in vrange(VOLATILE) system call context.
To further clarify Minchan's note here, the reason it's important for the
application to use vrange(NOVOLATILE) is really to help define _when
the range stops being volatile_.
In your libc hack to use vrange(), you see the benefit of not immediately
purging the memory as you do with MADV_DONTNEED. However, if the heap
grows again, and those addresses are re-used, nothing has stopped those
pages from continuing to be volatile. Thus the kernel could then decide
to purge those pages after they start to be used again, and you'd lose
data. I suspect that's not what you want. :)
Rik's MADV_FREE implementation is very similar to vrange(VOLATILE), but
has an implicit vrange(NOVOLATILE) on any page write. So by dirtying a
page, it stops the kernel from later purging it.
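
As a rough illustration of that "write cancels the free" behavior, here
is a minimal sketch using the madvise(MADV_FREE) call found in mainline
Linux (4.5 and later). That interface postdates this thread and may
differ in detail from Rik's original patch; the function name
redirty_after_madv_free() is made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Returns 1 if data written after MADV_FREE is intact, 0 if it is not,
 * and -1 if the mapping or the madvise() call is unavailable here.
 */
int redirty_after_madv_free(size_t len, unsigned char fill)
{
#ifndef MADV_FREE
    (void)len;
    (void)fill;
    return -1;                      /* headers predate MADV_FREE */
#else
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);           /* old contents we no longer need */

    if (madvise(p, len, MADV_FREE) != 0) {
        munmap(p, len);
        return -1;                  /* e.g. EINVAL on kernels before 4.5 */
    }

    memset(p, fill, len);           /* the write cancels the pending free */

    int intact = 1;
    for (size_t i = 0; i < len; i++)
        if (p[i] != fill)
            intact = 0;

    munmap(p, len);
    return intact;
#endif
}
```

Note there is no explicit "un-free" call: the store itself is what takes
the page out of the freeable state, which is the implicit NOVOLATILE
described above.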
This MADV_FREE semantic works very well if you always want zerofill (as
in the case of malloc/free). But for other data, it's important to know
something was lost (as a zero page could be valid data), and that's why
we provide the SIGBUS, as well as the purged notification on
vrange(NOVOLATILE).
In other words, as long as you do a vrange(NOVOLATILE) when you grow the
heap again (before it's used), it should be very similar to the MADV_FREE
behavior, but is more flexible for other use cases.
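
To make the ordering concrete, here is a toy userspace model of the
semantics discussed above. It is only a sketch: vrange() is a proposed
interface, so nothing here calls the kernel, and every name (toy_range,
toy_mark_volatile, toy_mark_nonvolatile, toy_memory_pressure) is
hypothetical. One byte stands in for one page.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only; NOT the proposed kernel interface. */
enum { TOY_PAGES = 4 };

struct toy_range {
    unsigned char data[TOY_PAGES];  /* one byte models one page */
    bool is_volatile;               /* set between VOLATILE and NOVOLATILE */
    bool purged;                    /* set if discarded while volatile */
};

void toy_init(struct toy_range *r)
{
    memset(r, 0, sizeof *r);
}

/* Models vrange(VOLATILE): contents may be discarded from now on. */
void toy_mark_volatile(struct toy_range *r)
{
    r->is_volatile = true;
}

/*
 * Models vrange(NOVOLATILE): ends the volatile period and reports
 * whether the range was purged in the meantime.
 */
bool toy_mark_nonvolatile(struct toy_range *r)
{
    bool was_purged = r->purged;

    r->is_volatile = false;
    r->purged = false;
    return was_purged;
}

/* Models reclaim under pressure: only volatile ranges are discarded. */
void toy_memory_pressure(struct toy_range *r)
{
    if (r->is_volatile) {
        memset(r->data, 0, sizeof r->data);
        r->purged = true;
    }
}
```

In this model, writing to data[] after toy_mark_volatile() but without
toy_mark_nonvolatile() leaves the new contents exposed to
toy_memory_pressure(), which is the data-loss case for a regrown heap.
Calling toy_mark_nonvolatile() first both protects later writes and
returns the purged notification.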
thanks
-john