Re: [RFC v3] Support volatile range for anon vma

From: John Stultz
Date: Tue Dec 11 2012 - 13:45:22 EST


On 12/10/2012 06:34 PM, Minchan Kim wrote:
This is still [RFC v3] because it has only passed my simple test
with a tweaked TCMalloc.

I hope for more input from user-space allocator people, and that they
test the patch with their allocators, because getting real value from
it might require changes to their arena management design.

Changelog from v2

* Remove madvise(addr, length, MADV_NOVOLATILE)
* Add a vmstat counter for the number of discarded volatile pages
* Discard volatile pages without promotion in the reclaim path

This is based on v3.6.

- What is madvise(addr, length, MADV_VOLATILE)?

It's a hint that the user gives the kernel so the kernel can *discard*
pages in the range at any time.

- What happens if the user accesses a page (i.e., virtual address)
discarded by the kernel?

The user sees zero-fill-on-demand pages, as with madvise(DONTNEED).

- What happens if the user accesses a page (i.e., virtual address)
that has not been discarded by the kernel?

The user sees the old data without a page fault.

- How does it differ from madvise(DONTNEED)?

System call semantics

DONTNEED guarantees that the user always sees zero-fill pages after
calling madvise, while with VOLATILE the user may see either
zero-fill pages or old data.
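
For reference, the DONTNEED half of that comparison can be checked on a current kernel. The sketch below is only an illustration, not part of the patch: it assumes Linux and a private anonymous mapping, and the helper name dontneed_zero_fills is made up for this example. MADV_VOLATILE exists only as the flag proposed in this RFC, so it appears in a comment rather than in a call.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration.
 * Returns 1 if a private anonymous page reads back as zeros after
 * madvise(MADV_DONTNEED), 0 if the old data survived, -1 on error.
 */
int dontneed_zero_fills(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Dirty the page so there is old data to lose. */
    memset(buf, 0xAB, len);

    if (madvise(buf, len, MADV_DONTNEED) != 0) {
        munmap(buf, len);
        return -1;
    }

    /*
     * DONTNEED: the next access faults in a zero-filled page.
     * Under the proposed MADV_VOLATILE semantics described above,
     * this read could see either zeros or the old 0xAB bytes,
     * depending on whether the kernel had discarded the page.
     */
    int zeroed = 1;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            zeroed = 0;
            break;
        }
    }

    munmap(buf, len);
    return zeroed;
}
```

Calling dontneed_zero_fills() from a small main() should return 1 on Linux. The point of the comparison is that an allocator using the proposed VOLATILE hint could not rely on either outcome.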
I still need to really read and understand the patch, but at a high level I'm not sure how this works. Does the VOLATILE flag get cleared on any access, even if the pages have not been discarded? What happens if an application wants to store non-volatile data in an area that was once marked volatile? If there was never memory pressure, it seems the volatility would persist with no way of removing it.

Either way, I feel that with this revision, specifically dropping the NOVOLATILE call and the SIGBUS optimization the Mozilla folks suggested, your implementation has drifted quite far from the concept I'm pushing. While I hope we can still align the underlying mm implementation, I might ask that you use a different term for the semantics you propose, so we don't add too much confusion to the discussion.

Maybe you could call it DONTNEED_DEFERRED or something?

In the meantime, I'll be reading your patch in detail and seeing how we might be able to combine our differing approaches.

thanks
-john