Re: [PATCHv2, RFC 20/30] ramfs: enable transparent huge page cache

From: Christoph Lameter
Date: Fri Apr 05 2013 - 09:52:28 EST


On Fri, 5 Apr 2013, Minchan Kim wrote:

> > >> How about add a knob?
> > >
> > >Maybe, volunteering?
> >
> > Hi Minchan,
> >
> > I can be the volunteer, what I care is if add a knob make sense?
>
> Frankly sepaking, I'd like to avoid new knob but there might be
> some workloads suffered from mlocked page migration so we coudn't
> dismiss it. In such case, introducing the knob would be a solution
> with default enabling. If we don't have any report for a long time,
> we can remove the knob someday, IMHO.

No Knob please. A new implementation for page pinning that avoids the
mlock crap.

1. It should be available for device drivers to pin their memory (they are
now elevating the ref counter which means page migration will have to see
if it can account for all references before giving up and it does that
quite frequently). So there needs to be an in kernel API, a syscall API as
well as a command line one. Preferably as similar as possible.

2. A sane API for marking pages as mlocked. Maybe part of MMAP? I hate the
command line tools and the APIs for doing that right now.

3. The reservation scheme for mlock via ulimit is broken. We have per
process constraints only it seems. If you start enough processes you can
still make the kernel go OOM.

4. mlock semantics are prescribed by posix which states that the page
stays in memory. I think we should stay with that narrow definition for
mlock.

5. Pinning could also mean that page faults on the page are to be avoided.
COW could occur on fork and page table entries could be instantated at
mmap/fork time. Pinning could mean that minor/major faults will not occur
on a page.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/