Re: running get_user_pages() from kernel thread

From: Hugh Dickins
Date: Tue Jun 16 2009 - 16:47:23 EST


On Tue, 16 Jun 2009, Izik Eidus wrote:
> Hugh Dickins wrote:
> >
> > Looks like Izik and I hit the same problem (otherwise running well):

Ran well enough overnight (with mm/mmap.c vm_flags hack to be merging
everything it could) that, to judge by your remarks below, I ought to
switch away from investigating and fixing and reviewing, to sending
you patches to bring us back in synch.

I don't think I have any problems with the madvise route, which weren't
already problems with the /dev/ksm route: there are oddities I'm eager
to look into (fork still raising questions for me), but nothing serious
enough to get in the way of resynching.

Andrea will be relieved to learn that I like mm_slot->touched very much:
I don't know (and now don't need to know) what was serving that purpose
in the /dev/ksm version.

> Good, This solve another issue that you probably dont hit beacuse you work
> with the madvise version:
> the .release call back of the file_operations strcture will call to:
> ksm_sma_release() that will call to remove_slot_from_hash_and_tree() that will
> do the silly break_cow() call (even that the prcoesses just die!!!)

Yes, I'd noticed before that break_cow() can be silly more work than
necessary, may need care in ordering things. But, if it's still any
issue, it's something that can be optimized later: you have a technique
that goes about it safely, that's good.

> Now beacuse that exit_mm() will set tsk->mm = NULL before the .release will
> get called, we will trigger this path even without the kernel thread context.
> (I prepred patch that just avoid the break_cow() from the .release context,
> but it isnt needed with this patch)
>
> (You shouldnt have that specific problem in the madvise() version beacuse we
> dont have this .release callback anymore, but we still do there useless
> break_cow() calls, considering the fact that the process just die, this
> break_cow() calls should be made only when the user ask specifically to stop
> merging pages i guess...,

Yes, and I'm thinking that although it's fine for madvise(,,MADV_MERGEABLE)
to be slow to get around to merging, probably madvise(,,MADV_UNMERGEABLE)
needs to have broken COW on any KSM pages before the call returns. I've
a suspicion that nobody will ever use MADV_UNMERGEABLE, outside of our
testing; and yet it's against my principles not to provide it, and if it
is used, then I think it needs to give that guarantee before returning.
But again, something we can fill in once we're in synch.

> I will send patch that will make it work more logical on top of your patches
> as soon as you send something)

Right, what would you like me to base against? What if you were to
send me a rollup patch against 2.6.30 of what your tree has now? Or
would you prefer to base against an mmotm? With or without your RFC
patches, or something close to them?

Once I have your base, whichever way you prefer it, then I can put
together a series of patches to adjust that to what I'm now working
with (mainly: the ksm.c end of it would be much as in your RFC though
with tidyups, and minus 4/4; whereas the madvise.c end of it I reworked).

The patches we send to Andrew for mmotm, later on, would be something
different, I believe (and something different from your git history):
there I think we'd be asking him to remove the KSM patches he already
has, and providing fresh equivalents based around the madvise interface
(so, for example, I think there would be no patch at all to mm/rmap.c,
all the changes made there earlier being later reverted).

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/