Re: [PATCH v4 2/2] ksm: provide support to use deferrable timers for scanner thread

From: Hugh Dickins
Date: Thu Sep 11 2014 - 09:01:36 EST


On Wed, 10 Sep 2014, Peter Zijlstra wrote:
>
> Does it make sense to drive both KSM and khugepage the same way we drive
> the numa scanning? It has the benefit of getting rid of these threads,
> which pushes the work into the right accountable context (the task its
> doing the scanning for) and makes the scanning frequency depend on the
> actual task activity.

I expect it would be possible: but more work than I'd ever find time
to complete myself, with uncertain benefit.

khugepaged would probably be easier to convert, since it is dealing
with independent mms anyway. Whereas ksmd is establishing sharing
between unrelated mms, so cannot deal with single mms in isolation.

But what's done by a single daemon today, could be passed from task
to task under mutex instead; with probably very different handling
of KSM's "unstable" tree (at present the old one is forgotten at the
start of each cycle, and the new one rebuilt from scratch: I expect
that would have to change, to removing rb entries one by one).
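
To make the difference concrete, here is a toy userspace sketch of the two
policies. It is not ksm.c: the names (unode, utree, utree_forget_all,
utree_remove_one) are hypothetical, and a plain unbalanced binary search
tree stands in for the kernel's rbtree. It assumes, as in the daemon
design, that the nodes themselves are owned elsewhere and only linked
into the tree.

```c
#include <stddef.h>

/* Toy stand-in for the unstable tree; all names here are hypothetical. */
struct unode {
    unsigned long key;              /* e.g. a page checksum */
    struct unode *left, *right;
};

struct utree {
    struct unode *root;
    size_t count;
};

/* Link a caller-owned node into the tree (equal keys go to the right). */
static void utree_insert(struct utree *t, struct unode *n)
{
    struct unode **link = &t->root;

    while (*link)
        link = n->key < (*link)->key ? &(*link)->left : &(*link)->right;
    n->left = n->right = NULL;
    *link = n;
    t->count++;
}

/*
 * Policy A: forget the whole tree at the start of a cycle. O(1), but
 * only safe when one scanner owns the cycle and rebuilds from scratch.
 */
static void utree_forget_all(struct utree *t)
{
    t->root = NULL;
    t->count = 0;
}

/*
 * Policy B: unlink a single node, leaving the rest of the tree valid,
 * which is what scanning handed from task to task would seem to need.
 */
static void utree_remove_one(struct utree *t, struct unode *n)
{
    struct unode **link = &t->root;

    while (*link && *link != n)
        link = n->key < (*link)->key ? &(*link)->left : &(*link)->right;
    if (!*link)
        return;                     /* node was not in the tree */

    if (!n->left) {
        *link = n->right;
    } else if (!n->right) {
        *link = n->left;
    } else {
        /* Two children: splice in the in-order successor. */
        struct unode **s = &n->right;
        struct unode *succ;

        while ((*s)->left)
            s = &(*s)->left;
        succ = *s;
        *s = succ->right;
        succ->left = n->left;
        succ->right = n->right;
        *link = succ;
    }
    t->count--;
}
```

Policy A costs nothing per node but throws away everything learned in
the previous cycle. Policy B keeps the tree usable between removals, at
the price of a search and relink for every entry that goes stale.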

How well it would work out, I can't say with confidence. And I think
we shall need an answer to the power question sooner than we can
turn the design of KSM on its head. Vendors will go with what works
for them, never mind what our principles dictate.

Your suggestion of following the NUMA scanning did make me wonder
if I could use task_work: if there were already a re-arming task_work,
I could probably use that, and escape your gaze :) But I don't think
it exists at present, and I don't think it's an extension that would
be welcomed, and I don't think it would present an efficient solution.

Aesthetically, the most satisfying solution would be for KSM to
write-protect the VM_MERGEABLE areas at some stage (when they "approach
stability", whatever I mean by that), and let itself be woken by the
faults (and if there are no write faults on any of the areas, then
there is no need for it to be woken).

But I think that all those new faults would pose a very significant
regression in performance.
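
A rough userspace analog of the write-protect idea, and of where its
cost comes from, can be sketched with mprotect() and a fault handler.
This is only an illustration, not kernel code: watch_area(), touch() and
write_faults are hypothetical names, Linux-style signal delivery is
assumed, and calling mprotect() from a signal handler is a common idiom
rather than something POSIX guarantees to be async-signal-safe.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static char *area;                          /* the watched region */
static size_t area_len;
static size_t page_size;
static volatile sig_atomic_t write_faults;  /* the "wakeups" */

/* First write to a protected page lands here: count it, then unprotect. */
static void on_fault(int sig, siginfo_t *si, void *ctx)
{
    char *addr = (char *)si->si_addr;
    char *page;

    (void)sig;
    (void)ctx;
    if (addr < area || addr >= area + area_len)
        _exit(1);                           /* a genuine crash, not ours */
    write_faults++;
    page = (char *)((uintptr_t)addr & ~(uintptr_t)(page_size - 1));
    if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
}

/* Map `pages` anonymous pages and write-protect them all. */
static int watch_area(size_t pages)
{
    struct sigaction sa = { 0 };

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    area_len = pages * page_size;
    area = mmap(NULL, area_len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED)
        return -1;

    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0 ||
        sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;

    return mprotect(area, area_len, PROT_READ);
}

/* Volatile store so the write is ordered against reads of write_faults. */
static void touch(size_t off)
{
    *(volatile char *)(area + off) = 1;
}
```

If nothing writes to the area, the handler never runs, which is the
attraction. But every page that is written takes one extra fault per
protection round, and that per-page cost is the regression I am worried
about.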

I don't have a good idea of where else to hook in at present.

Hugh