Re: [PATCH 0/4] Convert khugepaged to a task_work function
From: Alex Thorlton
Date: Thu Oct 30 2014 - 14:25:10 EST
On Thu, Oct 30, 2014 at 09:35:44AM +0100, Andi Kleen wrote:
> We already have too many VM tunables. Better would be to switch
> automatically somehow.
>
> I guess you could use some kind of work stealing scheduler, but these
> are fairly complicated. Maybe some simpler heuristics can be found.
That would be a better option in general, but (admittedly without having
thought about it much) I can't think of a good way to determine when to
make that switch. The main problem is that we're not really seeing a
negative performance impact from khugepaged, just some undesired
behavior that is always present.
Perhaps we could make a decision based on the number of remote
allocations made by khugepaged? If we see a lot of allocations to
distant nodes, then maybe we tell khugepaged to stop running scans for a
particular process/mm and let the job handle things itself, either using
the task_work style scan that I've proposed, or just banning khugepaged,
period. Again, I don't think this is a very good way to make the
decision, but it's something to think about.
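Very roughly, the remote-allocation idea could be sketched like this. This
is a userspace illustration only, not kernel code: the struct, the function
names, the notion of a per-mm "home node", and both thresholds are all
hypothetical, and nothing like this exists in khugepaged today.

```c
/* Userspace sketch only; every name and threshold here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MIN_SAMPLES      32  /* assumed: don't decide on too few allocations */
#define REMOTE_PCT_LIMIT 50  /* assumed: give up if >50% of allocs are remote */

/* Per-mm bookkeeping for huge pages that khugepaged allocated. */
struct mm_collapse_stats {
	unsigned long local_allocs;
	unsigned long remote_allocs;
	bool khugepaged_banned;
};

/* Record one collapse allocation; home_node is where the job mostly runs. */
static void record_collapse_alloc(struct mm_collapse_stats *st,
				  int alloc_node, int home_node)
{
	assert(st != NULL);
	if (alloc_node == home_node)
		st->local_allocs++;
	else
		st->remote_allocs++;
}

/*
 * Return true if khugepaged should stop scanning this mm and leave the
 * work to the task itself (e.g. a task_work style scan). Once an mm is
 * banned, it stays banned.
 */
static bool should_skip_mm(struct mm_collapse_stats *st)
{
	unsigned long total;

	assert(st != NULL);
	if (st->khugepaged_banned)
		return true;

	total = st->local_allocs + st->remote_allocs;
	if (total < MIN_SAMPLES)
		return false;

	if (st->remote_allocs * 100 > total * REMOTE_PCT_LIMIT)
		st->khugepaged_banned = true;

	return st->khugepaged_banned;
}
```

The scan loop would call record_collapse_alloc() after each collapse and
check should_skip_mm() before picking up an mm again. The sketch only
distinguishes "home node" from "everything else"; weighting by actual node
distance would be closer to what I described above.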
> BTW my thinking has been usually to actually use more khugepageds to
> scan large address spaces faster.
I hadn't thought of it, but I suppose that is an option as well. Unless
I've completely missed something in the code, I don't think there's a
way to do this now, right? Either way, I suppose it wouldn't be too hard
to do, but this still leaves the window wide open for allocations to be
made far away from where the process really needs them. Maybe if we had
a way to spin up a new khugepaged on the fly, so that users can pin it
where they want it, that would work? Just brainstorming here...
- Alex