RE: [PATCH -v2 -mm] add extra free kbytes tunable
From: Satoru Moriya
Date: Wed Oct 26 2011 - 15:00:09 EST
On 10/25/2011 05:50 PM, David Rientjes wrote:
> On Mon, 24 Oct 2011, Satoru Moriya wrote:
>>>> We do.
>>>> Basically we need this kind of feature for almost all our latency
>>>> sensitive applications to avoid latency issue in memory allocation.
>>> These are all realtime?
>> Do you mean that these are all realtime processes?
>> If so, the answer depends on the situation. In some situations,
>> we can set these applications as rt-tasks. But in other situations,
>> e.g. when using middleware, packaged software, etc., we can't set them
>> as rt-tasks because they are not built to run as rt-tasks. It is also
>> difficult to rebuild them to work as rt-tasks because they
>> usually have a huge code base.
> If this problem affects processes that aren't realtime, then your only
> option is to increase /proc/sys/vm/min_free_kbytes. It's unreasonable to
> believe that the VM should be able to reclaim in the background at the
> same rate that an application is allocating huge amounts of memory without
> allowing there to be a buffer. Adding another tunable isn't going to
> address that situation better than min_free_kbytes.
Even if allocating memory in user space causes latency issues, the
allocation burst itself usually does not last long. Therefore, if we
can keep enough free memory, we can avoid the latency issue in this situation.
Raising min_free_kbytes makes the min wmark bigger too. That means the amount of
memory user processes can use without penalty (direct reclaim) decreases
unnecessarily, and this is what we'd like to avoid.
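
To make the difference concrete, here is a rough sketch of the watermark
arithmetic. It is a simplified model, not the kernel code: it assumes a
single zone that holds all memory, 4 kB pages, and the formulas
low = min + extra + min/4 and high = min + extra + min/2. The function
name and the example values are hypothetical and chosen only for
illustration.

```python
def watermarks(min_free_kbytes, extra_free_kbytes=0, page_kb=4):
    """Return (min, low, high) watermarks in pages.

    Simplified model: one zone, no lowmem reserve, no per-zone scaling.
    The formulas below are an assumption about how the tunables combine.
    """
    min_pages = min_free_kbytes // page_kb
    extra_pages = extra_free_kbytes // page_kb
    low = min_pages + extra_pages + min_pages // 4
    high = min_pages + extra_pages + min_pages // 2
    return min_pages, low, high


# Goal in both cases: a 64 MB (16384 page) buffer between the low and
# min wmarks, i.e. the room kswapd has before direct reclaim starts.

# Option A: reach the buffer by raising min_free_kbytes only.
a_min, a_low, a_high = watermarks(min_free_kbytes=262144)

# Option B: keep min_free_kbytes small and add extra_free_kbytes.
b_min, b_low, b_high = watermarks(min_free_kbytes=16384,
                                  extra_free_kbytes=61440)

print("A: min=%d low=%d high=%d buffer=%d" % (a_min, a_low, a_high, a_low - a_min))
print("B: min=%d low=%d high=%d buffer=%d" % (b_min, b_low, b_high, b_low - b_min))
```

In this model both options give the same low - min buffer, but option A
pushes the min wmark to 65536 pages (256 MB) while option B leaves it at
4096 pages (16 MB). The memory below the min wmark is what user
processes cannot touch without entering direct reclaim.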
>> As I reported in another mail, changing the kswapd priority does not mitigate
>> even my simple testcase very much. Of course, reclaiming above the high
>> wmark may solve the issue on some workloads, but if an application can
>> allocate more memory than the extended (high wmark - min wmark) gap, and do so
>> fast enough, the latency issue will still happen.
>> Unless this latency concern is fixed, customers doesn't use vanilla
> And you have yet to provide an expression that shows what a sane setting
> for this tunable will be. In fact, it seems like you're just doing trial
> and error and finding where it works pretty well for a certain VM
> implementation in a certain kernel. That's simply not a maintainable
> userspace interface!
Trial and error is what tuning is. When we tune a system, we usually set
some knobs, run some benchmarks/tests/etc., evaluate the results, and
decide which configuration is best.