Re: [PATCH 0/8] mm: add page cache limit and reclaim feature
From: Xishi Qiu
Date: Sun Jun 22 2014 - 22:06:52 EST
On 2014/6/20 23:32, Michal Hocko wrote:
> On Fri 20-06-14 15:56:56, Xishi Qiu wrote:
>> On 2014/6/17 9:35, Xishi Qiu wrote:
>>
>>> On 2014/6/16 20:50, Rafael Aquini wrote:
>>>
>>>> On Mon, Jun 16, 2014 at 01:14:22PM +0200, Michal Hocko wrote:
>>>>> On Mon 16-06-14 17:24:38, Xishi Qiu wrote:
>>>>>> When a system (e.g. a smartphone) runs for a long time, the page cache
>>>>>> often takes up a large share of memory and free memory may drop below 50M.
>>>>>> An OOM will then happen if an app suddenly allocates high-order pages and
>>>>>> memory reclaim is too slow.
>>>>>
>>>>> Have you ever seen this happen? Page cache should be easy to reclaim and
>>>>> if there is too much dirty memory then you should be able to tune the
>>>>> amount with the dirty_bytes/ratio knobs. If the page allocator falls back to
>>>>> OOM and there is a lot of page cache then I would call it a bug. I do
>>>>> not think that limiting the amount of the page cache globally makes
>>>>> sense. There are Unix systems which offer this feature but I think it is
>>>>> a bad interface which only papers over the reclaim inefficiency or lack
>>>>> of other isolations between loads.
>>>>>
>>>> +1
>>>>
>>>> It would be good if you could show some numbers that serve as evidence
>>>> of your theory on "excessive" pagecache acting as a trigger to your
>>>> observed OOMs. I'm assuming, by your 'e.g', you're running a swapless
>>>> system, so I would think your system OOMs are due to inability to
>>>> reclaim anon memory, instead of pagecache.
>>>>
>>
>> I asked some colleagues. When the cache takes up a large amount of memory,
>> it does not trigger OOM, but it does cause a performance regression.
>>
>> This is because the business process does IO at high frequency, which
>> grows the page cache. When there is not enough memory, page cache has to
>> be reclaimed first before a new page can be allocated and added to the
>> page cache. This often takes too much time and causes the regression.
>
> I cannot say I understand the problem you are describing. So the
> page cache eats most of the memory and that increases allocation
> latency for new page cache? Is it because of the direct reclaim?
Yes, allocation latency causes the performance regression.
A user process generates page cache frequently, so free memory runs short
after the system has been running for a long time. The slow path takes much
more time because of direct reclaim. kswapd reclaims memory too, but not
much, so allocations keep hitting the slow path. This is what causes the
performance regression.
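
To back the direct reclaim claim with numbers, the reclaim counters in
/proc/vmstat could be sampled before and after the workload. Below is a
minimal sketch of that idea, not a measurement from this system. It assumes
the counters are named allocstall*, pgscan_direct* and pgscan_kswapd*; the
exact names and per-zone suffixes differ between kernel versions, so it sums
by prefix. The helper names (parse_vmstat, sum_prefix, reclaim_delta,
read_vmstat) are made up for illustration.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def sum_prefix(counters, prefix):
    """Sum every counter whose name starts with prefix.

    Assumption: per-zone variants share the prefix (e.g. a "_normal" suffix).
    Names ending in "_throttle" count throttling events rather than scanned
    pages, so they are skipped.
    """
    return sum(
        value
        for name, value in counters.items()
        if name.startswith(prefix) and not name.endswith("_throttle")
    )


def reclaim_delta(before, after):
    """Return how much each reclaim counter group grew between two samples."""
    groups = ("allocstall", "pgscan_direct", "pgscan_kswapd")
    return {g: sum_prefix(after, g) - sum_prefix(before, g) for g in groups}


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())
```

Calling read_vmstat() once before and once after the IO-heavy phase and
passing both samples to reclaim_delta() gives the growth per group. If
allocstall and pgscan_direct grow much faster than pgscan_kswapd, that would
be consistent with allocations stalling in direct reclaim while kswapd falls
behind.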
Thanks,
Xishi Qiu
> Why doesn't kswapd reclaim the clean pagecache? Or is the memory dirty?
>
>> In view of this situation, reclaiming page cache periodically may
>> fix this problem. What do you think?
>
> No, it seems more like either system misconfiguration or a reclaim bug.