Re: [PATCH] delayacct: track delays from ksm cow
From: David Hildenbrand
Date: Fri Mar 18 2022 - 04:24:55 EST
On 18.03.22 02:41, CGEL wrote:
> On Thu, Mar 17, 2022 at 11:05:22AM +0100, David Hildenbrand wrote:
>> On 17.03.22 10:48, CGEL wrote:
>>> On Thu, Mar 17, 2022 at 09:17:13AM +0100, David Hildenbrand wrote:
>>>> On 17.03.22 03:03, CGEL wrote:
>>>>> On Wed, Mar 16, 2022 at 03:56:23PM +0100, David Hildenbrand wrote:
>>>>>> On 16.03.22 14:34, cgel.zte@xxxxxxxxx wrote:
>>>>>>> From: Yang Yang <yang.yang29@xxxxxxxxxx>
>>>>>>>
>>>>>>> Delay accounting does not track the delay caused by KSM COW. When
>>>>>>> tasks have many KSM pages, they may spend a significant amount of
>>>>>>> time waiting for KSM COW.
>>>>>>>
>>>>>>> To measure the impact of KSM COW on tasks, record the delay when a
>>>>>>> KSM COW fault happens. This can help users decide whether to use
>>>>>>> KSM or not.
>>>>>>>
>>>>>>> Also update tools/accounting/getdelays.c:
>>>>>>>
>>>>>>> / # ./getdelays -dl -p 231
>>>>>>> print delayacct stats ON
>>>>>>> listen forever
>>>>>>> PID 231
>>>>>>>
>>>>>>> CPU               count     real total  virtual total    delay total  delay average
>>>>>>>                    6247     1859000000     2154070021     1674255063        0.268ms
>>>>>>> IO                count    delay total  delay average
>>>>>>>                       0              0            0ms
>>>>>>> SWAP              count    delay total  delay average
>>>>>>>                       0              0            0ms
>>>>>>> RECLAIM           count    delay total  delay average
>>>>>>>                       0              0            0ms
>>>>>>> THRASHING         count    delay total  delay average
>>>>>>>                       0              0            0ms
>>>>>>> KSM               count    delay total  delay average
>>>>>>>                    3635      271567604            0ms
>>>>>>>
>>>>>>
>>>>>> TBH I'm not sure how particularly helpful this is and if we want this.
>>>>>>
>>>>> Thanks for replying.
>>>>>
>>>>> Users may use KSM by calling madvise(, , MADV_MERGEABLE) when they
>>>>> want to save memory; the tradeoff is suffering delays on KSM COW.
>>>>> Users can find out how much memory KSM has saved by reading
>>>>> /sys/kernel/mm/ksm/pages_sharing, but they don't know the cost of the
>>>>> KSM COW delay, which is important for delay-sensitive tasks. If users
>>>>> know both the memory saved and the KSM COW delay, they can make
>>>>> better use of madvise(, , MADV_MERGEABLE).
>>>>
>>>> But that happens after the effects, no?
>>>>
>>>> IOW a user already called madvise(, , MADV_MERGEABLE) and then gets the
>>>> results.
>>>>
>>> Imagine users developing or porting their applications on an
>>> experimental machine: they could take those benchmarks as feedback for
>>> deciding whether to use madvise(, , MADV_MERGEABLE), or over which
>>> range.
>>
>> And why can't they run it with and without KSM and observe performance
>> using existing metrics (or even application-specific metrics)?
>>
>>
> I think the reason we need this patch is the same reason we need the
> swap, reclaim, and thrashing getdelays information. When a system is
> complex, it's hard to tell precisely which kernel activity impacts the
> observed performance or application-specific metrics: preemption?
> cgroup throttling? swap? reclaim? I/O?
>
> So if we can get precise data on each factor's impact, then tuning that
> factor (for this patch, KSM) becomes more efficient.
>
I'm not convinced that we want to make our write-fault handler more
complicated for such a corner case with an unclear, eventual use case.
IIRC, whenever you use KSM you are already agreeing to eventually pay a
performance price, and that price depends heavily on other factors in the
system. Simply looking at the number of write faults might already give
an indication of what changed with KSM being enabled.
Having that said, I'd like to hear other opinions.
--
Thanks,
David / dhildenb