Re: [PATCH v5 0/3] mm, proc: Implement /proc/<pid>/totmaps

From: Sonny Rao
Date: Mon Sep 19 2016 - 16:16:01 EST


On Mon, Sep 19, 2016 at 12:56 PM, Jann Horn <jann@xxxxxxxxx> wrote:
> On Mon, Sep 19, 2016 at 09:51:13PM +0200, Michal Hocko wrote:
>> [not sure why the CC list was trimmed - do no do that please unless you
>> have a strong reason for that - if this was not intentional please
>> restpre it]
>
> Ah, sorry, pressed the wrong key.
>
>
>> On Mon 19-09-16 21:40:01, Jann Horn wrote:
>> > On Mon, Sep 19, 2016 at 09:32:38PM +0200, Michal Hocko wrote:
>> > > On Mon 19-09-16 11:16:31, Robert Foss wrote:
>> > > > On 2016-09-14 05:12 AM, Michal Hocko wrote:
>> > > > > On Tue 13-09-16 13:27:39, Sonny Rao wrote:
>> > > [...]
>> > > > > > Given that smaps
>> > > > > > doesn't provide this in a straightforward way, what do you think is
>> > > > > > the right way to provide this information?
>> > > > >
>> > > > > I would be tempted to sneak it into /proc/<pid>/statm because that looks
>> > > > > like a proper place but getting this information is not for free
>> > > > > performance wise so I am not really sure something that relies on this
>> > > > > file would see unexpected stalls. Maybe this could be worked around by
>> > > > > some caching... I would suggest to check who is actually using this file
>> > > > > (top/ps etc...)
>> > > >
>> > > > What would this caching look like? Can any information be re-used between
>> > > > vma walks?
>> > >
>> > > yes basically return the same value if called within HZ or something
>> > > similar. But that assumes that statm latency really matters and it is
>> > > called often enough.
>> >
>> > That sounds horrible. If some application decides that they want to check
>> > statm directly after some action or so (like after program startup), this is
>> > going to give them a very bad time. That probably doesn't happen
>> > often - but still.
>> >
>> > I can already imagine some developer going "yeah, that usleep()... that's
>> > because the kernel API returns stale information for a couple milliseconds
>> > after we do something *shrug*".
>> >
>> > What are you trying to optimize for? Ten users on the same machine, each of
>> > which is running "top" because it looks so great?
>>
>> Please try to read what I wrote again. I didn't say this would be
>> needed. The idea was that _if_ /proc/<pid>/statm is used very _often_
>> than some caching might help to reduce the overhead. Especially when you
>> consider that the information is not precise anyway. It can change
>> anytime while you are doing the address space walk.

Just thinking out loud here -- I haven't looked closely at the code so
please bear with me :-)

Instead of checking when the last read was and returning old data,
what about a scheme where we still have a timestamp for last stat read
on and any changes to that address space invalidate the timestamp.

The invalidation could be racy because we're not too concerned about
immediate accuracy -- so just a write. The main issue I could see
which this is that it could cause the cacheline holding this timestamp
to bounce around a lot? Maybe there's an existing solution in the
page table locking that could be leveraged here to at least maintain
whatever scalability enhancements are present for this type of
situation where there are many updates happening in parallel.