RE: [RFC 1/3] /dev/low_mem_notify

From: leonid.moiseichuk
Date: Thu Jan 19 2012 - 06:55:51 EST


> -----Original Message-----
> From: penberg@xxxxxxxxx [mailto:penberg@xxxxxxxxx] On Behalf Of ext
> Pekka Enberg
> Sent: 19 January, 2012 13:08
...
> > 1. rename this API from low_mem_pressure to something more related to
> > notification and memory situation in system: memory_pressure,
> > memnotify, memory_level etc. The word "low" is misleading here
>
> The thing is called vmevent:

Yes, I see it. I was confused by vmnotify_fops and assumed it was exposed through /dev. I see now that it is an anonymous inode.

>
> On Thu, Jan 19, 2012 at 12:53 PM, <leonid.moiseichuk@xxxxxxxxx> wrote:
> > 2. API must use deferred timers to prevent use-time impact. Deferred
> > timer will be triggered only in case HW event or non-deferrable timer,
> > so if device sleeps timer might be skipped and that is what expected
> > for user-space
>
> I'm currently looking at the possibility of hooking VM events to perf which
> also uses hrtimers. Can't we make hrtimers do the right thing?

I have no answer to this question. According to hrtimer_cpu_notify the CPU state is tracked, but a timer may still program a HW event to wake the device up.
In that case use-time will suffer, because there will be too many HW events and reasons to wake up.
At least powertop reports hrtimers, attributed to <kernel core>, as a source of activity.

>
> On Thu, Jan 19, 2012 at 12:53 PM, <leonid.moiseichuk@xxxxxxxxx> wrote:
> > 3. API should be tunable for propagate changes when level is Up or
> > Down, maybe both ways.
>
> Agreed.
>
> On Thu, Jan 19, 2012 at 12:53 PM, <leonid.moiseichuk@xxxxxxxxx> wrote:
> > 4. to avoid triggering too much events probably has sense to filter
> > according to amount of change but that is optional. If subscriber set
> > timer to 1s the amount of events should not be very big.
>
> Agreed.
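
To make points 3 and 4 concrete, here is roughly the filtering I have in mind, written as a plain userspace C sketch. All names (level_filter, NOTIFY_UP, NOTIFY_DOWN, should_notify) are hypothetical and not taken from the vmevent tree; the sampled value could be free pages or any other level.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical direction flags a subscriber could request. */
enum { NOTIFY_UP = 1, NOTIFY_DOWN = 2 };

struct level_filter {
	unsigned dirs;       /* NOTIFY_UP, NOTIFY_DOWN or both */
	uint64_t min_delta;  /* ignore changes smaller than this */
	uint64_t last;       /* baseline: last value that passed the delta check */
};

/*
 * Called once per sample (e.g. from a 1s timer). Returns true when the
 * subscriber should get an event for the new value.
 */
static bool should_notify(struct level_filter *f, uint64_t now)
{
	uint64_t delta = now > f->last ? now - f->last : f->last - now;
	unsigned dir = now > f->last ? NOTIFY_UP : NOTIFY_DOWN;

	/* No change, or change too small: keep the old baseline. */
	if (delta == 0 || delta < f->min_delta)
		return false;

	/* Move the baseline even if this direction is not subscribed,
	 * so the next change is measured from the current level. */
	f->last = now;
	return (f->dirs & dir) != 0;
}
```

With min_delta set to 0 this degrades to plain direction filtering, so the delta filter stays optional as proposed.
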
>
> On Thu, Jan 19, 2012 at 12:53 PM, <leonid.moiseichuk@xxxxxxxxx> wrote:
> > 5. API must provide interface to request parameters e.g. available
> > swap or free memory just to have some base.
>
> The current ABI already supports that. You can specify which attributes
> you're interested in and they will be delivered as part of the event.

But vmnotify.h has a suspicious free_pages_threshold field.

>
> On Thu, Jan 19, 2012 at 12:53 PM, <leonid.moiseichuk@xxxxxxxxx> wrote:
> > 6. I do not understand how work with attributes performed ( ) but it
> > has sense to use mask and fill requested attributes using mask and
> > callback table i.e. if free pages requested - they are reported, otherwise
> > not.
>
> That's how it works now in the git tree.

vmnotify.c has vmnotify_watch_event, which collects a fixed set of parameters.
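
What I meant by mask and callback table is something like the sketch below. It is plain userspace C for illustration only: the attribute bits, the collector functions and collect_attrs are hypothetical names, not what vmnotify.h defines, and the collectors return fixed numbers instead of real VM counters.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute bits; ATTR_COUNT is the number of known bits. */
enum {
	ATTR_FREE_PAGES = 1u << 0,
	ATTR_SWAP_FREE  = 1u << 1,
	ATTR_CACHED     = 1u << 2,
	ATTR_COUNT      = 3
};

typedef uint64_t (*attr_fn)(void);

/* Stand-in collectors with made-up values; real ones would read VM state. */
static uint64_t get_free_pages(void) { return 1000; }
static uint64_t get_swap_free(void)  { return 2000; }
static uint64_t get_cached(void)     { return 3000; }

/* Callback table indexed by bit position in the mask. */
static const attr_fn attr_table[ATTR_COUNT] = {
	get_free_pages,
	get_swap_free,
	get_cached,
};

/*
 * Fill out[] with only the attributes requested in mask, in bit order.
 * Returns the number of values written (at most out_len).
 */
static size_t collect_attrs(uint32_t mask, uint64_t *out, size_t out_len)
{
	size_t n = 0;

	for (unsigned bit = 0; bit < ATTR_COUNT && n < out_len; bit++) {
		if (mask & (1u << bit))
			out[n++] = attr_table[bit]();
	}
	return n;
}
```

A subscriber asking for ATTR_FREE_PAGES | ATTR_CACHED then gets two values and the swap collector is never called, so nothing is gathered that was not requested.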

> I'm currently looking at how to support Minchan's non-sampled events. It
> seems to me integrating with perf would be nice because we could simply
> use tracepoints for this.

If tracepoints do not jeopardize use-time, it makes sense to do it.

>
> Pekka