RE: [RFC 1/3] /dev/low_mem_notify

From: leonid.moiseichuk
Date: Wed Jan 18 2012 - 04:43:57 EST


> -----Original Message-----
> From: penberg@xxxxxxxxx [mailto:penberg@xxxxxxxxx] On Behalf Of ext
> Pekka Enberg
> Sent: 18 January, 2012 11:16
...
> > Would it be possible not to use percentages for thresholds? Accounting in
> > pages is not so difficult even for user-space.
>
> How does that work with memory hotplug?

No worse than with percentages. For example, suppose you had a 10% free-memory threshold on 512 MB of RAM, i.e. 51.2 MB in absolute terms.
If hotplug then turns off 256 MB, you must update the percentage threshold anyway, because 10% is now 25.6 MB, which most likely will not suit the different operating mode.
Using pages makes the calculations much simpler.
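
To make the arithmetic concrete, here is a minimal user-space sketch. It assumes 4 KB pages, and the helper names are made up for illustration; nothing here comes from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ULL          /* assumption: 4 KB pages */
#define MB              (1024ULL * 1024ULL)

/* Convert a RAM size in bytes to a page count. */
static uint64_t bytes_to_pages(uint64_t bytes)
{
        return bytes / PAGE_SIZE_BYTES;
}

/* Absolute threshold, in pages, implied by a percentage of total RAM. */
static uint64_t percent_to_pages(uint64_t total_pages, unsigned int percent)
{
        assert(percent <= 100);
        return total_pages * percent / 100;
}
```

With 512 MB, percent_to_pages(bytes_to_pages(512 * MB), 10) gives 13107 pages; after hotplug removes 256 MB the same 10% silently becomes 6553 pages. A threshold stored in pages does not move unless user-space changes it.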

>
> On Wed, Jan 18, 2012 at 11:06 AM, <leonid.moiseichuk@xxxxxxxxx> wrote:
> > Also, looking at vmnotify_match, I understand that events are propagated to
> > user-space only when the threshold trigger changes state from 0 to 1, but
> > not back; 1 -> 0 is a very useful event as well
(*)

> >
> > Would it be possible to use pointed value(s) for thresholds, e.g. according
> > to enum zone_state_item, because the kinds of memory to track could differ?
> > E.g. for tracking paging activity, NR_ACTIVE_ANON and NR_ACTIVE_FILE could
> > be interesting, not only free memory.
>
> I don't think there's anything in the ABI that would prevent that.

If this statement also applies to my question (*), I have to point out the need to track attribute history; otherwise user-space will be constantly kicked with updates.
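
A rough sketch of what I mean by tracking history: keep the last reported state per threshold and signal only when it flips, in either direction. The structure and function names are hypothetical, not taken from the vmnotify code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-threshold watch; not a structure from the patch. */
struct threshold_watch {
        uint64_t threshold;     /* level in pages */
        bool     below;         /* last reported state: value < threshold */
};

/*
 * Return true only when the state changes (0 -> 1 or 1 -> 0) and
 * remember the new state, so repeated samples on the same side of
 * the threshold generate no further events.
 */
static bool watch_update(struct threshold_watch *w, uint64_t value)
{
        bool now_below;

        assert(w);
        now_below = value < w->threshold;

        if (now_below == w->below)
                return false;   /* same state as last time: no event */

        w->below = now_below;
        return true;
}
```

With a threshold of 1000 pages and samples of 1500, 900, 800 and 1200, only the second and fourth samples would produce an event.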

> I actually changed the ABI to look like this:
>
> struct vmnotify_event {
>         /*
>          * Size of the struct for ABI extensibility.
>          */
>         __u32 size;
>
>         __u64 attrs;
>
>         __u64 attr_values[];
> };
>
> So userspace can decide which fields to include in notifications.

Good. But how can you provide the current status of attributes to user-space? read() support is needed to deliver all supported attr_values[] on demand.
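
For illustration, this is how user-space might pick one value out of such an event once it has been read. It is only a sketch under a strong assumption that the mail does not confirm: that attr_values[] holds one entry per bit set in attrs, packed in ascending bit order. The helper names are invented.

```c
#include <assert.h>
#include <stdint.h>

/* Count the bits set in v. */
static unsigned int count_bits(uint64_t v)
{
        unsigned int n = 0;

        while (v) {
                v &= v - 1;     /* clear the lowest set bit */
                n++;
        }
        return n;
}

/*
 * Look up the value for attribute number 'bit'.
 * ASSUMPTION: values[] is packed in ascending bit order of attrs.
 * Returns 0 and stores the value in *out, or -1 if the attribute
 * is not present in this event.
 */
static int attr_lookup(uint64_t attrs, const uint64_t *values,
                       unsigned int bit, uint64_t *out)
{
        uint64_t mask;

        assert(bit < 64);
        mask = 1ULL << bit;

        if (!(attrs & mask))
                return -1;

        /* Index = number of present attributes below this one. */
        *out = values[count_bits(attrs & (mask - 1))];
        return 0;
}
```

With the struct quoted above, the call would be attr_lookup(ev->attrs, ev->attr_values, bit, &value) after a successful read().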

> >> +
> >> +#ifdef CONFIG_SWAP
> >> + si_swapinfo(&si);
> >> + event.nr_swap_pages = si.totalswap;
> >> +#endif
> >> +
> >
> > Why not use global_page_state() directly? si_meminfo() and especially
> > si_swapinfo() are quite expensive calls.
>
> Sure, we can do that. Feel free to send a patch :-).

Once I see the code, because it is quite difficult to understand from emails alone.
In the short term I need to focus on integrating the "memnotify" version internally, which already kind of works for me and provides all the interfaces the N9 needs.

Btw, once the API starts to work with pointed thresholds, it is logically no longer low_mem_notify; you will need to invent some other name.

> No idea what happens. The sampling code is just a proof of concept thing and
> I expect it to be buggy as hell. :-)
>
> Pekka