Re: [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready

From: Rafael J. Wysocki
Date: Thu Apr 26 2012 - 16:36:13 EST


On Thursday, April 26, 2012, NeilBrown wrote:
> On Sun, 22 Apr 2012 23:22:43 +0200 "Rafael J. Wysocki" <rjw@xxxxxxx> wrote:
>
> > From: Arve HjÃnnevÃg <arve@xxxxxxxxxxx>
> >
> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> > wakeup_source will be active to prevent suspend. This can be used to
> > handle wakeup events from a driver that support poll, e.g. input, if
> > that driver wakes up the waitqueue passed to epoll before allowing
> > suspend.
> >
> > The current implementation uses an extra wakeup_source when
> > ep_scan_ready_list runs. This can cause problems if a single thread
> > is polling on wakeup events and frequent non-wakeup events (events
> > usually arrive during thread freezing) using the same epoll file.
>
> This is quite neat.
>
> If I understand it correctly, you register file descriptors with epoll_ctl()
> on an fd created with epoll_create(), and set the new EPOLLWAKEUP flag.
> Then when a regular 'poll' or 'select' on the epoll fd reports that it is
> readable you:
> - get a wakelock
> - use epoll_wait to collect the events
> - process the events
> - release your wakelock
> - go back to poll() or select() on the epoll fd.
> Correct? As long as there are ready events with EPOLLWAKEUP set a
> wakeup_source is held active and the system won't go to sleep.
>
> My concern with this is about permissions. It appears that any process could
> wait of some fd (maybe a pipe they created themselves) with EPOLLWAKEUP, and
> then simply never epoll_wait() for the event. Then they would be keeping
> the system awake. I don't think that is acceptable.

I wonder what Arve has to say to that, but let me say that on systems without
autosleep every process can go into an infinite busy loop which is going to
drain battery relatively quickly just as well and I don't see why that's so
much different.

> So there needs to be some way to limit who can effectively block suspend by
> using EPOLLWAKEUP.
> (This is one of the reasons I like an all-user-space solution. Policy issues
> like this can easily be decided in user-space but are clumsy to put into the
> kernel).
>
> Also, I'm having trouble understanding the ep->ws wakeup_source.
> The epi->ws makes lots of sense and I think I understand it all.
> However I don't see why you need a wakeup_source for the 'struct eventpoll'.
>
> Every time that 'poll' decides to call the ->poll fop for the eventpoll, this
> wakeup_source will be activated and deactivated which will abort any current
> suspend cycle even if there are no events to report.
>
> I suspect it can just go away.

I'll leave this one entirely to Arve, if you don't mind. :-)

> One last item that doesn't really belong here - but it is in context.
>
> This mechanism is elegant because it provides a single implementation that
> provides wakeup_source for almost any sort of device. I would like to do the
> same thing for interrupts.
> Most (maybe all) of the wakeup device on my phone have an interrupt where the
> body is run in a thread. When the thread has done it's work the event is
> visible to userspace so the EPOLLWAKEUP mechanism is all that is needed to
> complete the path to user-space (or for my user-space solution, nothing else
> is needed once it is visible to user-space).
> So we just need to ensure a clear path from the "top half" interrupt handler
> to the threaded handler.
> So I imagine attaching a wakeup source to every interrupt for which 'wakeup'
> is enabled, activating it when the top-half starts and relaxing it when the
> bottom-half completes. With this in place, almost all drivers would get
> wakeup_source handling for free.
> Does this seem reasonable to you.

Yes, it does.

Wakeup devices have their own wakeup source objects anyway, so perhaps they may
be used for this purpose somehow (just wondering).

> I'm afraid I don't have code yet, but hope to find time in a few weeks.
>
> One difficulty with that is that I have noticed a number of drivers that
> potentially enable_irq_wake just before suspend and disable_irq_wake
> immediately after (e.g. gpio_keys.c). Allocating a wakeup_source on each
> enable_irq_wake would be an unfortunate overhead. Maybe we just allocate it
> the first time enable_irq_wake is called ....

I guess we can do something in analogy with device_wakeup_enable()?

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/