[PATCH v3 2/3] epoll: restrict wakeups to the overflow list

From: Jason Baron
Date: Tue Feb 24 2015 - 16:25:56 EST


During ep_scan_ready_list(), while ep->lock is dropped, new events are
queued on ep->ovflist. However, instead of issuing wakeups only for these
newly queued events, we issue wakeups whenever the ready list is non-empty,
even if nothing new is being propagated.

Normally, this only results in unnecessary calls to wakeup. With the
current default behavior of always waking up all threads, the extra calls
have no adverse effect beyond the call overhead. However, we wish to add
wakeup queues that carry state (for example, policies that rotate wakeups
among epoll sets), and for those the unnecessary wakeups cause unwanted
state transitions.

Signed-off-by: Jason Baron <jbaron@xxxxxxxxxx>
---
fs/eventpoll.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
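
For illustration only, and not part of the patch: a minimal userspace
sketch of why the wakeup condition matters once a wakeup policy carries
state. Everything here is hypothetical (struct rr_policy, rr_wake(),
scan(), run() do not exist in fs/eventpoll.c); it assumes a round-robin
policy whose cursor advances on every wakeup call, and a ready list that
stays non-empty across passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stateful policy: rotate wakeups among NR_WAITERS waiters. */
#define NR_WAITERS 3

struct rr_policy {
	int next;               /* waiter that receives the next wakeup */
	int woken[NR_WAITERS];  /* wakeups delivered to each waiter */
};

/* Every call advances the rotation, whether or not a new event exists. */
void rr_wake(struct rr_policy *p)
{
	p->woken[p->next]++;
	p->next = (p->next + 1) % NR_WAITERS;
}

/*
 * Model of the wakeup decision at the end of one scan pass.
 * rdllist_nonempty: events remain on the ready list after the scan.
 * ovflist_nonempty: events arrived while the scan callback ran.
 * gate_on_new:      false models the old check, true the new one.
 */
void scan(struct rr_policy *p, bool rdllist_nonempty,
	  bool ovflist_nonempty, bool gate_on_new)
{
	bool newly_ready = ovflist_nonempty;

	if (gate_on_new ? newly_ready : rdllist_nonempty)
		rr_wake(p);
}

/* Run four passes with one new event; return the total wakeups issued. */
int run(bool gate_on_new)
{
	/* Assumption: only the second pass sees a new event. */
	static const bool new_event[] = { false, true, false, false };
	struct rr_policy p = { 0 };
	int total = 0;
	size_t i;

	for (i = 0; i < sizeof(new_event) / sizeof(new_event[0]); i++)
		scan(&p, true, new_event[i], gate_on_new);

	assert(p.next >= 0 && p.next < NR_WAITERS);
	for (i = 0; i < NR_WAITERS; i++)
		total += p.woken[i];
	return total;
}
```

In this model, run(false) issues four wakeups for a single new event and
so rotates the policy three extra times, while run(true) issues exactly
one.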

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index d77f944..da84712 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -594,7 +594,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
struct list_head *, void *),
void *priv, int depth, bool ep_locked)
{
- int error, pwake = 0;
+ int error, pwake = 0, newly_ready = 0;
unsigned long flags;
struct epitem *epi, *nepi;
LIST_HEAD(txlist);
@@ -634,6 +634,13 @@ static int ep_scan_ready_list(struct eventpoll *ep,
for (nepi = ep->ovflist; (epi = nepi) != NULL;
nepi = epi->next, epi->next = EP_UNACTIVE_PTR) {
/*
+ * We only need to perform wakeups if new events have arrived
+ * while the ep->lock was dropped. We should have already
+ * issued the wakeups for any existing events.
+ */
+ if (!newly_ready)
+ newly_ready = 1;
+ /*
* We need to check if the item is already in the list.
* During the "sproc" callback execution time, items are
* queued into ->ovflist but the "txlist" might already
@@ -657,7 +664,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
list_splice(&txlist, &ep->rdllist);
__pm_relax(ep->ws);

- if (!list_empty(&ep->rdllist)) {
+ if (newly_ready) {
/*
* Wake up (if active) both the eventpoll wait list and
* the ->poll() wait list (delayed after we release the lock).
--
1.8.2.rc2
