Re: [PATCH -next 0/2] fs/epoll: loosen irq safety when possible

From: Davidlohr Bueso
Date: Fri Jul 20 2018 - 16:06:13 EST


On Fri, 20 Jul 2018, Andrew Morton wrote:

> On Fri, 20 Jul 2018 10:29:54 -0700 Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:

>> Hi,
>>
>> Both patches replace saving+restoring interrupts when taking the
>> ep->lock (now the waitqueue lock), with just disabling local irqs.
>> This shows immediate performance benefits in patch 1 for an epoll
>> workload running on Xen.

> I'm surprised. Is spin_lock_irqsave() significantly more expensive
> than spin_lock_irq()? Relative to all the other stuff those functions
> are doing? If so, how come? Some architectural thing makes
> local_irq_save() much more costly than local_irq_disable()?

For example, if you compare x86's native_restore_fl() with xen_restore_fl(),
the Xen variant is considerably more expensive.
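
To illustrate, roughly (a simplified sketch from the 4.18-era sources,
not a verbatim copy): the native path is a couple of instructions,
while the Xen PV path has to update the shared vcpu_info and may end up
trapping into the hypervisor:

/* Bare metal: restoring the flags is just push + popf. */
static inline void native_restore_fl(unsigned long flags)
{
	asm volatile("push %0 ; popf"
		     : /* no output */
		     : "g" (flags)
		     : "memory", "cc");
}

/* Xen PV: the interrupt flag is really the event-channel upcall mask
 * in the shared vcpu_info page, and re-enabling may require forcing
 * the hypervisor to deliver pending upcalls. */
static void xen_restore_fl(unsigned long flags)
{
	struct vcpu_info *vcpu;

	flags = !(flags & X86_EFLAGS_IF);	/* convert from IF-style flag */

	preempt_disable();
	vcpu = this_cpu_read(xen_vcpu);
	vcpu->evtchn_upcall_mask = flags;

	if (flags == 0) {
		barrier();	/* unmask before checking for pending upcalls */
		if (unlikely(vcpu->evtchn_upcall_pending))
			xen_force_evtchn_callback();	/* traps into the hypervisor */
		preempt_enable();
	} else {
		preempt_enable_no_resched();
	}
}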

And in ep_scan_ready_list() at least, the lock is taken and released
twice per call in order to deal with the ovflist when ep->wq.lock is
not held, enough overhead to yield measurable results (see patch 1)
across increasing thread counts.
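
The shape of the change itself is simple; a minimal before/after
sketch of the pattern (not the literal diff):

/* Before: save and restore the caller's irq state around each of the
 * two critical sections per ep_scan_ready_list() call. */
unsigned long flags;

spin_lock_irqsave(&ep->wq.lock, flags);
/* ... splice rdllist onto a private list for scanning ... */
spin_unlock_irqrestore(&ep->wq.lock, flags);

/* ... scan the ready list without holding the lock ... */

spin_lock_irqsave(&ep->wq.lock, flags);
/* ... drain ovflist back onto rdllist ... */
spin_unlock_irqrestore(&ep->wq.lock, flags);

/* After: this path is never entered with interrupts already disabled,
 * so plain irq-disabling locks suffice and the flags go away. */
spin_lock_irq(&ep->wq.lock);
/* ... splice rdllist onto a private list for scanning ... */
spin_unlock_irq(&ep->wq.lock);

/* ... scan the ready list without holding the lock ... */

spin_lock_irq(&ep->wq.lock);
/* ... drain ovflist back onto rdllist ... */
spin_unlock_irq(&ep->wq.lock);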


>> The main concern we need to have with this sort of change in epoll
>> is ep_poll_callback(), which is passed to the waitqueue wakeup and is
>> run very often from irq context; this patch does not touch that call.

> Yeah, these changes are scary. For the code as it stands now, and for
> the code as it evolves.

Yes, which is why I've been throwing lots of epoll workloads at it.
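
For reference, the path that keeps the irq-saving variant is the
wakeup callback, since it is invoked straight from the waitqueue
wakeup, often in hard-irq context; roughly (a simplified sketch of
fs/eventpoll.c, not the full function):

/*
 * Invoked by the waitqueue wakeup machinery, frequently from hard-irq
 * context (e.g. a driver completing I/O), so it cannot assume anything
 * about the caller's irq state and keeps spin_lock_irqsave().
 */
static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode,
			    int sync, void *key)
{
	struct epitem *epi = ep_item_from_wait(wait);
	struct eventpoll *ep = epi->ep;
	unsigned long flags;

	spin_lock_irqsave(&ep->wq.lock, flags);
	/* ... queue epi on rdllist (or ovflist) and wake up waiters ... */
	spin_unlock_irqrestore(&ep->wq.lock, flags);

	return 1;
}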


> I'd have more confidence if we had some warning mechanism for running
> spin_lock_irq() when IRQs are already disabled, which is probably a
> bug. But afaict we don't have that. Probably for good reasons - I
> wonder what they are?
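
Fwiw, a minimal sketch of what such a check could look like
(spin_lock_irq_check() is just an illustrative name, not an existing
kernel facility):

#ifdef CONFIG_DEBUG_SPINLOCK
/* Complain if the non-saving variant is used with irqs already off;
 * the caller almost certainly wanted the irqsave variant instead. */
#define spin_lock_irq_check(lock)					\
do {									\
	WARN_ON_ONCE(irqs_disabled());					\
	spin_lock_irq(lock);						\
} while (0)
#else
#define spin_lock_irq_check(lock)	spin_lock_irq(lock)
#endif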

>> Patches have been tested pretty heavily with the customer workload,
>> microbenchmarks, ltp testcases and two high-level workloads that
>> use epoll under the hood: nginx and libevent benchmarks.
>>
>> Details are in the individual patches.
>>
>> Applies on top of mmotd.

> Please convince me about the performance benefits?

As for numbers, I only have them for patch 1.

Thanks,
Davidlohr