Re: serial percpu_ref draining in exit_aio()

From: Jens Axboe
Date: Thu Mar 19 2015 - 16:26:07 EST


On 03/19/2015 11:34 AM, Tejun Heo wrote:
> Hello,
>
> So, Jens noticed that fio process exiting takes seconds when there are
> multiple aio contexts and the culprit seems to be the serial
> percpu_ref draining in exit_aio(). It's generally a bad idea to
> expose RCU latencies to userland because they add up really quickly
> and are unrelated to other performance parameters. Can you guys
> please at least update the code so that it waits for all percpu_refs
> to drain at the same time rather than one after another? That should
> resolve the worst part of the problem.

This works for me. Before:

real 0m5.872s
user 0m0.020s
sys 0m0.040s

After:

real 0m0.246s
user 0m0.020s
sys 0m0.040s

It solves the exit_aio() issue, but if the app calls io_destroy(), then we are back to square one...

--
Jens Axboe

diff --git a/fs/aio.c b/fs/aio.c
index f8e52a1854c1..73b0de46577b 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -805,18 +805,35 @@ EXPORT_SYMBOL(wait_on_sync_kiocb);
 void exit_aio(struct mm_struct *mm)
 {
 	struct kioctx_table *table = rcu_dereference_raw(mm->ioctx_table);
+	struct completion *comp = NULL;
 	int i;

 	if (!table)
 		return;

+	if (table->nr > 1) {
+		comp = kmalloc(table->nr * sizeof(struct completion),
+				GFP_KERNEL);
+		if (comp)
+			for (i = 0; i < table->nr; i++)
+				init_completion(&comp[i]);
+	}
+
 	for (i = 0; i < table->nr; ++i) {
 		struct kioctx *ctx = table->table[i];
 		struct completion requests_done =
 			COMPLETION_INITIALIZER_ONSTACK(requests_done);

-		if (!ctx)
+		/*
+		 * Complete it early, so the below wait_for_completion()
+		 * doesn't expect a complete() from the RCU callback
+		 */
+		if (!ctx) {
+			if (comp)
+				complete(&comp[i]);
 			continue;
+		}
+
 		/*
 		 * We don't need to bother with munmap() here - exit_mmap(mm)
 		 * is coming and it'll unmap everything. And we simply can't,
@@ -825,10 +842,20 @@ void exit_aio(struct mm_struct *mm)
 		 * that it needs to unmap the area, just set it to 0.
 		 */
 		ctx->mmap_size = 0;
-		kill_ioctx(mm, ctx, &requests_done);
+		if (comp)
+			kill_ioctx(mm, ctx, &comp[i]);
+		else {
+			kill_ioctx(mm, ctx, &requests_done);
+			wait_for_completion(&requests_done);
+		}
+	}

-		/* Wait until all IO for the context are done. */
-		wait_for_completion(&requests_done);
+	if (comp) {
+		for (i = 0; i < table->nr; i++) {
+			/* Wait until all IO for the context are done. */
+			wait_for_completion(&comp[i]);
+		}
+		kfree(comp);
 	}

 	RCU_INIT_POINTER(mm->ioctx_table, NULL);