On Mon, Mar 10, 2014 at 04:15:33PM +0800, Tang Chen wrote:
IO ring page migration has been implemented by the following patch:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/aio.c?id=36bc08cc01709b4a9bb563b35aa530241ddc63e3
In this patch, ctx->completion_lock is used to prevent other processes
from accessing the ring page being migrated.
But aio_setup_ring(), ioctx_add_table() and aio_read_events_ring() do not
take ctx->completion_lock when writing to the ring page.
As a result, for example, we have the following problem:...
As above, the new ring page will not be updated.
The solution is to take ctx->completion_lock in thread 2, that is, in
aio_setup_ring(), ioctx_add_table() and aio_read_events_ring(), when
writing to ring pages.
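
For illustration only, here is a single-threaded userspace sketch of how an
update can be lost when a writer does not serialize against migration. It is
not kernel code and is not necessarily the interleaving from the elided
example above: struct model_ctx, the model_* helpers and RING_BYTES are all
hypothetical names, and the "lock" is modeled simply by running the two
critical sections back to back instead of interleaved.

```c
#include <assert.h>
#include <string.h>

#define RING_BYTES 16  /* hypothetical size, just for the model */

/* Hypothetical stand-in for the kioctx: only the ring page pointer. */
struct model_ctx {
	unsigned char *ring_page;
};

/* Migration step 1: copy the current page contents to the new page. */
static void model_migrate_copy(const struct model_ctx *ctx,
			       unsigned char *new_page)
{
	assert(new_page != NULL);
	memcpy(new_page, ctx->ring_page, RING_BYTES);
}

/* Migration step 2: switch the context over to the new page. */
static void model_migrate_switch(struct model_ctx *ctx,
				 unsigned char *new_page)
{
	ctx->ring_page = new_page;
}

/*
 * Unserialized case: the writer's store lands between the copy and the
 * switch, so it only reaches the old page. Returns the byte visible
 * through the context afterwards.
 */
static unsigned char model_unlocked_interleaving(void)
{
	unsigned char old_page[RING_BYTES] = {0};
	unsigned char new_page[RING_BYTES] = {0};
	struct model_ctx ctx = { .ring_page = old_page };
	unsigned char *writer_view = ctx.ring_page; /* writer looks up page */

	model_migrate_copy(&ctx, new_page);
	writer_view[0] = 42;                        /* goes to the old page */
	model_migrate_switch(&ctx, new_page);
	return ctx.ring_page[0];
}

/*
 * Serialized case: the writer's lookup and store complete as one unit
 * before migration starts, as they would if both sides held one lock.
 */
static unsigned char model_serialized(void)
{
	unsigned char old_page[RING_BYTES] = {0};
	unsigned char new_page[RING_BYTES] = {0};
	struct model_ctx ctx = { .ring_page = old_page };

	ctx.ring_page[0] = 42;                      /* writer section */
	model_migrate_copy(&ctx, new_page);         /* migration section */
	model_migrate_switch(&ctx, new_page);
	return ctx.ring_page[0];
}
```

In this model the unserialized variant leaves the new page without the
writer's update, while the serialized variant carries it across the copy.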
Upon review, there are still two accesses of ->ring_pages that are not
protected by any spinlock and could potentially race with migration. One
is in aio_setup_ring(), which can be easily resolved by moving the assignment
of ->ring_pages above the unlock_page(). The other is in
aio_read_events_ring(), where head and tail are fetched from the ring without
any locking. I also fear we'll be introducing new performance issues with
all the additional spinlock bouncing, despite the fact that it is only ever
needed for migration. I'm going to continue looking into this today and
will try to send out a followup to this email later.
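
As a rough userspace illustration of the head/tail case (not the kernel
code, and not a proposed patch), the sketch below reads both indices under
one lock so that they form a consistent pair. A pthread mutex stands in for
ctx->completion_lock, and struct model_ring, struct ring_snapshot and the
model_ring_* helpers are hypothetical names made up for this example.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical model of the ring header fields discussed above. */
struct model_ring {
	pthread_mutex_t lock;  /* stands in for ctx->completion_lock */
	unsigned head;
	unsigned tail;
	unsigned nr;           /* number of slots in the ring */
};

struct ring_snapshot {
	unsigned head;
	unsigned tail;
};

/* Fetch head and tail together while holding the lock. */
static struct ring_snapshot model_ring_snapshot(struct model_ring *r)
{
	struct ring_snapshot s;

	pthread_mutex_lock(&r->lock);
	s.head = r->head;
	s.tail = r->tail;
	pthread_mutex_unlock(&r->lock);
	return s;
}

/* Number of entries between head and tail, allowing for wrap-around. */
static unsigned model_ring_pending(const struct ring_snapshot *s,
				   unsigned nr)
{
	assert(nr > 0 && s->head < nr && s->tail < nr);
	return (s->tail + nr - s->head) % nr;
}
```

Note that this puts a lock acquisition on every read of the indices, which
is exactly the kind of extra cost on the common path that the bouncing
concern above is about.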
--
-ben