RE: [PATCH] zcache: avoid AB-BA deadlock condition

From: Dan Magenheimer
Date: Mon Feb 27 2012 - 12:24:03 EST


> From: Andrea Righi [mailto:andrea@xxxxxxxxxxxxxxx]
> Sent: Monday, February 20, 2012 5:12 AM
> To: Greg Kroah-Hartman
> Cc: Dan Magenheimer; Seth Jennings; devel@xxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> stable@xxxxxxxxxx
> Subject: [PATCH] zcache: avoid AB-BA deadlock condition
>
> Commit 9256a47 fixed a deadlock condition, being sure that the buddy
> list spinlock is always taken before the page spinlock.
>
> However in zbud_free_and_delist() locking order is the opposite
> (page lock -> list lock).
>
> Possible unsafe locking scenario (reported by lockdep):
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&(&zbpg->lock)->rlock);
>                                lock(zbud_budlists_spinlock);
>                                lock(&(&zbpg->lock)->rlock);
>   lock(zbud_budlists_spinlock);
>
> Fix by grabbing the locks in opposite order in zbud_free_and_delist().
>
> Signed-off-by: Andrea Righi <andrea@xxxxxxxxxxxxxxx>

Acked-by: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>

Thanks for catching this Andrea! (And thanks also to
Alex Vallacis-Lasso for independently reporting and testing:
http://permalink.gmane.org/gmane.linux.kernel/1257214 )

Greg, this patch could be targeted for 3.3-rc6 and 3.2-stable.
AFAIK, nobody has actually experienced a deadlock from this, so
if Linus has the screws down tight for -rc6, it could wait
until the 3.4 window.
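
For anyone following the thread, the ordering rule the patch restores can be sketched in userspace. This is only an illustrative analogue using pthread mutexes, not the zcache code: list_lock, struct page, delist_path(), evict_path() and run_demo() are hypothetical names standing in for zbud_budlists_spinlock, zbpg->lock and the two kernel paths. The point is that every path takes the global list lock before the per-page lock, so the AB-BA interleaving reported by lockdep cannot occur.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ITERATIONS 100000

/* Hypothetical stand-in for the global zbud_budlists_spinlock. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for struct zbud_page and its per-page lock. */
struct page {
	pthread_mutex_t lock;
	int on_list;
};

static struct page demo_page = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.on_list = 1,
};

static long ops_done;	/* protected by list_lock */

/* Path A: list lock first, then page lock. */
static void *delist_path(void *arg)
{
	struct page *pg = arg;

	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&list_lock);
		pthread_mutex_lock(&pg->lock);
		pg->on_list = 0;
		ops_done++;
		pthread_mutex_unlock(&pg->lock);
		pthread_mutex_unlock(&list_lock);
	}
	return NULL;
}

/* Path B: same order as path A, so no AB-BA cycle is possible. */
static void *evict_path(void *arg)
{
	struct page *pg = arg;

	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&list_lock);
		pthread_mutex_lock(&pg->lock);
		pg->on_list = 1;
		ops_done++;
		pthread_mutex_unlock(&pg->lock);
		pthread_mutex_unlock(&list_lock);
	}
	return NULL;
}

/* Run both paths concurrently; returns the number of completed operations. */
static long run_demo(void)
{
	pthread_t a, b;
	int rc;

	ops_done = 0;
	rc = pthread_create(&a, NULL, delist_path, &demo_page);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, evict_path, &demo_page);
	assert(rc == 0);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	(void)rc;
	return ops_done;
}
```

If one of the two paths instead took pg->lock before list_lock, the two threads could each hold one lock while waiting for the other, which is the scenario in the lockdep report.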

> ---
> drivers/staging/zcache/zcache-main.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
> index ef7c52b..dce04be 100644
> --- a/drivers/staging/zcache/zcache-main.c
> +++ b/drivers/staging/zcache/zcache-main.c
> @@ -299,10 +299,12 @@ static void zbud_free_and_delist(struct zbud_hdr *zh)
>  	struct zbud_page *zbpg =
>  		container_of(zh, struct zbud_page, buddy[budnum]);
>
> +	spin_lock(&zbud_budlists_spinlock);
>  	spin_lock(&zbpg->lock);
>  	if (list_empty(&zbpg->bud_list)) {
>  		/* ignore zombie page... see zbud_evict_pages() */
>  		spin_unlock(&zbpg->lock);
> +		spin_unlock(&zbud_budlists_spinlock);
>  		return;
>  	}
>  	size = zbud_free(zh);
> @@ -310,7 +312,6 @@ static void zbud_free_and_delist(struct zbud_hdr *zh)
>  	zh_other = &zbpg->buddy[(budnum == 0) ? 1 : 0];
>  	if (zh_other->size == 0) { /* was unbuddied: unlist and free */
>  		chunks = zbud_size_to_chunks(size) ;
> -		spin_lock(&zbud_budlists_spinlock);
>  		BUG_ON(list_empty(&zbud_unbuddied[chunks].list));
>  		list_del_init(&zbpg->bud_list);
>  		zbud_unbuddied[chunks].count--;
> @@ -318,7 +319,6 @@ static void zbud_free_and_delist(struct zbud_hdr *zh)
>  		zbud_free_raw_page(zbpg);
>  	} else { /* was buddied: move remaining buddy to unbuddied list */
>  		chunks = zbud_size_to_chunks(zh_other->size) ;
> -		spin_lock(&zbud_budlists_spinlock);
>  		list_del_init(&zbpg->bud_list);
>  		zcache_zbud_buddied_count--;
>  		list_add_tail(&zbpg->bud_list, &zbud_unbuddied[chunks].list);
> --
> 1.7.5.4