[PATCH] make idr_remove_all() do removal -before- free_layer()

From: Paul E. McKenney
Date: Sat Mar 07 2009 - 17:25:27 EST


The following patch fixes a problem in the IDR system, where
idr_remove_all() hands a data element to call_rcu() (via free_layer())
before making that data element inaccessible to new readers. This
breaks the basic RCU rule that removal must precede reclamation: a
reader that starts after the call_rcu() can still pick up the element,
and can therefore still hold a reference to it when the grace period
ends and the element is freed. Tests on large machines that
concurrently map and unmap user-space memory within the same
multithreaded process crash within about five minutes. Applying this
patch increases the kernel's longevity to the three-to-eight-hour range.
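
To make the ordering requirement concrete, here is a minimal user-space
sketch in C11. It is not the kernel code, only an illustration under
stated assumptions: struct layer, top, publish(), reader_lookup(),
defer_free(), grace_period_end(), remove_all_fixed() and run_demo() are
all hypothetical names, and defer_free() merely records the pointer
rather than waiting for a real grace period the way call_rcu() does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an IDR layer (not the kernel's struct idr_layer). */
struct layer {
	int value;
};

/* Hypothetical published root pointer, playing the role of idp->top. */
static _Atomic(struct layer *) top;

/* Element waiting for the simulated grace period to end (at most one). */
static struct layer *pending_free;

/* Publish a new root; returns 0 on success, -1 if one exists or malloc fails. */
int publish(int value)
{
	struct layer *p;

	if (atomic_load_explicit(&top, memory_order_acquire))
		return -1;
	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->value = value;
	atomic_store_explicit(&top, p, memory_order_release);
	return 0;
}

/* Reader side: snapshot the root; NULL means nothing is published. */
struct layer *reader_lookup(void)
{
	return atomic_load_explicit(&top, memory_order_acquire);
}

/* Assumed stand-in for call_rcu(): only records the element for later. */
void defer_free(struct layer *p)
{
	assert(pending_free == NULL);
	pending_free = p;
}

/* Simulated end of grace period: now the element may really be freed. */
void grace_period_end(void)
{
	free(pending_free);
	pending_free = NULL;
}

/* Fixed ordering: unpublish first, and only then queue the deferred free. */
void remove_all_fixed(void)
{
	struct layer *p = atomic_exchange_explicit(&top, NULL,
						   memory_order_acq_rel);

	if (p)
		defer_free(p);
}

/* Walk through the sequence once; returns 0 on success. */
int run_demo(void)
{
	struct layer *seen;

	if (publish(42) != 0)
		return -1;
	seen = reader_lookup();		/* reader that started before removal */
	assert(seen && seen->value == 42);

	remove_all_fixed();
	assert(reader_lookup() == NULL);	/* new readers find nothing */
	assert(seen->value == 42);		/* old reference is still valid */

	grace_period_end();			/* memory is released only here */
	return 0;
}
```

The buggy ordering corresponds to calling defer_free() while top still
points at the element: any reader arriving between the two steps gets
a pointer that the grace period no longer protects.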

There appear to be other similar problems in idr_get_empty_slot() and
sub_remove(), but I fixed the easy one in idr_remove_all() first. It is
therefore no surprise that failures still occur.

(Yes, and I did look at the relevant patch last year without spotting
this one. Goes to show the value of testing as well as code review,
I guess...)

Nadia, Manfred, any thoughts?

Located-by: Milton Miller II <miltonm@xxxxxxxxxxxxxx>
Tested-by: Milton Miller II <miltonm@xxxxxxxxxxxxxx>
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
---

diff --git a/lib/idr.c b/lib/idr.c
index c11c576..dab4bca 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -449,6 +449,7 @@ void idr_remove_all(struct idr *idp)

 	n = idp->layers * IDR_BITS;
 	p = idp->top;
+	rcu_assign_pointer(idp->top, NULL);
 	max = 1 << n;

 	id = 0;
@@ -467,7 +468,6 @@ void idr_remove_all(struct idr *idp)
 			p = *--paa;
 		}
 	}
-	rcu_assign_pointer(idp->top, NULL);
 	idp->layers = 0;
 }
 EXPORT_SYMBOL(idr_remove_all);
