Re: [PATCH 2/2] mm/kmemleak: Fix UAF bug in kmemleak_scan()

From: Waiman Long
Date: Wed Dec 14 2022 - 10:57:33 EST


On 12/14/22 06:16, Catalin Marinas wrote:
On Sat, Dec 10, 2022 at 06:00:48PM -0500, Waiman Long wrote:
Commit 6edda04ccc7c ("mm/kmemleak: prevent soft lockup in first
object iteration loop of kmemleak_scan()") fixes a soft lockup problem
in kmemleak_scan() by periodically doing a cond_resched(). It does
take a reference of the current object before doing it. Unfortunately,
if the object has been deleted from the object_list, the next object
pointed to by its next pointer may no longer be valid after coming
back from cond_resched(). This can result in use-after-free and other
nasty problems.
Ah, kmemleak_cond_resched() releases the rcu lock, so using
list_for_each_entry_rcu() doesn't help.

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 8c44f70ed457..d3a8fa4e3af3 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1465,15 +1465,26 @@ static void scan_gray_list(void)
  * that the given object won't go away without RCU read lock by performing a
  * get_object() if necessaary.
  */
-static void kmemleak_cond_resched(struct kmemleak_object *object)
+static void kmemleak_cond_resched(struct kmemleak_object **pobject)
 {
-	if (!get_object(object))
+	struct kmemleak_object *obj = *pobject;
+
+	if (!(obj->flags & OBJECT_ALLOCATED) || !get_object(obj))
 		return;	/* Try next object */
I don't think we can rely on obj->flags without holding obj->lock. We do
have a few WARN_ON() checks without the lock but in all other places the
lock should be held.

Good point. It is just an optimistic check, and it is OK for it to be wrong. I think I may need to use the data_race() macro to signify that a race can happen there and that it is harmless.
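
Roughly what I have in mind, as a userspace sketch only (not the actual patch). The model_* names and the simplified struct are made up for illustration, and data_race() is stubbed out here; in the kernel it is the KCSAN annotation for an intentional race. The point is that the lockless flag test is only a hint, and get_object() remains the authoritative check.

```c
#include <stdbool.h>

/*
 * Stand-in for the kernel macro of the same name. In the kernel it marks an
 * intentional data race for KCSAN; in this sketch it only evaluates expr.
 */
#define data_race(expr) (expr)

#define OBJECT_ALLOCATED (1U << 0)

/* Simplified, hypothetical model of struct kmemleak_object. */
struct model_object {
	unsigned int flags;
	int use_count;
};

/* Models get_object(): take a reference unless the count already hit zero. */
static bool model_get_object(struct model_object *obj)
{
	if (obj->use_count == 0)
		return false;
	obj->use_count++;
	return true;
}

/*
 * Optimistic check without the object lock, followed by the authoritative
 * reference grab. A stale flag value only costs a skipped or extra attempt.
 */
static bool model_try_pin(struct model_object *obj)
{
	if (!(data_race(obj->flags) & OBJECT_ALLOCATED))
		return false;	/* looks freed: try next object */
	return model_get_object(obj);
}
```

If the flag read is stale, the worst case in this model is that an object is skipped for one resched point or that get_object() fails, both of which the caller already handles.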


Another potential issue with re-scanning is that the loop may never
complete if it always goes from the beginning. Yet another problem with
restarting is that we may count references to an object multiple times
and get more false negatives.

I'd keep the OBJECT_ALLOCATED logic in the main kmemleak_scan() loop and
retake the object->lock if cond_resched() was called
(kmemleak_need_resched() returning true), check if it was freed and
restart the loop. We could add a new OBJECT_SCANNED flag so that we
skip such objects if we restarted the loop. The flag is reset during
list preparation.
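
The restart-with-OBJECT_SCANNED idea above can be modelled in userspace roughly as follows. This is a sketch under assumptions, not kernel code: the array stands in for object_list, OBJECT_SCANNED is the proposed (not yet existing) flag, the bit values are arbitrary, and the "freed during cond_resched()" event is simulated at a chosen index.

```c
#include <stddef.h>

#define OBJECT_ALLOCATED (1U << 0)
#define OBJECT_SCANNED   (1U << 1)	/* hypothetical new flag */

/* Simplified, hypothetical model of an object on object_list. */
struct model_object {
	unsigned int flags;
	int visits;	/* how many times the loop body processed this object */
};

/*
 * Walk objs[0..n). Right after processing index free_at, simulate a
 * cond_resched() window in which that object is freed, which forces one
 * restart from the head. Returns the number of restarts.
 */
static int model_scan(struct model_object *objs, size_t n, size_t free_at)
{
	int restarts = 0;
	int freed_once = 0;
	size_t i;

	/* Reset the flag before the walk (stands in for list preparation). */
	for (i = 0; i < n; i++)
		objs[i].flags &= ~OBJECT_SCANNED;

restart:
	for (i = 0; i < n; i++) {
		struct model_object *obj = &objs[i];

		/* Skip freed objects and anything handled before a restart. */
		if (!(obj->flags & OBJECT_ALLOCATED) ||
		    (obj->flags & OBJECT_SCANNED))
			continue;

		obj->visits++;
		obj->flags |= OBJECT_SCANNED;

		if (i == free_at && !freed_once) {
			/* Simulated: object freed while we were rescheduled. */
			freed_once = 1;
			obj->flags &= ~OBJECT_ALLOCATED;
			restarts++;
			goto restart;
		}
	}
	return restarts;
}
```

In this model every restart is preceded by marking at least one more object as scanned, so the walk makes forward progress and no object is processed twice.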

I wonder whether we actually need the cond_resched() in the first loop.
It does take a lot of locks but it doesn't scan the objects. I had a
patch around to remove the fine-grained locking in favour of the big
kmemleak_lock, it would make this loop faster (not sure what happened to
that patch, I need to dig it out).

Thanks for the review. Another way to handle that is to add an OBJECT_ANCHORED flag to indicate that this object shouldn't be deleted from the object list yet. Maybe also an OBJECT_DELETE_PENDING flag, set by another function that wants to delete this object, so that kmemleak_cond_resched() will delete it after returning from cond_resched(). All these checks and flag settings will be done with the object lock held. What do you think?
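
To make the anchoring idea concrete, here is a small userspace model. It is only a sketch: both flags are proposals, the bit values and model_* helpers are invented for illustration, and the object lock that would protect every flag access is indicated only in comments.

```c
#include <stdbool.h>

#define OBJECT_ANCHORED       (1U << 0)	/* hypothetical flag */
#define OBJECT_DELETE_PENDING (1U << 1)	/* hypothetical flag */

/* Simplified, hypothetical model of an object on object_list. */
struct model_object {
	unsigned int flags;
	bool on_list;
};

/* Scanner side, before cond_resched(); object lock would be held. */
static void model_anchor(struct model_object *obj)
{
	obj->flags |= OBJECT_ANCHORED;
}

/*
 * Deleter side; object lock would be held. Returns true if the object was
 * unlinked now, false if the unlink was deferred to the anchoring scanner.
 */
static bool model_delete(struct model_object *obj)
{
	if (obj->flags & OBJECT_ANCHORED) {
		obj->flags |= OBJECT_DELETE_PENDING;
		return false;
	}
	obj->on_list = false;
	return true;
}

/*
 * Scanner side, after cond_resched(); object lock would be held. Drops the
 * anchor and performs any deferred unlink. Returns true if the object is
 * still on the list. A real version would read the next pointer before
 * unlinking so the walk can continue safely.
 */
static bool model_unanchor(struct model_object *obj)
{
	obj->flags &= ~OBJECT_ANCHORED;
	if (obj->flags & OBJECT_DELETE_PENDING) {
		obj->flags &= ~OBJECT_DELETE_PENDING;
		obj->on_list = false;
	}
	return obj->on_list;
}
```

With this, the anchored object stays linked across cond_resched(), so its next pointer is still usable when the scanner comes back, and the deferred delete happens in one place.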

Cheers,
Longman