Re: [PATCH 3/3] mm/kmemleak: Prevent soft lockup in first object iteration loop of kmemleak_scan()

From: Waiman Long
Date: Tue Jun 14 2022 - 14:29:25 EST



On 6/14/22 14:22, Waiman Long wrote:
On 6/14/22 13:27, Catalin Marinas wrote:

raw_spin_unlock_irq(&object->lock);
+
+        /*
+         * With object pinned by a positive reference count, it
+         * won't go away and we can safely release the RCU read
+         * lock and do a cond_resched() to avoid soft lockup every
+         * 64k objects.
+         */
+        if (object_pinned && !(gray_list_cnt & 0xffff)) {
+            rcu_read_unlock();
+            cond_resched();
+            rcu_read_lock();
+        }
I'm not sure this gains much. There should be very few gray objects
initially (those passed to kmemleak_not_leak() for example). The
majority should be white objects.

If we drop the fine-grained object->lock, we could instead take
kmemleak_lock outside the loop with a cond_resched_lock(&kmemleak_lock)
within the loop. I think we can get away with not having an
rcu_read_lock() at all for list traversal with the big lock outside the
loop.
Actually this doesn't work is the current object in the iteration is
freed. Does list_for_each_rcu_safe() help?

list_for_each_rcu_safe() helps if we are worrying about object being freed. However, it won't help if object->next is freed instead.

How about something like:

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 7dd64139a7c7..fd836e43cb16 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1417,12 +1417,16 @@ static void kmemleak_scan(void)
        struct zone *zone;
        int __maybe_unused i;
        int new_leaks = 0;
+       int loop1_cnt = 0;

        jiffies_last_scan = jiffies;

        /* prepare the kmemleak_object's */
        rcu_read_lock();
        list_for_each_entry_rcu(object, &object_list, object_list) {
+               bool obj_pinned = false;
+
+               loop1_cnt++;
                raw_spin_lock_irq(&object->lock);
 #ifdef DEBUG
                /*
@@ -1437,10 +1441,32 @@ static void kmemleak_scan(void)
 #endif
                /* reset the reference count (whiten the object) */
                object->count = 0;
-               if (color_gray(object) && get_object(object))
+               if (color_gray(object) && get_object(object)) {
                        list_add_tail(&object->gray_list, &gray_list);
+                       obj_pinned = true;
+               }

                raw_spin_unlock_irq(&object->lock);
+
+               /*
+                * Do a cond_resched() to avoid soft lockup every 64k objects.
+                * Make sure a reference has been taken so that the object
+                * won't go away without RCU read lock.
+                */
+               if (loop1_cnt & 0xffff) {


Sorry, should be "(!(loop1_cnt & 0xffff))".

+ if (!obj_pinned && !get_object(object)) {
+                               /* Try the next object instead */
+                               loop1_cnt--;
+                               continue;
+                       }
+
+                       rcu_read_unlock();
+                       cond_resched();
+                       rcu_read_lock();
+
+                       if (!obj_pinned)
+                               put_object(object);
+               }
        }
        rcu_read_unlock();

Cheers,
Longman