Re: [RFC PATCH v15 2/7] locking/mutex: Rework task_struct::blocked_on

From: Masami Hiramatsu (Google)
Date: Tue Mar 18 2025 - 10:11:27 EST


On Thu, 13 Mar 2025 23:12:57 -0700
John Stultz <jstultz@xxxxxxxxxx> wrote:

> On Thu, Mar 13, 2025 at 3:14 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > FYI, this is useful for Masami's "hung task" work that will show what
> > tasks a hung task is blocked on in a crash report.
> >
> > https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> >
>
> Ah. Indeed, we have similar use cases. There's some slight difference
> in when we consider the task blocked, especially in this early patch
> (as waking tasks mark us as unblocked so we can be selected to run).
> But later on in the series (in the portions I've not yet submitted
> here) when the blocked_on_state has been introduced, the blocked_on
> value approximates to about the same spot as used here.

Interesting. Can you also track tasks that take other locks, such as
rwsems or semaphores? Lance is also working on extending this to
support semaphores.

https://lore.kernel.org/all/20250314144300.32542-1-ioworker0@xxxxxxxxx/

Please add them for the next version.

>
> So I should be able to unify these. It looks like Masami's patch is
> close to being queued, so maybe I'll incorporate it into my series and
> rework my set ontop. Any objections to this?

No objections :) Please Cc me.


BTW, I had a chat with Suleiman, and he suggested that I expand
this idea to record which locks each task holds. Then we can search
for all tasks that are holding a given lock. Something like:

struct task_struct {
	unsigned long blocking_on;
	unsigned long holding_locks[HOLDING_LOCK_MAX];
	unsigned int holding_idx;
};

lock(lock_addr) {
	if (succeeded_to_lock) {
		current->holding_locks[current->holding_idx++] = lock_addr;
	} else {
		record_blocking_on(current, lock_addr);
		wait_for_lock();
		clear_blocking_on(current, lock_addr);
	}
}

unlock(lock_addr) {
	/* Note: this assumes locks are released in reverse order. */
	current->holding_locks[--current->holding_idx] = 0UL;
}

And when we find a hung task, we call dump_blocker() like this:

dump_blocker() {
	lock_addr = hung_task->blocking_on;
	for_each_task(task) {
		if (find_lock(task->holding_locks, lock_addr)) {
			dump_task(task);
			/* semaphore, rwsem will need to dump all holders. */
			if (lock is mutex)
				break;
		}
	}
}

This may be too much, but it is an interesting idea for finding
semaphore-type blockers.
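
To make the idea concrete, here is a minimal user-space model of the
holder tracking and the blocker search. It is only a sketch, not kernel
code: struct task, record_holding(), clear_holding(), the is_mutex
argument, and the value of HOLDING_LOCK_MAX are all hypothetical names
and choices made for illustration. clear_holding() removes an entry by
address instead of popping the last slot, which is one possible way to
cope with locks released out of order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define HOLDING_LOCK_MAX 8	/* assumed size; not specified in the proposal */

/* Hypothetical stand-in for the relevant task_struct fields. */
struct task {
	const char *comm;
	unsigned long blocking_on;
	unsigned long holding_locks[HOLDING_LOCK_MAX];
	unsigned int holding_idx;
};

/* Record a successfully taken lock; returns false if the table is full. */
static bool record_holding(struct task *t, unsigned long lock_addr)
{
	if (t->holding_idx >= HOLDING_LOCK_MAX)
		return false;	/* this hold goes untracked */
	t->holding_locks[t->holding_idx++] = lock_addr;
	return true;
}

/* Drop one entry by address, so release order does not matter. */
static bool clear_holding(struct task *t, unsigned long lock_addr)
{
	for (unsigned int i = 0; i < t->holding_idx; i++) {
		if (t->holding_locks[i] != lock_addr)
			continue;
		/* Move the last entry into the freed slot. */
		t->holding_locks[i] = t->holding_locks[--t->holding_idx];
		t->holding_locks[t->holding_idx] = 0UL;
		return true;
	}
	return false;
}

static bool find_lock(const struct task *t, unsigned long lock_addr)
{
	for (unsigned int i = 0; i < t->holding_idx; i++)
		if (t->holding_locks[i] == lock_addr)
			return true;
	return false;
}

/*
 * Print every task holding the lock that 'hung' is blocked on and
 * return how many were found. A mutex has one owner, so stop early.
 */
static unsigned int dump_blocker(const struct task *tasks, size_t nr,
				 const struct task *hung, bool is_mutex)
{
	unsigned long lock_addr;
	unsigned int found = 0;

	assert(hung != NULL);
	lock_addr = hung->blocking_on;
	if (!lock_addr)
		return 0;	/* not blocked on a tracked lock */

	for (size_t i = 0; i < nr; i++) {
		if (!find_lock(&tasks[i], lock_addr))
			continue;
		printf("blocker: %s\n", tasks[i].comm);
		found++;
		if (is_mutex)
			break;
	}
	return found;
}
```

In this model the search costs O(tasks * HOLDING_LOCK_MAX) per hung
task, which should be acceptable for a crash-report path but is the
sort of overhead the "too much" concern above refers to.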

Thank you,

>
> thanks
> -john


--
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>