Re: [PATCH 0/2] introduce __next_thread(), change next_thread()
From: Oleg Nesterov
Date: Thu Aug 24 2023 - 11:49:50 EST
On 08/24, Linus Torvalds wrote:
>
> On Thu, 24 Aug 2023 at 07:32, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > After document-while_each_thread-change-first_tid-to-use-for_each_thread.patch
> > in mm tree + this series
>
> Looking at your patch 2/2, I started looking at users ("Maybe we
> *want* NULL for the end case, and make next_thread() and __next_thread
> be the same?").
Yes, but see below.
> One of the main users is while_each_thread(), which certainly wants
> that NULL case, both for an easier loop condition,
No. Please note that, say,
	do {
		do_something(t);
	} while_each_thread(current, t);
differs from for_each_thread() in that it loops starting from current,
not current->group_leader. I guess in most cases the order doesn't matter,
and I am going to audit the users and change them to use
for_each_thread() when possible.
Or,
	while_each_thread(current, t)
		do_something(t);
means do_something for every thread except current. And this has a
couple of valid users (say, zap_other_threads), but perhaps we can
change them too.
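
To make the difference between these loop shapes concrete, here is a
userspace sketch. It is not kernel code: struct toy_thread, the toy_*
helpers and the tid values are hypothetical stand-ins, the thread group
is modelled as a plain circular list with no RCU or locking, and the way
the toy for_each loop terminates is a choice of the model only. It shows
just the visiting order and which threads are visited.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a thread group: a circular singly linked list. */
struct toy_thread {
	int tid;
	struct toy_thread *next;	/* last thread points back to the leader */
	struct toy_thread *leader;	/* stand-in for ->group_leader */
};

/* Toy analogue of next_thread(): wraps around, never returns NULL. */
static struct toy_thread *toy_next_thread(struct toy_thread *t)
{
	return t->next;
}

/* Toy analogue of while_each_thread(g, t): stop when we are back at g. */
#define toy_while_each_thread(g, t) \
	while (((t) = toy_next_thread(t)) != (g))

/* Toy analogue of for_each_thread(p, t): always start at the leader. */
#define toy_for_each_thread(p, t) \
	for ((t) = (p)->leader; (t) != NULL; \
	     (t) = ((t)->next == (p)->leader) ? NULL : (t)->next)

/* Link n threads into one group; threads[0] is the leader. */
static void toy_make_group(struct toy_thread *threads, int n)
{
	assert(n > 0);
	for (int i = 0; i < n; i++) {
		threads[i].tid = 100 + i;
		threads[i].next = &threads[(i + 1) % n];
		threads[i].leader = &threads[0];
	}
}

/* do { } while_each_thread(): visits cur first, then all the others. */
static int toy_visit_do_while(struct toy_thread *cur, int *out)
{
	struct toy_thread *t = cur;
	int n = 0;

	do {
		out[n++] = t->tid;
	} toy_while_each_thread(cur, t);
	return n;
}

/* Plain while_each_thread(): visits every thread except cur. */
static int toy_visit_others(struct toy_thread *cur, int *out)
{
	struct toy_thread *t = cur;
	int n = 0;

	toy_while_each_thread(cur, t)
		out[n++] = t->tid;
	return n;
}

/* for_each_thread(): visits every thread, in order from the leader. */
static int toy_visit_for_each(struct toy_thread *cur, int *out)
{
	struct toy_thread *t;
	int n = 0;

	toy_for_each_thread(cur, t)
		out[n++] = t->tid;
	return n;
}
```

With a three-thread group and cur being the second thread, the do/while
form visits cur, then the third thread, then the leader; the plain while
form visits the same threads minus cur; the for_each form visits all
three starting from the leader. So converting a user is only safe when
it does not depend on the starting point or on current being skipped.
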
> but also because
> the only user that uses the 't' pointer after the loop is
> fs/proc/base.c, which wants it to be NULL.
Do you mean first_tid()? Not only is it the only user that uses
the 't' pointer after the loop, it is the only user of lockless
while_each_thread() which (in general) is NOT rcu-safe.
But I have already changed it to use for_each_thread(), see
https://lore.kernel.org/all/20230823170806.GA11724@xxxxxxxxxx/
This is
document-while_each_thread-change-first_tid-to-use-for_each_thread.patch
in mm tree.
> And kernel/bpf/task_iter.c seems to *expect* NULL at the end?
Yes! I think the same and I even documented this in 1/2.
To me this code looks simply wrong, but so far I don't understand
it enough. Currently I am trying to push the initial cleanups into
this code. See the
https://lore.kernel.org/all/20230821150909.GA2431@xxxxxxxxxx/
thread.
> End result: if you're changing next_thread() anyway, please just
> change it to be a completely new thing that returns NULL at the end,
See above.
I'd prefer to audit/change the current users of while_each_thread()
and next_thread(), then (perhaps) kill while_each_thread() and/or
next_thread().
Oleg.