Re: [syzbot] WARNING in c_start

From: Linus Torvalds
Date: Sun Oct 16 2022 - 13:53:25 EST


On Sat, Oct 15, 2022 at 6:12 PM Yury Norov <yury.norov@xxxxxxxxx> wrote:
>
> On Sun, Oct 16, 2022 at 09:24:57AM +0900, Tetsuo Handa wrote:
> >
> > Again, please do revert "cpumask: fix checking valid cpu range" immediately.
>
> The revert is already in Jakub's batch for -rc2, AFAIK.

Hmm.

I've looked at this, and at the discussion, and the various reports,
and my gut feel is that the problem is that the whole
"cpumask_check()" is completely bogus for all the "starting at bit X"
cases.

And I think it was wrong even before, and yes, I think the "+1"
simplification just made things worse.

I think that where it makes sense is in contexts where we actually
*use* the bit value as a bit, so cpumask_clear_cpu() doing

clear_bit(cpumask_check(cpu), cpumask_bits(dstp));

makes 100% sense and is unequivocally something that merits a warning.
An out-of-range cpu number is a serious bug in that context.

But all the cases where the function fundamentally already limits
things to the number of CPU's (with comments like "Returns >=
nr_cpu_ids if no further cpus unset.") should simply not have the
cpumask_check() at all.

All it results in just moving the onus of testing things into the
callers, or just makes for odd complications ("-1 is valid, because it
acts as the previous cpu for the beginning because we add one to get
the next CPU").

Anyway, since rc1 is fairly imminent, I will just revert it for now -
I don't want to have a pending revert wait until -rc2.

But I actually suspect that the thing we should really do is to just
remove the check entirely for these functions that are already defined
in terms of "if no more bits, return nr_cpu_ids". They already
basically return an error case, having them *warn* about the error
they are going to return is just obnoxious.

Linus