Re: BUG_ON(!cpus_equal(cpumask, tmp));

From: Martin J. Bligh
Date: Tue Mar 30 2004 - 20:27:34 EST


--On Tuesday, March 30, 2004 17:11:04 -0800 Andrew Morton <akpm@xxxxxxxx> wrote:

> "Martin J. Bligh" <mbligh@xxxxxxxxxxx> wrote:
>>
>> I made a similar patch, but I don't see how we can really fix it without
>> providing locking on cpu_online_map.
>
> Are we missing something here?
>
> Why does, for example, smp_send_reschedule() not have the same problem?
> Because we've gone around and correctly removed all references to the CPU
> from the scheduler data structures before offlining it.
>
> But we're not doing that in the mm code, right? Should we not be taking
> mmlist_lock and running around knocking this CPU out of everyone's
> cpu_vm_mask before offlining it?

I think we're assuming that we don't have to because the problem is fixed
by the "cpus_and(tmp, cpumask, cpu_online_map)" in flush_tlb_others so we
don't have to. Except it's racy, and doesn't work.

It would seem to me that your suggestion would fix it. But isn't locking
cpu_online_map both simpler and (most importantly) more generic? I can't
imagine that we don't use this elsewhere ... suppose for instance we took
a timer interrupt, causing a scheduler rebalance, and moved a process to
an offline CPU at that point? Isn't any user of smp_call_function also racy?

Not locking it just seems fundamentally dangerous to me ... maybe I'm
wrong though.

M.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/