Re: [RFC][PATCH 2/5] mm: Switch mod_state() to __this_cpu_read()

From: Christoph Lameter
Date: Tue Sep 20 2011 - 10:47:04 EST


On Mon, 19 Sep 2011, Steven Rostedt wrote:

> On Mon, 2011-09-19 at 17:02 -0500, Christoph Lameter wrote:
> > On Mon, 19 Sep 2011, Steven Rostedt wrote:
> >
> > > From: Steven Rostedt <srostedt@xxxxxxxxxx>
> > >
> > > The code in mod_state() is already made to handle the raciness of
> > > this_cpu_read(). Have the code use __this_cpu_read() instead so
> > > the debug code does not trigger warnings about it.
> >
> > Why would there be a warning triggered? this_cpu_read should take care of
> > disabling preemption for the read if needed. In fact the fallback case
> > does do exactly that.
>
> What does disabling preemption for just reading the cpu variable help
> anywhere? Once you read the variable, and enable preemption, you can
> migrate. Then that info you have could be useless.

The effects on just reading are not that severe. We may calculate the
address of the per cpu variable on one cpu and then be migrated to another
processor. Then there is an access to the per cpu area of another
processor which in many cases is harmless and will just cause the
cacheline to go from exclusive mode into shared mode on the other cpu. The
next write by the other cpu then brings it back to exclusive mode evicting
the cacheline from the cpu we migrated to.

The same conventions are also used for RMV instructions like this_cpu_inc.
In that case the situation is much more sever. Now we are locklessly
incrementing a counter in the per cpu area of another cpu. That is a race
condition that can corrupt the counter data.

> The point is, most places use this_cpu_read(). Adding a "__" or "raw_"
> prefix usually means that less checks are done. Think of the
> copy_from_user(). You have __copy_from_user() that does less checks.

Why do you need any checks there? this_cpu_xxx operations should take care
of preemption.

> Most of the per_cpu() and get_cpu_var() code was blindly replaced with
> the this_cpu_*() variants. The original code had preemption disabled
> checks. Now they don't. Which means if you use the this_cpu_read() and
> then make some decision on it, that read may be useless and buggy.
>
> We have smp_processor_id() that does the check (and the old per_cpu()
> and get_cpu_var()). Now we have this_cpu_read() which replaced most of
> the per_cpu() code in the kernel, making them vulnerable to bugs when
> preemption is enabled.

this_cpu_xx can only replace a get_cpu_var/per_cpu section where a single
word is read or updated. If multiple per cpu variables are accessed and
modified in one go then it is better to keep disabling preemption for
the whole section and use the __this_cpu_xx operations.

If there are cases where this was not done correctly then lets fix those.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/