Re: [GIT PULL] percpu fixes for v4.14-rc3

From: Tejun Heo
Date: Tue Oct 03 2017 - 14:33:06 EST

Hello, Linus.

On Tue, Oct 03, 2017 at 10:27:42AM -0700, Linus Torvalds wrote:
> On Tue, Oct 3, 2017 at 6:26 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> >
> > * Mark noticed that the generic implementations of percpu local atomic
> > reads aren't properly protected against irqs and there's a (slim)
> > chance for split reads on some 32bit systems.
> Grr.
> Do we really want to support 64-bit percpu operations on 32-bit architectures?

I don't know. However, AFAICS, no 32bit arch provides 64bit
optimizations and we've been doing the same irq-disable protection on
all !read percpu ops, so bringing reads in line is at least the
immediately right thing to do.

> It does kind of break the whole point of percpu operations, and I
> would like to point out that I find things like
> static DEFINE_PER_CPU(u64, running_sample_length);
> which is preceded by a comment that talks about how this is accessed
> from critical code and explicitly mentions NMI's.
> So protection them against interrupts isn't actually going to *fix* anything.

NMI users would have to do their own protection (running_sample_length
seems to be accessed only from NMI context) but there are quite a few
existing irq users which were risking a very slight chance of doing
corrupt split reads.

> Doing a
> git grep DEFINE_PER_CPU.*64
> isn't likely to find everything, but maybe it's a representative
> sample. There aren't that many of those things, and some of them are
> very much ok (ie only 64-bit architectures, or explicitly using
> "atomic64_t" to avoid access issues)
> I dunno. I have pulled you change, but it does make me go "people are
> doing something wrong".
> Maybe we could just aim to disallow everything but CPU-native accesses?

Here are a couple points to consider.

* On a lot of archs, most of percpu operations need to be protected
explicitly anyway regardless of size. This patch shifts things a
bit worse but not drastically. IOW, removing 64bit support on 32bit
isn't gonna remove most of explicit context protections.

* Using 64bit percpu ops is a bit of cop-out, where we trade off some
overhead on 32bit for performance / simplicity on 64bit, which
doesn't seem too different from what we do when we use explicit
64bit variables in general. And we do the latter quite a bit.

The question is whether we want to force percpu users to explicitly
worry about 32bit machines and shape the code accordingly, which can
possibly incur overhead / complexity on 64bit while resulting in
better code on 32bit.

Given that we need explicit protections on most operations anyway, I
lean towards keeping it. I don't think removing 64-on-32 support will
buy us anything noticeable enough to justify the inconvenience.