Re: [PATCH] mm: fix a data race in put_page()

From: Marco Elver
Date: Fri Feb 07 2020 - 08:18:00 EST


On Fri, 7 Feb 2020 at 01:55, Qian Cai <cai@xxxxxx> wrote:
>
> > On Feb 6, 2020, at 7:27 PM, John Hubbard <jhubbard@xxxxxxxxxx> wrote:
> >
> > On 2/6/20 4:18 PM, Qian Cai wrote:
> >>> On Feb 6, 2020, at 6:34 PM, John Hubbard <jhubbard@xxxxxxxxxx> wrote:
> >>> On 2/6/20 7:23 AM, Qian Cai wrote:
> >>>>> On Feb 6, 2020, at 9:55 AM, Jan Kara <jack@xxxxxxx> wrote:
> >>>>> I don't think the problem is real. The question is how to make KCSAN happy
> >>>>> in a way that doesn't silence other possibly useful things it can find and
> >>>>> also which makes it most obvious to the reader what's going on... IMHO
> >>>>> using READ_ONCE() fulfills these targets nicely - it is free
> >>>>> performance-wise in this case, it silences the checker without impacting
> >>>>> other races on page->flags, its kind of obvious we don't want the load torn
> >>>>> in this case so it makes sense to the reader (although a comment may be
> >>>>> nice).
> >>>>
> >>>> Actually, use the data_race() macro there fulfilling the same purpose too, i.e, silence the splat here but still keep searching for other races.
> >>>>
> >>>
> >>> Yes, but both READ_ONCE() and data_race() would be saying untrue things about this code,
> >>> and that somewhat offends my sense of perfection... :)
> >>>
> >>> * READ_ONCE(): this field need not be restricted to being read only once, so the
> >>> name is immediately wrong. We're using side effects of READ_ONCE().
> >>>
> >>> * data_race(): there is no race on the N bits worth of page zone number data. There
> >>> is only a perceived race, due to tools that look at word-level granularity.
> >>>
> >>> I'd propose one or both of the following:
> >>>
> >>> a) Hope that Marco (I've fixed the typo in his name. --jh) has an idea to enhance KCSAN so as to support this model of
> >>> access, and/or

>From the other thread:

On Fri, 7 Feb 2020 at 00:18, John Hubbard <jhubbard@xxxxxxxxxx> wrote:
>
> Yes. I'm grasping at straws now, but...what about the idiom that page_zonenum()
> uses: a set of bits that are "always" (after a certain early point) read-only?
> What are your thoughts on that?

Without annotations it's hard to tell. The problem is that the
compiler can still emit a word-sized load, even if you're just
checking 1 bit. The instrumentation emitted for KCSAN only cares about
loads/stores, where access size is in number of bytes and not bits,
since that's what the compiler has to emit. So, strictly speaking
these are data races: concurrent reads / writes where at least one
access is plain.

With the above caveat out of the way, we already have the following
defaults in KCSAN (after similar requests):
1. KCSAN ignores same-value stores, i.e. races with writes that appear
to write the same value do not result in data race reports.
2. KCSAN does not demand aligned writes (including things like 'var++'
if there are no concurrent writers) up to word size to be marked (with
READ_ONCE etc.), assuming there is no way for the compiler to screw
these up. [I still recommend writes to be marked though, if at all
possible, because I'm still not entirely convinced it's always safe!]

So, because of (2), KCSAN will not complain if you have something like
'flags |= SOME_FLAG' (where the read is marked). Because of (1), it'll
still complain about 'flags & SOME_FLAG' though, since the load is not
marked, and only sees this is a word-sized access (assuming flags is a
long) where a bit changed.

I don't think it's possible to easily convey to KCSAN which bits of an
access you only care about, so that we could extend (1). Ideas?

> >> A similar thing was brought up before, i.e., anything compared to zero is immune to load-tearing
> >> issues, but it is rather difficult to implement it in the compiler, so it was settled to use data_race(),
> >>
> >> https://lore.kernel.org/lkml/CANpmjNN8J1oWtLPHTgCwbbtTuU_Js-8HD=cozW5cYkm8h-GTBg@xxxxxxxxxxxxxx/#r
> >>
> >
> > Thanks for that link to the previous discussion, good context.
> >
> >>>
> >>> b) Add a new, better-named macro to indicate what's going on. Initial bikeshed-able
> >>> candidates:
> >>>
> >>> READ_RO_BITS()
> >>> READ_IMMUTABLE_BITS()
> >>> ...etc...
> >>>

This could work, but 'READ_BITS()' is enough, if KCSAN's same-value
filter is default on anyway. Although my preference is also to avoid
more macros if possible.

> >> Actually, Linus might hate those kinds of complication rather than a simple data_race() macro,
> >>
> >> https://lore.kernel.org/linux-fsdevel/CAHk-=wg5CkOEF8DTez1Qu0XTEFw_oHhxN98bDnFqbY7HL5AB2g@xxxxxxxxxxxxxx/
> >>
> >
> > Another good link. However, my macros above haven't been proposed yet, and I'm perfectly
> > comfortable proposing something that Linus *might* (or might not!) hate. No point in
> > guessing about it, IMHO.
> >
> > If you want, I'll be happy to put on my flame suit and post a patchset proposing
> > READ_IMMUTABLE_BITS() (or a better-named thing, if someone has another name idea). :)
> >
>
> BTW, the current comment said (note, it is called âaccessâ which in this case it does read the whole word
> rather than those 3 bits, even though it is only those bits are of interested for us),
>
> /*
> * data_race(): macro to document that accesses in an expression may conflict with
> * other concurrent accesses resulting in data races, but the resulting
> * behaviour is deemed safe regardless.
> *
> * This macro *does not* affect normal code generation, but is a hint to tooling
> * that data races here should be ignored.
> */
>
> Macro might have more to say.

I agree that data_race() would be the most suitable here, since
technically it's still a data race. It just so happens that we end up
"throwing away" most of the bits of the access, and just care about a
few bits.

Thanks,
-- Marco