Re: [PATCH 2/2] mm,memory_hotplug: {READ,WRITE}_ONCE unsynchronized zone data

From: Brendan Jackman
Date: Fri May 31 2024 - 12:41:17 EST


On Wed, May 22, 2024 at 02:11:16PM +0000, Brendan Jackman wrote:
> On Wed, May 22, 2024 at 04:05:12PM +0200, David Hildenbrand wrote:
> > On 21.05.24 14:57, Brendan Jackman wrote:
> > > + return zone->zone_start_pfn + READ_ONCE(zone->spanned_pages);
> >
> > It's weird to apply that logic only to spanned_pages, whereby zone_start_pfn
> > can (and will) similarly change when onlining/offlining memory.
> >
> Oh, yep. For some reason I had decided that zone_start_pfn was fixed
> but that is (actually very obviously) not true!
>
> Will take a closer look and extend v2 to cover that too, unless
> someone finds a reason this whole patch is nonsense.
>
> Thanks for the review.

Hmm so while poking around during spare moments this week I learned
that compaction.c also stores a bunch of data in struct zone that is
unsynchronized.

It seems pretty unlikely that you can corrupt any memory there (unless
there's some race possible with pfn_to_online_page, which is an
orthogonal question), but it does seem like if the compiler gets smart
with us we could maybe have a compaction run that takes quasi-forever
or something weird like that.

It seems easy enough to just spam READ_ONCE/WRITE_ONCE everywhere
there too, this would remove that risk, make KCSAN happy and serve as
a kinda "this is unsynchronized, take care" comment. (There's also at
least one place where we could put data_race()).

On the other hand it's a bit verbose & visually ugly. Personally I
think it's a pretty minor downside, but anyone feel differently?