Re: [PATCH 1/8] mm: use zone lock guard in reserve_highatomic_pageblock()
From: Dmitry Ilvokhin
Date: Tue Apr 28 2026 - 07:09:19 EST
On Mon, Mar 09, 2026 at 05:45:16PM +0100, Peter Zijlstra wrote:
> On Sat, Mar 07, 2026 at 02:09:41PM +0000, Dmitry Ilvokhin wrote:
> > On Sat, Mar 07, 2026 at 02:16:41PM +0100, Peter Zijlstra wrote:
> > > On Fri, Mar 06, 2026 at 07:24:56PM +0100, Vlastimil Babka wrote:
> > >
> > > > Yeah I don't think the guard construct in this case should be doing anything
> > > > here that wouldn't allow the compiler to compile to the exactly same result
> > > > as before? Either there's some problem with the infra, or we're just victim
> > > > of compiler heuristics. In both cases imho worth looking into rather than
> > > > rejecting the construct.
> > >
> > > I'd love to look into it, but I can't seem to apply these patches to
> > > anything.
> > >
> > > By virtue of not actually having the patches, I had to resort to b4, and
> > > I think the incantation is something like:
> > >
> > > b4 shazam cover.1772811429.git.d@xxxxxxxxxxxx
> > >
> > > but it doesn't want to apply to anything I have at hand. Specifically, I
> > > tried Linus' tree and tip, which is most of what I have at hand.
> >
> > Thanks for taking a look, Peter.
> >
> > This series is based on mm-new and depends on my earlier patchset:
> >
> > https://lore.kernel.org/all/cover.1772206930.git.d@xxxxxxxxxxxx/
> >
> > Those patches are currently only in Andrew's mm-new tree, so this series
> > won't apply cleanly on Linus' tree or tip.
> >
> > It should apply on top of mm-new from:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>
> OK, so the big problem is __GUARD_IS_ERR(), and that came up before, but
> while Linus told me how to fix it, he didn't actually like it very much:
>
> https://lore.kernel.org/all/20250513085001.GC25891@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> However it does help with this:
>
> $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc-post-gcc-16.o | grep -v __UNIQUE
> add/remove: 24/24 grow/shrink: 3/2 up/down: 296/-224 (72)
> Function old new delta
> get_page_from_freelist 6158 6198 +40
> free_pcppages_bulk 678 714 +36
> unreserve_highatomic_pageblock 708 736 +28
> make_alloc_exact 280 264 -16
> alloc_pages_bulk_noprof 1415 1399 -16
> Total: Before=45299, After=45371, chg +0.16%
>
> $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc.o | grep -v __UNIQUE
> add/remove: 24/24 grow/shrink: 3/15 up/down: 277/-363 (-86)
> Function old new delta
> unreserve_highatomic_pageblock 708 757 +49
> free_pcppages_bulk 678 707 +29
> get_page_from_freelist 6158 6165 +7
> try_to_claim_block 1729 1726 -3
> setup_per_zone_wmarks 656 653 -3
> free_pages_prepare 924 921 -3
> calculate_totalreserve_pages 282 279 -3
> alloc_frozen_pages_nolock_noprof 622 619 -3
> __free_pages_prepare 924 921 -3
> __free_pages_ok 1197 1194 -3
> __free_one_page 1330 1327 -3
> __free_frozen_pages 1303 1300 -3
> __rmqueue_pcplist 2786 2777 -9
> free_unref_folios 1905 1894 -11
> setup_per_zone_lowmem_reserve 388 374 -14
> make_alloc_exact 280 264 -16
> __alloc_frozen_pages_noprof 5411 5368 -43
> nr_free_zone_pages 189 138 -51
> Total: Before=45299, After=45213, chg -0.19%
>
>
>
> However, looking at things again, I think we can get rid of that
> unconditional __GUARD_IS_ERR(), something like the below, Dan?
>
> This then gives:
>
> $ ./scripts/bloat-o-meter defconfig-build/mm/page_alloc-pre-gcc-16.o defconfig-build/mm/page_alloc.o | grep -v __UNIQUE
> add/remove: 24/24 grow/shrink: 1/16 up/down: 213/-486 (-273)
> Function old new delta
> free_pcppages_bulk 678 699 +21
> try_to_claim_block 1729 1723 -6
> setup_per_zone_wmarks 656 650 -6
> free_pages_prepare 924 918 -6
> calculate_totalreserve_pages 282 276 -6
> alloc_frozen_pages_nolock_noprof 622 616 -6
> __free_pages_prepare 924 918 -6
> __free_pages_ok 1197 1191 -6
> __free_one_page 1330 1324 -6
> __free_frozen_pages 1303 1297 -6
> free_pages_exact 199 183 -16
> setup_per_zone_lowmem_reserve 388 371 -17
> free_unref_folios 1905 1888 -17
> __rmqueue_pcplist 2786 2768 -18
> nr_free_zone_pages 189 138 -51
> __alloc_frozen_pages_noprof 5411 5359 -52
> get_page_from_freelist 6158 6089 -69
> Total: Before=45299, After=45026, chg -0.60%
>
>
> Anyway, if you all care about the size of things -- those tracepoints
> consume *WAAY* more bytes than any of this.
>
>
> ---
> --- a/include/linux/cleanup.h
> +++ b/include/linux/cleanup.h
> @@ -286,15 +286,18 @@ static __always_inline _type class_##_na
> __no_context_analysis \
> { _type t = _init; return t; }
>
> -#define EXTEND_CLASS(_name, ext, _init, _init_args...) \
> -typedef lock_##_name##_t lock_##_name##ext##_t; \
> +#define EXTEND_CLASS_COND(_name, ext, _cond, _init, _init_args...) \
> +typedef lock_##_name##_t lock_##_name##ext##_t; \
> typedef class_##_name##_t class_##_name##ext##_t; \
> -static __always_inline void class_##_name##ext##_destructor(class_##_name##_t *p) \
> -{ class_##_name##_destructor(p); } \
> +static __always_inline void class_##_name##ext##_destructor(class_##_name##_t *_T) \
> +{ if (_cond) return; class_##_name##_destructor(_T); } \
> static __always_inline class_##_name##_t class_##_name##ext##_constructor(_init_args) \
> __no_context_analysis \
> { class_##_name##_t t = _init; return t; }
>
> +#define EXTEND_CLASS(_name, ext, _init, _init_args...) \
> + EXTEND_CLASS_COND(_name, ext, 0, _init, _init_args)
> +
> #define CLASS(_name, var) \
> class_##_name##_t var __cleanup(class_##_name##_destructor) = \
> class_##_name##_constructor
> @@ -394,12 +397,12 @@ static __maybe_unused const bool class_#
> __DEFINE_GUARD_LOCK_PTR(_name, _T)
>
> #define DEFINE_GUARD(_name, _type, _lock, _unlock) \
> - DEFINE_CLASS(_name, _type, if (!__GUARD_IS_ERR(_T)) { _unlock; }, ({ _lock; _T; }), _type _T); \
> + DEFINE_CLASS(_name, _type, if (_T) { _unlock; }, ({ _lock; _T; }), _type _T); \
> DEFINE_CLASS_IS_GUARD(_name)
>
> #define DEFINE_GUARD_COND_4(_name, _ext, _lock, _cond) \
> __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
> - EXTEND_CLASS(_name, _ext, \
> + EXTEND_CLASS_COND(_name, _ext, __GUARD_IS_ERR(*_T), \
> ({ void *_t = _T; int _RET = (_lock); if (_T && !(_cond)) _t = ERR_PTR(_RET); _t; }), \
> class_##_name##_t _T) \
> static __always_inline void * class_##_name##_ext##_lock_ptr(class_##_name##_t *_T) \
> @@ -488,7 +491,7 @@ typedef struct { \
> static __always_inline void class_##_name##_destructor(class_##_name##_t *_T) \
> __no_context_analysis \
> { \
> - if (!__GUARD_IS_ERR(_T->lock)) { _unlock; } \
> + if (_T->lock) { _unlock; } \
> } \
> \
> __DEFINE_GUARD_LOCK_PTR(_name, &_T->lock)
> @@ -565,7 +568,7 @@ __DEFINE_LOCK_GUARD_0(_name, _lock)
>
> #define DEFINE_LOCK_GUARD_1_COND_4(_name, _ext, _lock, _cond) \
> __DEFINE_CLASS_IS_CONDITIONAL(_name##_ext, true); \
> - EXTEND_CLASS(_name, _ext, \
> + EXTEND_CLASS_COND(_name, _ext, __GUARD_IS_ERR(_T->lock), \
> ({ class_##_name##_t _t = { .lock = l }, *_T = &_t;\
> int _RET = (_lock); \
> if (_T->lock && !(_cond)) _T->lock = ERR_PTR(_RET);\
I re-tested my original patchset after rebasing and can still reproduce
the regression (though smaller). It appears to depend on compiler
inlining decisions: in some cases the compiler is able to deduplicate
the cleanup path across multiple return sites, while in others it is
not.
Given that, I think we can go further than just removing
__GUARD_IS_ERR(). It should be possible to eliminate this branch
entirely and simplify the cleanup flow.
https://lore.kernel.org/all/20260427165037.205337-1-d@xxxxxxxxxxxx/
Reposting here to increase visibility, as several people involved in
this code have participated in this thread already.
Any feedback would be appreciated.