Re: [RFC V2] mm: add the zero case to page[1].compound_nr in set_compound_order

From: Matthew Wilcox
Date: Wed Dec 14 2022 - 12:04:17 EST


On Tue, Dec 13, 2022 at 04:45:05PM -0700, Nico Pache wrote:
> Since commit 1378a5ee451a ("mm: store compound_nr as well as
> compound_order") the page[1].compound_nr must be explicitly set to 0 if
> calling set_compound_order(page, 0).
>
> This can lead to bugs if the caller of set_compound_order(page, 0) forgets
> to explicitly set compound_nr=0. An example of this is commit ba9c1201beaa
> ("mm/hugetlb: clear compound_nr before freeing gigantic pages")
>
> Collapse these calls into the set_compound_order by utilizing branchless
> bitmaths [1].
>
> [1] https://graphics.stanford.edu/~seander/bithacks.html#ConditionalSetOrClearBitsWithoutBranching
>
> V2: slight changes to commit log and remove extra '//' in the comments

We don't usually use // comments anywhere in the kernel other than
the SPDX header.

> static inline void set_compound_order(struct page *page, unsigned int order)
> {
> + unsigned long shift = (1U << order);

'shift' is a funny name for this variable. 'order' is the shift; this is 'nr'.

> page[1].compound_order = order;
> #ifdef CONFIG_64BIT
> - page[1].compound_nr = 1U << order;
> + // Branchless conditional:
> + // order > 0 --> compound_nr = shift
> + // order == 0 --> compound_nr = 0
> + page[1].compound_nr = shift ^ (-order ^ shift) & shift;

Can the compiler see through this? Before, the compiler sees:

page[1].compound_order = 0;
page[1].compound_nr = 1U << 0;
...
page[1].compound_nr = 0;

and it can eliminate the first store. Now the compiler sees:

unsigned long shift = (1U << 0);
page[1].compound_order = 0;
page[1].compound_nr = shift ^ (0 ^ shift) & shift;

Does it do the maths at compile-time, knowing that order is 0 at this
callsite and deducing that it can just store a 0?

I think it might, since shift is a constant 1 here:

page[1].compound_nr = 1 ^ (0 ^ 1) & 1;
-> page[1].compound_nr = 1 ^ (1 & 1);
-> page[1].compound_nr = 1 ^ 1;
-> page[1].compound_nr = 0;

But you should run it through the compiler and check the assembly
output for __destroy_compound_gigantic_page().
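
For a quick sanity check of the expression itself before looking at the
assembly, something like the following user-space sketch might help. It is
not kernel code: compound_nr_for() and check_all_orders() are hypothetical
names made up for this example, and it assumes order stays below 32 so that
1U << order is well defined. It only confirms the values the expression
produces; it says nothing about whether the compiler folds the order == 0
case at a given callsite.

```c
#include <assert.h>

/*
 * Hypothetical user-space stand-in for the expression in the patch.
 * In C, & binds tighter than ^, so the patch's
 *     shift ^ (-order ^ shift) & shift
 * parses as the explicitly parenthesised form below.
 */
static inline unsigned long compound_nr_for(unsigned int order)
{
	unsigned long nr = 1U << order;

	return nr ^ ((-order ^ nr) & nr);
}

/* Asserts on any mismatch; returns the number of orders checked. */
static int check_all_orders(void)
{
	unsigned int order;

	/* order == 0 must yield 0, not 1 */
	assert(compound_nr_for(0) == 0);

	/* every other order must yield 1 << order */
	for (order = 1; order < 32; order++)
		assert(compound_nr_for(order) == (1UL << order));

	return 32;
}
```

Calling check_all_orders() from a small main() and building with assertions
enabled exercises every order from 0 to 31. Building the same file with -O2 -S
and a constant argument of 0 gives a rough idea of whether the compiler folds
the expression, but the real answer is still in the kernel's own assembly
output.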