Re: [RFC PATCH v1 06/57] mm: Remove PAGE_SIZE compile-time constant assumption
From: Ryan Roberts
Date: Tue Nov 26 2024 - 05:08:55 EST
Hi Vlastimil,
Sorry about the slow response to your review of this series - I'm just getting
to it now. Comment's below...
On 14/11/2024 10:17, Vlastimil Babka wrote:
> On 10/14/24 12:58, Ryan Roberts wrote:
>> To prepare for supporting boot-time page size selection, refactor code
>> to remove assumptions about PAGE_SIZE being compile-time constant. Code
>> intended to be equivalent when compile-time page size is active.
>>
>> Refactor "struct vmap_block" to use a flexible array for used_mmap since
>> VMAP_BBMAP_BITS is not a compile time constant for the boot-time page
>> size case.
>>
>> Update various BUILD_BUG_ON() instances to check against appropriate
>> page size limit.
>>
>> Re-define "union swap_header" so that it's no longer exactly page-sized.
>> Instead define a flexible "magic" array with a define which tells the
>> offset to where the magic signature begins.
>>
>> Consider page size limit in some CPP condditionals.
>>
>> Wrap global variables that are initialized with PAGE_SIZE derived values
>> using DEFINE_GLOBAL_PAGE_SIZE_VAR() so their initialization can be
>> deferred for boot-time page size builds.
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>> ---
>>
>> ***NOTE***
>> Any confused maintainers may want to read the cover note here for context:
>> https://lore.kernel.org/all/20241014105514.3206191-1-ryan.roberts@xxxxxxx/
>>
>> drivers/mtd/mtdswap.c | 4 ++--
>> include/linux/mm.h | 2 +-
>> include/linux/mm_types_task.h | 2 +-
>> include/linux/mmzone.h | 3 ++-
>> include/linux/slab.h | 7 ++++---
>> include/linux/swap.h | 17 ++++++++++++-----
>> include/linux/swapops.h | 6 +++++-
>> mm/memcontrol.c | 2 +-
>> mm/memory.c | 4 ++--
>> mm/mmap.c | 2 +-
>> mm/page-writeback.c | 2 +-
>> mm/slub.c | 2 +-
>> mm/sparse.c | 2 +-
>> mm/swapfile.c | 2 +-
>> mm/vmalloc.c | 7 ++++---
>> 15 files changed, 39 insertions(+), 25 deletions(-)
>>
>
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -132,10 +132,17 @@ static inline int current_is_kswapd(void)
>> * bootbits...
>> */
>> union swap_header {
>> - struct {
>> - char reserved[PAGE_SIZE - 10];
>> - char magic[10]; /* SWAP-SPACE or SWAPSPACE2 */
>> - } magic;
>> + /*
>> + * Exists conceptually, but since PAGE_SIZE may not be known at compile
>> + * time, we must access through pointer arithmetic at run time.
>> + *
>> + * struct {
>> + * char reserved[PAGE_SIZE - 10];
>> + * char magic[10]; SWAP-SPACE or SWAPSPACE2
>> + * } magic;
>> + */
>> +#define SWAP_HEADER_MAGIC (PAGE_SIZE - 10)
>> + char magic[1];
>
> I wonder if it makes sense to even keep this magic field anymore.
>
>> struct {
>> char bootbits[1024]; /* Space for disklabel etc. */
>> __u32 version;
>> @@ -201,7 +208,7 @@ struct swap_extent {
>> * Max bad pages in the new format..
>> */
>> #define MAX_SWAP_BADPAGES \
>> - ((offsetof(union swap_header, magic.magic) - \
>> + ((SWAP_HEADER_MAGIC - \
>> offsetof(union swap_header, info.badpages)) / sizeof(int))
>>
>> enum {
>
> <snip>
>
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -2931,7 +2931,7 @@ static unsigned long read_swap_header(struct swap_info_struct *p,
>> unsigned long swapfilepages;
>> unsigned long last_page;
>>
>> - if (memcmp("SWAPSPACE2", swap_header->magic.magic, 10)) {
>> + if (memcmp("SWAPSPACE2", &swap_header->magic[SWAP_HEADER_MAGIC], 10)) {
>
> I'd expect static checkers to scream here because we overflow the magic[1]
> both due to copying 10 bytes into 1 byte array and also with the insane
> offset. Hence my suggestion to drop the field and use purely pointer arithmetic.
Yeah, good point. I'll remove magic[] and use pointer arithmetic.
>
>> pr_err("Unable to find swap-space signature\n");
>> return 0;
>> }
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index a0df1e2e155a8..b4fbba204603c 100644
>
> Hm I'm actually looking at yourwip branch which also has:
>
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -969,7 +969,7 @@ static inline int get_order_from_str(const char *size_str)
> return -EINVAL;
> }
>
> -static char str_dup[PAGE_SIZE] __initdata;
> +static char str_dup[PAGE_SIZE_MAX] __initdata;
> static int __init setup_thp_anon(char *str)
> {
> char *token, *range, *policy, *subtoken;
>
> Why PAGE_SIZE_MAX? Isn't this the same case as "mm/memcontrol: Fix seq_buf
> size to save memory when PAGE_SIZE is large"
Hmm, you're probably right. I had a vague notion that "str", as passed into the
function, was guarranteed to be no bigger than PAGE_SIZE (perhaps I'm wrong). So
assumed that's where the original definition of str_dup[PAGE_SIZE] was coming from.
But I think your real question is "should the max size of str be a function of
PAGE_SIZE?". I think it could; there are more page orders that can legitimately
be described when the page size is bigger (at least for arm64). But in practice,
I'd expect any sane string for any page size to be easily within 4K.
So on that basis, I'll take your advice; changing this buffer to be 4K always.
Thanks,
Ryan