Re: [PATCH v11 1/4] mm/page_owner: add print_mode filter

From: Vlastimil Babka (SUSE)

Date: Mon Jun 29 2026 - 05:35:44 EST


On 6/29/26 04:59, Ye Liu wrote:
>
> 在 2026/6/25 12:30, Zhen Ni 写道:
>> Add a print_mode filter to page_owner that allows users to choose between
>> printing stack traces, stack handles, or both, providing flexibility for
>> different debugging and analysis scenarios.
>>
>> The filter provides three modes via page_owner:
>> - Writing "mode=stack" prints stack traces for each page (default)
>> - Writing "mode=handle" prints only the handle number
>> - Writing "mode=stack_handle" prints both stack traces and handles
>>
>> The default stack mode maintains backward compatibility with existing
>> usage, displaying complete stack traces for each page allocation.
>>
>> The handle mode dramatically reduces log size and improves performance by
>> showing only the handle number instead of the full stack trace. Testing
>> shows handle mode reduces output size by ~66% (84MB vs 244MB) and
>> improves read performance by ~4.4x compared to full stack output. The
>> mapping from handles to actual stack traces can be obtained via the
>> show_stacks_handles interface.
>>
>> The stack_handle mode prints both stack traces and handles, making it
>> easier to identify pages with the same allocation pattern by comparing
>> handle numbers instead of comparing large stack traces.
>>
>> Example usage:
>> # Using the page_owner_filter tool (recommended)
>> ./page_owner_filter -m stack # Print only stack traces (default)
>> ./page_owner_filter -m handle # Print only handles
>> ./page_owner_filter -m stack_handle # Print both stack and handles
>>
>> Sample output (handle mode):
>> Page allocated via order 0, migratetype Unmovable, gfp_mask 0x1100ca,
>> pid 1, tgid 1 (systemd), ts 123456789 ns
>> PFN 0x1000 type Unmovable Block 1 type Unmovable
>> Flags 0x3fffe800000084(referenced|lru|active|private|node=0|zone=1)
>> handle: 17432583
>> ...
>>
>> This implementation uses per-file-descriptor filter state stored in
>> file->private_data, allowing each opener to have independent filter
>> configuration.
>>
>> Signed-off-by: Zhen Ni <zhen.ni@xxxxxxxxxxxx>
>> ---
>> Changes in v11:
>> - No changes
>>
>> Changes in v10:
>> - No changes
>>
>> Changes in v9:
>> - Add spinlock_t lock to struct page_owner_filter_state for concurrent access protection
>>
>> Changes in v8:
>> - Fix buffer overflow by adding bounds check between stack_depot_snprint() and scnprintf()
>> - Fix unsafe string handling: use memdup_user_nul() instead of kmalloc_objs + strncpy_from_user()
>> - Fix strsep() memory corruption by saving original pointer before strsep() call
>> - Change format specifier from %d to %u for depot_stack_handle_t
>>
>> Changes in v7:
>> - per-file-descriptor implementation
>>
>> Changes in v6:
>> - Remove unnecessary braces in if/else statement (coding style)
>> - Use stack array (char kbuf[33]) instead of kmalloc for input buffer
>>
>> Changes in v5:
>> - No code changes
>>
>> Changes in v4:
>> - Change from numeric (0/1) to string-based interface ("full_stack"/"stack_handle")
>> - Merge infrastructure patch into this patch
>>
>> Changes in v3:
>> - No code changes
>>
>> Changes in v2:
>> - Renamed from 'compact mode' to 'print_mode' for better clarity
>> - Use enum values (0=full_stack, 1=stack_handle) instead of boolean
>> - Update debugfs filename from 'compact' to 'print_mode'
>>
>> v10: https://lore.kernel.org/linux-mm/20260618035750.3724613-2-zhen.ni@xxxxxxxxxxxx/
>> v9: https://lore.kernel.org/linux-mm/20260525081652.2210206-2-zhen.ni@xxxxxxxxxxxx/
>> v8: https://lore.kernel.org/linux-mm/20260520075641.1931080-2-zhen.ni@xxxxxxxxxxxx/
>> v7: https://lore.kernel.org/linux-mm/20260515091942.1535677-2-zhen.ni@xxxxxxxxxxxx/
>> v6: https://lore.kernel.org/linux-mm/20260511033017.747781-2-zhen.ni@xxxxxxxxxxxx/
>> v5: https://lore.kernel.org/linux-mm/20260507064643.179187-2-zhen.ni@xxxxxxxxxxxx/
>> v4: https://lore.kernel.org/linux-mm/20260430163247.13628-2-zhen.ni@xxxxxxxxxxxx/
>> v3: https://lore.kernel.org/linux-mm/20260428071112.1420380-2-zhen.ni@xxxxxxxxxxxx/
>> https://lore.kernel.org/linux-mm/20260428071112.1420380-3-zhen.ni@xxxxxxxxxxxx/
>> v2: https://lore.kernel.org/linux-mm/20260419155540.376847-2-zhen.ni@xxxxxxxxxxxx/
>> https://lore.kernel.org/linux-mm/20260419155540.376847-3-zhen.ni@xxxxxxxxxxxx/
>> v1: https://lore.kernel.org/linux-mm/20260417154638.22370-2-zhen.ni@xxxxxxxxxxxx/
>> https://lore.kernel.org/linux-mm/20260417154638.22370-3-zhen.ni@xxxxxxxxxxxx/
>> ---
>> mm/page_owner.c | 129 +++++++++++++++++++++++++++++++++++++++++++++---
>> 1 file changed, 123 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/page_owner.c b/mm/page_owner.c
>> index 8178e0be557f..7595735979bf 100644
>> --- a/mm/page_owner.c
>> +++ b/mm/page_owner.c
>> @@ -54,6 +54,23 @@ struct stack_print_ctx {
>> u8 flags;
>> };
>>
>> +enum page_owner_print_mode {
>> + PAGE_OWNER_PRINT_STACK,
>> + PAGE_OWNER_PRINT_HANDLE,
>> + PAGE_OWNER_PRINT_STACK_HANDLE,
>> +};
>> +
>> +static const char * const page_owner_print_mode_strings[] = {
>> + [PAGE_OWNER_PRINT_STACK] = "stack",
>> + [PAGE_OWNER_PRINT_HANDLE] = "handle",
>> + [PAGE_OWNER_PRINT_STACK_HANDLE] = "stack_handle",
>> +};
>> +
>> +struct page_owner_filter_state {
>> + enum page_owner_print_mode print_mode;
>> + spinlock_t lock;
> Hi , Zhen
> The spinlock in struct page_owner_filter_state is unnecessary and adds significant overhead in the read path.
>                                                                                                     
> 1. Per-fd isolation: the state is allocated per open() and stored in file->private_data.
> There is no cross-fd contention possible.
> 2. Hot path cost: the lock is taken for every single page in read_page_owner() and
> print_page_owner(). A single read can traverse millions of pages, each paying
> spin_lock_irqsave/irqrestore — including interrupt disable — just to read a mode
> enum or check a nodemask. This is measurable overhead for no real benefit.
> 3. No practical race: nobody writes filter config to an fd while simultaneously reading from it. 
>                                                                                                     
> Suggest dropping the lock entirely. 
>                                                                                                     
> Just my take though — happy to follow whatever the other reviewers prefer here.

I agree. If someone is writing (updating filter) and reading (getting
page_owner output) at the same time from multiple threads, they might get
inconsistent results but that's getting what you ask for. Importantly it
can't cause any crash, AFAICS.