Re: [PATCH] fork: Improve error message for corrupted page tables

From: Anshuman Khandual
Date: Thu Aug 01 2019 - 01:48:14 EST




On 07/31/2019 03:48 AM, Sai Praneeth Prakhya wrote:
> When a user process exits, the kernel cleans up the mm_struct of the user
> process and during cleanup, check_mm() checks the page tables of the user
> process for corruption (E.g: unexpected page flags set/cleared). For
> corrupted page tables, the error message printed by check_mm() isn't very
> clear as it prints the loop index instead of page table type (E.g: Resident
> file mapping pages vs Resident shared memory pages). Hence, improve the
> error message so that it's more informative.

The loop index in check_mm() also happens to be the index in rss_stat[] which
represents individual memory type stats. But you are right, index value here
in the print does not make any sense.

>
> Without patch:
> --------------
> [ 204.836425] mm/pgtable-generic.c:29: bad p4d 0000000089eb4e92(800000025f941467)
> [ 204.836544] BUG: Bad rss-counter state mm:00000000f75895ea idx:0 val:2
> [ 204.836615] BUG: Bad rss-counter state mm:00000000f75895ea idx:1 val:5
> [ 204.836685] BUG: non-zero pgtables_bytes on freeing mm: 20480
>
> With patch:
> -----------
> [ 69.815453] mm/pgtable-generic.c:29: bad p4d 0000000084653642(800000025ca37467)
> [ 69.815872] BUG: Bad rss-counter state mm:00000000014a6c03 type:MM_FILEPAGES val:2
> [ 69.815962] BUG: Bad rss-counter state mm:00000000014a6c03 type:MM_ANONPAGES val:5
> [ 69.816050] BUG: non-zero pgtables_bytes on freeing mm: 20480

Yes, this is definitely better.

>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Suggested-by/Acked-by: Dave Hansen <dave.hansen@xxxxxxxxx>

Though I am not sure, should the above be two separate lines instead ?

> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@xxxxxxxxx>
> ---
> include/linux/mm_types_task.h | 7 +++++++
> kernel/fork.c | 4 ++--
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mm_types_task.h b/include/linux/mm_types_task.h
> index d7016dcb245e..881f4ea3a1b5 100644
> --- a/include/linux/mm_types_task.h
> +++ b/include/linux/mm_types_task.h
> @@ -44,6 +44,13 @@ enum {
> NR_MM_COUNTERS
> };
>
> +static const char * const resident_page_types[NR_MM_COUNTERS] = {
> + "MM_FILEPAGES",
> + "MM_ANONPAGES",
> + "MM_SWAPENTS",
> + "MM_SHMEMPAGES",
> +};

Should index them to match respective typo macros.

[MM_FILEPAGES] = "MM_FILEPAGES",
[MM_ANONPAGES] = "MM_ANONPAGES",
[MM_SWAPENTS] = "MM_SWAPENTS",
[MM_SHMEMPAGES] = "MM_SHMEMPAGES",

> +
> #if USE_SPLIT_PTE_PTLOCKS && defined(CONFIG_MMU)
> #define SPLIT_RSS_COUNTING
> /* per-thread cached information, */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 2852d0e76ea3..6aef5842d4e0 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -649,8 +649,8 @@ static void check_mm(struct mm_struct *mm)
> long x = atomic_long_read(&mm->rss_stat.count[i]);
>
> if (unlikely(x))
> - printk(KERN_ALERT "BUG: Bad rss-counter state "
> - "mm:%p idx:%d val:%ld\n", mm, i, x);
> + pr_alert("BUG: Bad rss-counter state mm:%p type:%s val:%ld\n",
> + mm, resident_page_types[i], x);
It changes the print function as well, though very minor change but perhaps
mention that in the commit message ?