Re: [BUG] WARNING: CPU: 3 PID: 1 at mm/debug_vm_pgtable.c:493

From: Anshuman Khandual
Date: Mon Nov 22 2021 - 01:31:57 EST

On 11/19/21 12:03 AM, Linus Torvalds wrote:
> On Thu, Nov 18, 2021 at 8:47 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>> Triggered it again with the new update:
>>
>> [ 24.751779] IPI shorthand broadcast: enabled
>> [ 24.761177] sched_clock: Marking stable (23431856262, 1329270511)->(28163092341, -3401965568)
>> [ 24.770495] device: 'cpu_dma_latency': device_add
>> [ 24.775232] PM: Adding info for No Bus:cpu_dma_latency
>> [ 24.780929] debug_vm_pgtable: [debug_vm_pgtable ]: Validating architecture page table helpers
>> [ 24.799490] mtrr_type_lookup() returned 0 (0)
> Ok, so that's MTRR_TYPE_UNCACHABLE, and "uniform" is 0.
>
> Anyway, either the mtrr code is confused, or more likely it just does
> the right thing, and pud_set_huge() is simply expected to return 0 in
> this situation, and that WARN_ON() in pud_huge_tests() is simply wrong
> to trigger at all.
>
> I didn't look at what all the code in debug_vm_pgtable() is trying to
> set up to test. Honestly, it's all very opaque.
>
> But I do notice that the pfn that the test uses ends up basically
> being something random, where the "fixed" pfn is
>
>         phys = __pa_symbol(&start_kernel);
>         ...
>         args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK);
>
> rather than being an allocated real PUD-sized page. That can be a
> problem in itself.
>
> So I think the problem is that depending on where the kernel is
> allocated, the fixed_pud_pfn ends up being in an area with MTRR
> settings. In fact, I'm surprised it's not *always* in that area, since
> presumably you have the normal fixed MTRR issues with the 640k-1M
> range.
>
> But I didn't look - probably the MTRR code doesn't actually check the
> special fixed MTRRs.
>
> Anyway, I think that the end result is simply that the tests in
> mm/debug_vm_pgtable.c are simply buggy, and the WARN_ON() is not a
> sign of anything wrong in the mm, but with the tests themselves.
>
> So the fixed_pud_pfn is dodgy, but it looks like the non-fixed
> 'pud_pfn' allocation may be dodgy too:
>
> #ifdef CONFIG_CONTIG_ALLOC
>         if (order >= MAX_ORDER) {
>                 page = alloc_contig_pages((1 << order), GFP_KERNEL,
>                                           first_online_node, NULL);
>
> because afaik, alloc_contig_pages() does allocate a contiguous region,
> but it doesn't necessarily allocate an _aligned_ contiguous region.
>
> So I think _all_ those PUD tests are likely broken, but honestly, I
> don't know the code well enough to be entirely sure, I'm just seeing
> code that looks dodgy to me.
>
> I don't think the breakage is x86-specific. Quite the reverse. I think
> the x86 code just happens to randomly show it when some MTRR ends up
> being used.
>
> Maybe pfn_pud() should verify that it's actually given an aligned argument?
>
> Gavin, Anshuman? Feel free to tell me what I missed.

Hi Linus,

These PUD tests have been subtle on certain platforms, as the problems
in this report show. I will definitely take a detailed look, but probably
only after a week (leave, travel, etc.). Thank you.

- Anshuman
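
As a side note on the alignment question raised in the quoted text, the
arithmetic can be sketched in plain userspace C. This is only an
illustration under stated assumptions: the EX_* constants assume x86-64
with 4 KiB pages and a 1 GiB PUD, and every ex_* name is hypothetical,
not a kernel symbol. It mirrors the quoted __phys_to_pfn(phys & PUD_MASK)
expression and shows what a pfn_pud()-style alignment check might test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: x86-64, 4 KiB pages, 1 GiB PUD. Not from kernel headers. */
#define EX_PAGE_SHIFT   12
#define EX_PUD_SHIFT    30
#define EX_PUD_SIZE     (UINT64_C(1) << EX_PUD_SHIFT)
#define EX_PUD_MASK     (~(EX_PUD_SIZE - 1))
#define EX_PFNS_PER_PUD (UINT64_C(1) << (EX_PUD_SHIFT - EX_PAGE_SHIFT))

/* Hypothetical mirror of the quoted __phys_to_pfn(phys & PUD_MASK). */
static inline uint64_t ex_fixed_pud_pfn(uint64_t phys)
{
	return (phys & EX_PUD_MASK) >> EX_PAGE_SHIFT;
}

/* Hypothetical check: does this pfn start on a PUD-sized boundary? */
static inline bool ex_pfn_is_pud_aligned(uint64_t pfn)
{
	return (pfn & (EX_PFNS_PER_PUD - 1)) == 0;
}

/* Small self-check with made-up addresses, for illustration only. */
void ex_self_check(void)
{
	/* A made-up address at 16 MiB rounds down to pfn 0. */
	assert(ex_fixed_pud_pfn(UINT64_C(0x1000000)) == 0);

	/* A made-up address just above 1 GiB rounds down to the 1 GiB pfn. */
	assert(ex_fixed_pud_pfn(UINT64_C(0x41234567)) == UINT64_C(0x40000));

	/* Rounded-down pfns are aligned by construction. */
	assert(ex_pfn_is_pud_aligned(UINT64_C(0x40000)));

	/* An arbitrary start pfn of a contiguous range need not be aligned. */
	assert(!ex_pfn_is_pud_aligned(UINT64_C(0x12345)));
}
```

The rounded-down pfn is aligned by construction, but it can name a 1 GiB
region far below the address it was derived from (here pfn 0, which
covers the 640k-1M range mentioned in the quoted text). A start pfn that
comes from a contiguous allocation carries no such guarantee, which is
what the last check illustrates.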