Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

From: Guenter Roeck
Date: Wed May 20 2020 - 15:03:39 EST


On 5/20/20 10:03 AM, Mike Rapoport wrote:
> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>> Now that the page table allocator can free page table allocations
>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>> to avoid needlessly wasting memory.
>>>>>
>>>>> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
>>>>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>>>> Signed-off-by: Will Deacon <will@xxxxxxxxxx>
>>>>
>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>> does reverting the rest of the series.
>>>>
>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>> powerpc boot tests. I am currently bisecting those crashes. I'll report
>>> the results here as well as soon as I have it.
>>
>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>> issues. However, linux-next is a different story, where I don't get very far
>> at all:
>>
>> BUG: Bad page state in process swapper pfn:005b4
>
> This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
> regions rather that check each PFN"). The commit sha is valid for
> v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)
>
> As it seems, sparc32 never registered the memory occupied by the kernel
> image with memblock_add() and it only reserves this memory with
> meblock_reserve().
>
> I don't know what would happen on real HW, but with
>
> qemu-system-sparc -kernel /path/to/kernel
>
> the memory occupied by the kernel is reserved in openbios and removed
> from mem.available. The prom setup code in the kernel used mem.available
> to set up the memory banks and essentially there is a hole for the
> memory occupied by the kernel.
>
> Later in bootmem_init() this memory is memblock_reserve()d.
>
> Before the problematic commit, memmap initialization would call
> __init_single_page() for the pages in that hole, the
> free_low_memory_core_early() would mark them as resrved and everything
> would be Ok.
>
> After the change in memmap initialization, the hole is skipped and the
> page structs for it are not inited. And when they are passed from
> memblock to page allocator as reserved it gets confused.
>
> Simply registering the memory occupied by the kernel with memblock_add()
> resolves this issue, at least for qemu-system-arm and I cannot see how
> it can harm any other setup.
>
> If all that makes sense I'll send a proper patch :)
>
> diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
> index 906eda1158b4..3cb3dffcbcdc 100644
> --- a/arch/sparc/mm/init_32.c
> +++ b/arch/sparc/mm/init_32.c
> @@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
> /* Reserve the kernel text/data/bss. */
> size = (start_pfn << PAGE_SHIFT) - phys_base;
> memblock_reserve(phys_base, size);
> + memblock_add(phys_base, size);
>
> size = memblock_phys_mem_size() - memblock_reserved_size();
> *pages_avail = (size >> PAGE_SHIFT) - high_pages;
>
>> Will
>

With above patch applied on top of Ira's patch, I get:

BUG: spinlock recursion on CPU#0, S01syslogd/139
lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
[f0067a64 :
do_raw_spin_lock+0xa8/0xd8 ]
[f00d5034 :
copy_page_range+0x328/0x804 ]
[f0025be4 :
dup_mm+0x334/0x434 ]
[f0027124 :
copy_process+0x1224/0x12b0 ]
[f0027344 :
_do_fork+0x54/0x30c ]
[f0027670 :
do_fork+0x5c/0x6c ]
[f000de44 :
sparc_do_fork+0x18/0x38 ]
[f000b7f4 :
do_syscall+0x34/0x40 ]
[5010cd4c :
0x5010cd4c ]

Looks like yet another problem.

I can not revert c03584e30534 because it results in a compile failure.

Guenter