On Fri, Jun 29, 2018 at 1:42 PM Larry Finger <Larry.Finger@xxxxxxxxxxxx> wrote:
I have more information regarding this BUG. Line 700 of page-flags.h is the
macro PAGE_TYPE_OPS(Table, table). For further debugging, I manually expanded
the macro, and found that the bug line is VM_BUG_ON_PAGE(!PageTable(page), page)
in routine __ClearPageTable(), which is called from pgtable_page_dtor() in
include/linux/mm.h. I also added a printk call to PageTable() that logs
page->page_type. The routine was called twice. The first had page_type of
0xfffffbff, which would have been expected for a page table. The second call had
0xffffffff, which led to the BUG.
So it looks to me like the tear-down of the page tables first found a
page that is indeed a page table, and cleared the page table bit
(well, it set it - the bits are reversed).
Then it took an exception (that "interrupt: 700") and that causes
do_exit() again, and it tries to free the same page table - and now
it's no longer marked as a page table, because it already went through
the __ClearPageTable() dance once.
So on the second path through, it catches that "the bit already said
it wasn't a page table" and does the BUG.
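For anyone following along, here is a rough userspace model of that reversed-bit logic. It is only a sketch: struct fake_page and the three helpers are made-up names for illustration, and the PAGE_TYPE_BASE and PG_table values are what I recall from include/linux/page-flags.h around 4.18, so check them against your tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, recalled from include/linux/page-flags.h (~v4.18). */
#define PAGE_TYPE_BASE 0xf0000000u
#define PG_table       0x00000400u

/* Hypothetical stand-in for struct page: only the field of interest. */
struct fake_page {
    uint32_t page_type;   /* 0xffffffff means "no type set" */
};

/* A type counts as set when its bit is CLEAR and the base bits are intact. */
static inline bool page_is_table(const struct fake_page *p)
{
    return (p->page_type & (PAGE_TYPE_BASE | PG_table)) == PAGE_TYPE_BASE;
}

/* Models __SetPageTable(): marks the page by clearing the bit. */
static inline void set_page_table(struct fake_page *p)
{
    /* The page must still carry the untouched base bits. */
    assert((p->page_type & PAGE_TYPE_BASE) == PAGE_TYPE_BASE);
    p->page_type &= ~PG_table;
}

/*
 * Models __ClearPageTable(): returns false at the point where the kernel
 * would hit VM_BUG_ON_PAGE(!PageTable(page), page), true otherwise.
 */
static inline bool clear_page_table(struct fake_page *p)
{
    if (!page_is_table(p))
        return false;
    p->page_type |= PG_table;   /* "clearing" the type sets the bit */
    return true;
}
```

Starting from 0xffffffff, set_page_table() gives 0xfffffbff, the first clear_page_table() succeeds and puts the value back to 0xffffffff, and a second clear_page_table() on the same page fails. That matches the two page_type values Larry logged.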
But the real question is what the problem was the *first* time around.
I assume that has scrolled off the screen? This part:
_exception_pkey+0x58/0x128
ret_from_except_full+0x0/0x4
--- interrupt: 700 at free_pgd_range+0x19c/0x30c
LR = free_pgd_range+0x19c/0x30c
free_pgtables+0xa/0xb
exit_mmap+0xf4/0x16c
mmput+0x64/0xf0
Does reverting that commit 1d40a5ea01d5 make everything work for you?
Because if so, judging by the deafening silence on this so far, I
think that's what we should do.
That said, can some ppc person who knows the 32-bit ppc code and maybe
knows what that "interrupt: 700" means talk about that oddity in the
trace, please?