Re: kmemleak panic

From: Qian Cai
Date: Fri Jan 18 2019 - 11:14:07 EST




On 1/18/19 10:36 AM, Marc Gonzalez wrote:
> On 18/01/2019 15:34, Catalin Marinas wrote:
>
>> On Fri, Jan 18, 2019 at 02:36:46PM +0100, Marc Gonzalez wrote:
>>
>>> Trying to diagnose a separate issue, I enabled a raft of debugging options,
>>> including kmemleak. However, it looks like kmemleak itself is crashing?
>>>
>>> We seem to be crashing on this code:
>>>
>>> kasan_disable_current();
>>> pointer = *ptr;
>>> kasan_enable_current();
>>
>> There was another regression reported recently:
>>
>> http://lkml.kernel.org/r/51e79597-21ef-3073-9036-cfc33291f395@xxxxxx
>>
>> See if reverting commit 9f1eb38e0e113 (mm, kmemleak: little optimization while
>> scanning) fixes it.
>>
>
> [Drop LAKML, add LKML, add recipients]
>
> Bug is easy to reproduce:
>
> boot
> mount -t debugfs nodev /sys/kernel/debug/
> echo scan > /sys/kernel/debug/kmemleak
>
> Unable to handle kernel paging request at virtual address ffffffc021e00000
> Mem abort info:
> ESR = 0x96000006
> Exception class = DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> Data abort info:
> ISV = 0, ISS = 0x00000006
> CM = 0, WnR = 0
> swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
> [ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803, pmd=0000000000000000
> Internal error: Oops: 96000006 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 635 Comm: exe Not tainted 5.0.0-rc1 #16
> Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
> pstate: 80000085 (Nzcv daIf -PAN -UAO)
> pc : scan_block+0x70/0x190
> lr : scan_block+0x6c/0x190
> sp : ffffff80133b3b20
> x29: ffffff80133b3b20 x28: ffffffc0fdbaf018
> x27: ffffffc022000000 x26: 0000000000000080
> x25: ffffff80118bdf70 x24: ffffffc0f8cc8000
> x23: ffffff8010bd8000 x22: ffffff8010bd8830
> x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
> x19: ffffffc021e00000 x18: 00000000000025fd
> x17: 0000000000000200 x16: 0000000000000000
> x15: ffffff8010c24dd8 x14: 00000000000025f9
> x13: 00000000445b0e6c x12: ffffffc0f5a96658
> x11: 0000000000000001 x10: ffffff8010bae688
> x9 : ffffff8010baf000 x8 : ffffff8010bae688
> x7 : 0000000000000003 x6 : 0000000000000000
> x5 : ffffff801133ad60 x4 : 0000000000002878
> x3 : ffffff8010c24d88 x2 : 7c512d102eca1300
> x1 : ffffffc0f5a81b00 x0 : 0000000000000000
> Process exe (pid: 635, stack limit = 0x(____ptrval____))
> Call trace:
> scan_block+0x70/0x190
> scan_gray_list+0x108/0x1c0
> kmemleak_scan+0x33c/0x7c0
> kmemleak_write+0x410/0x4b0
> full_proxy_write+0x68/0xa0
> __vfs_write+0x60/0x190
> vfs_write+0xac/0x1a0
> ksys_write+0x6c/0xe0
> __arm64_sys_write+0x24/0x30
> el0_svc_handler+0xc0/0x160
> el0_svc+0x8/0xc
> Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
> ---[ end trace 8797ac2fea89abd6 ]---
> note: exe[635] exited with preempt_count 2
>
>
>
> Reverting 9f1eb38e0e1131e75cc4ac684391b25d70282589 does not help:

This looks different from the original "invalid PFNs from
pfn_to_online_page()" issue. What's your .config?
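
As an aside, the ESR value in the oops can be decoded by hand to confirm
what kind of fault this is. Below is a minimal userspace sketch, assuming
the ESR_ELx field layout for data aborts from the Arm architecture manual.
The helper names (esr_ec, esr_dfsc, dfsc_kind, esr_print, ...) are made up
for illustration and are not kernel APIs.

```c
#include <stdint.h>
#include <stdio.h>

/* Field extraction for ESR_ELx on a data abort. Helper names are hypothetical. */
static inline unsigned esr_ec(uint32_t esr)   { return (esr >> 26) & 0x3f; } /* exception class */
static inline unsigned esr_il(uint32_t esr)   { return (esr >> 25) & 0x1; }  /* 1 = 32-bit instruction */
static inline unsigned esr_iss(uint32_t esr)  { return esr & 0x1ffffff; }    /* syndrome bits [24:0] */
static inline unsigned esr_wnr(uint32_t esr)  { return (esr >> 6) & 0x1; }   /* 1 = write, 0 = read */
static inline unsigned esr_dfsc(uint32_t esr) { return esr & 0x3f; }         /* data fault status code */

/* Coarse classification of the DFSC; only the four level-encoded kinds are named. */
static inline const char *dfsc_kind(unsigned dfsc)
{
	switch (dfsc >> 2) {
	case 0x0: return "address size fault";
	case 0x1: return "translation fault";
	case 0x2: return "access flag fault";
	case 0x3: return "permission fault";
	default:  return "other";
	}
}

/* Lookup level, meaningful only for the four kinds named above. */
static inline unsigned dfsc_level(unsigned dfsc) { return dfsc & 0x3; }

/* Print the decoded fields of one ESR value. */
static inline void esr_print(uint32_t esr)
{
	unsigned dfsc = esr_dfsc(esr);

	printf("EC=%#x IL=%u ISS=%#x WnR=%u DFSC=%#x (%s, level %u)\n",
	       esr_ec(esr), esr_il(esr), esr_iss(esr), esr_wnr(esr),
	       dfsc, dfsc_kind(dfsc), dfsc_level(dfsc));
}
```

For ESR = 0x96000006 this gives EC 0x25 (data abort from the current EL),
WnR 0 (a read) and a level 2 translation fault, which agrees with the
pmd=0000000000000000 entry in the page table walk line of the oops.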