On Mon, 11 Jan 2021 at 19:01, Christian König <christian.koenig@xxxxxxx> wrote:
Can you also look at the first trace?
Changing the page table attributes while releasing memory might sleep.
So we can't use a spinlock here.
Thanks for the report, a patch to fix this is on the mailing list now.

Here is the same error message, "sleeping function called from invalid
context", and a lot of [amdgpu] code.
The -12 is just -ENOMEM. Looks like a memory leak to me, maybe caused by
the problem above, maybe something completely unrelated.
I will take a look.

That looks like a completely unrelated memory leak to me.
Probably best if you open up a bug report for this.

Yes, the monitor still turns off after applying the patch "make the pool
shrinker lock a mutex".
Anyway, the patch fixed the flood of "BUG: sleeping function called
from invalid context at mm/vmalloc.c:1756" messages, so the kernel
log became cleaner.
Now the issue with the monitor turning off looks like this in the logs:
DMA-API: cacheline tracking ENOMEM, dma-debug disabled
amdgpu 0000:0b:00.0: amdgpu: 000000006b791523 pin failed
[drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer with error -12
BUG: kernel NULL pointer dereference, address: 0000000000000060
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 20 PID: 3780 Comm: brave:cs0 Tainted: G W --------- --- 5.11.0-0.rc2.20210108gitf5e6c330254a.120.fc34.x86_64 #1
Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 2802 10/21/2020
RIP: 0010:ttm_tt_swapin+0x34/0x1b0 [ttm]
Code: 55 41 54 55 53 48 83 ec 10 48 8b 47 20 48 89 44 24 08 48 85 c0 0f 84 86 01 00 00 48 8b 44 24 08 49 89 fc 4c 8b a8 e0 01 00 00 <41> 8b 45 60 89 44 24 04 8b 47 0c 85 c0 0f 84 df 00 00 00 31 db 65
RSP: 0018:ffffa7400532b9c0 EFLAGS: 00010286
RAX: ffff978e2ae25800 RBX: ffff97910ec12058 RCX: ffff978e12caac70
RDX: 0000000080000010 RSI: 0000000000000000 RDI: ffff97912c3d99c0
RBP: ffff97912c3d99c0 R08: 0000000000000000 R09: 0000000070b3a000
R10: 0000000000000002 R11: 0000000000000000 R12: ffff97912c3d99c0
R13: 0000000000000000 R14: ffffa7400532ba90 R15: ffff978e182c6350
FS: 00007f070bb1b640(0000) GS:ffff979509200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000060 CR3: 00000001f0cd2000 CR4: 0000000000350ee0
Call Trace:
ttm_tt_populate+0xa9/0xe0 [ttm]
ttm_bo_handle_move_mem+0x142/0x180 [ttm]
ttm_bo_validate+0x12e/0x1c0 [ttm]
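As a side note on the "error -12" in the pin failure above: kernel return codes are negated errno values, so they can be decoded from userspace. This is only a minimal sketch in Python; the helper name decode_kernel_error is made up for illustration and is not part of any kernel or amdgpu tooling.

```python
import errno
import os


def decode_kernel_error(code: int) -> str:
    """Map a negative kernel-style return code to its errno name and message."""
    number = abs(code)
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# The value reported by "Failed to pin framebuffer with error -12".
print(decode_kernel_error(-12))
```

On Linux this reports ENOMEM, which matches the explanation that the pin failure is an out-of-memory condition.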
You said that I need to open a bug report. Do you mean on
https://bugzilla.kernel.org/ ?
I thought the mailing list was better, because bug reports on
bugzilla.kernel.org usually stay open for several years without
attention.