Re: [bug] 5.11-rc5 brought page allocation failure issue [ttm][amdgpu]

From: Christian König
Date: Sun Jan 31 2021 - 15:14:44 EST


On 31.01.21 at 02:03, David Rientjes wrote:
On Sat, 30 Jan 2021, David Rientjes wrote:

On Sun, 31 Jan 2021, Mikhail Gavrilov wrote:

The 5.11-rc5 (git 76c057c84d28) brought a new issue.
Now the kernel log is flooded with the message "page allocation failure".

Trace:
msedge:cs0: page allocation failure: order:10,
Order-10, wow!

ttm_pool_alloc() will start at order-10 and back off trying smaller orders
if necessary. This is a regression introduced in

commit bf9eee249ac2032521677dd74e31ede5429afbc0
Author: Christian König <christian.koenig@xxxxxxx>
Date: Wed Jan 13 14:02:04 2021 +0100

drm/ttm: stop using GFP_TRANSHUGE_LIGHT

Namely, it removed the __GFP_NOWARN that we otherwise require. I'll send
a patch in reply.
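
For readers following along, the back-off behaviour described above can be sketched in user space roughly as follows. This is only an illustration, not the actual ttm_pool_alloc() code: sketch_alloc_pages(), sketch_pool_alloc(), SKETCH_MAX_ORDER, SKETCH_NOWARN and the largest_ok parameter are hypothetical names made up for this sketch, and malloc() stands in for the page allocator. It shows why a missing no-warn flag floods the log even though the allocation eventually succeeds at a smaller order.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_MAX_ORDER 10    /* highest order tried first, as in the trace */
#define SKETCH_NOWARN    0x1u  /* hypothetical stand-in for __GFP_NOWARN */

/* Counts how many failure messages were emitted. */
static int warnings_printed;

/*
 * Hypothetical allocator: pretend that only orders <= largest_ok can be
 * satisfied. A failed attempt logs a message unless SKETCH_NOWARN is set.
 */
static void *sketch_alloc_pages(unsigned int flags, unsigned int order,
                                unsigned int largest_ok)
{
    if (order > largest_ok) {
        if (!(flags & SKETCH_NOWARN)) {
            fprintf(stderr, "page allocation failure: order:%u\n", order);
            warnings_printed++;
        }
        return NULL;
    }
    /* Assumes 4 KiB pages; 2^order pages per allocation. */
    return malloc((size_t)4096 << order);
}

/* Start at the highest order and back off to smaller orders on failure. */
static void *sketch_pool_alloc(unsigned int flags, unsigned int largest_ok,
                               unsigned int *got_order)
{
    assert(got_order != NULL);

    for (int order = SKETCH_MAX_ORDER; order >= 0; order--) {
        void *p = sketch_alloc_pages(flags, (unsigned int)order, largest_ok);
        if (p) {
            *got_order = (unsigned int)order;
            return p;
        }
    }
    return NULL;
}
```

Without SKETCH_NOWARN, every failed high-order attempt prints a message before the loop falls back to an order that works; with the flag set, the same fallback happens silently.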

Looks like Michel Dänzer <michel@xxxxxxxxxxx> already sent a patch that
should fix this:
https://lore.kernel.org/lkml/20210128095346.2421-1-michel@xxxxxxxxxxx/

Yeah, this is a known issue. I already pushed Michel's fix to drm-misc-fixes. It should land in the next -rc by the weekend.

Regards,
Christian.