Re: CVE-2024-50219: mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves

From: Vlastimil Babka
Date: Mon Nov 11 2024 - 09:23:12 EST


On 11/11/24 15:04, Greg Kroah-Hartman wrote:
> On Mon, Nov 11, 2024 at 11:40:49AM +0100, Vlastimil Babka wrote:
>> On 11/9/24 11:15, Greg Kroah-Hartman wrote:
>> > Description
>> > ===========
>> >
>> > In the Linux kernel, the following vulnerability has been resolved:
>> >
>> > mm/page_alloc: let GFP_ATOMIC order-0 allocs access highatomic reserves
>> >
>> > Under memory pressure it's possible for GFP_ATOMIC order-0 allocations to
>> > fail even though free pages are available in the highatomic reserves.
>> > GFP_ATOMIC allocations cannot trigger unreserve_highatomic_pageblock()
>> > since it's only run from reclaim.
>> >
>> > Given that such allocations will pass the watermarks in
>> > __zone_watermark_unusable_free(), it makes sense to fall back to highatomic
>> > reserves the same way that ALLOC_OOM can.
>> >
>> > This fixes order-0 page allocation failures observed on Cloudflare's fleet
>> > when handling network packets:
>>
>> Hi,
>>
>> I would like to dispute the CVE. GFP_ATOMIC page allocation failures can
>> generally happen (typically from the network receive path, like here) and should
>> always have a fallback. The impact could be somewhat worse performance at
>> worst. AFAIK they are not affected by panic_on_warn nor panic_on_oom either,
>> it's a pr_warn(), so I don't think there's a DoS vector.
>
> I read this as "there was a failure, with no fallback", but in looking
> at the traceback:
>
>> > kswapd1: page allocation failure: order:0, mode:0x820(GFP_ATOMIC),
>> > nodemask=(null),cpuset=/,mems_allowed=0-7
>> > CPU: 10 PID: 696 Comm: kswapd1 Kdump: loaded Tainted: G O 6.6.43-CUSTOM #1
>> > Hardware name: MACHINE
>> > Call Trace:
>> > <IRQ>
>> > dump_stack_lvl+0x3c/0x50
>> > warn_alloc+0x13a/0x1c0
>> > __alloc_pages_slowpath.constprop.0+0xc9d/0xd10
>> > __alloc_pages+0x327/0x340
>> > __napi_alloc_skb+0x16d/0x1f0
>
> This function DOES have a fallback if this failed, so it's ok here.
> Many other ATOMIC allocations in the kernel do not have fallbacks, which
> would cause a crash.

That would make them buggy and those should either use __GFP_NOFAIL or
handle the failure gracefully.
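
To illustrate "handle the failure gracefully", here is a minimal userspace
sketch, not kernel code. pool_alloc_atomic(), rx_one() and both structs are
hypothetical names made up for this example; the pool only models an
allocator that fails instead of waiting, the way a GFP_ATOMIC allocation can.
The receive path treats a failed allocation as a dropped packet and carries on:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a non-blocking allocator: once the budget is
 * used up it returns NULL instead of sleeping or reclaiming memory. */
struct atomic_pool {
	size_t free_bufs;	/* allocations still allowed to succeed */
};

/* Counters so that drops are visible instead of fatal. */
struct rx_stats {
	unsigned long delivered;
	unsigned long dropped;
};

/* Stand-in for an atomic allocation: may fail, never blocks. */
static void *pool_alloc_atomic(struct atomic_pool *pool, size_t size)
{
	assert(pool != NULL);
	if (pool->free_bufs == 0)
		return NULL;
	pool->free_bufs--;
	return malloc(size);
}

/*
 * Receive one packet of 'len' bytes.
 * Returns 0 if the packet was delivered, -1 if it was dropped because
 * the allocation failed. The caller keeps running in both cases.
 */
static int rx_one(struct atomic_pool *pool, struct rx_stats *stats, size_t len)
{
	void *buf = pool_alloc_atomic(pool, len);

	if (!buf) {
		/* Graceful fallback: count the drop and move on. */
		stats->dropped++;
		return -1;
	}

	/* A real receive path would fill and hand off the buffer here. */
	stats->delivered++;
	free(buf);	/* the model does not return budget to the pool */
	return 0;
}
```

With a pool whose free_bufs is 2, the first two rx_one() calls return 0 and
the third returns -1 with stats.dropped incremented, so the cost of the
failure is a lost packet rather than a crash.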

> Note, it is setting the NOWARN flag, so shouldn't this not be warning?

Hmm, it seems it does now, but AFAICS only since 6e9b01909a81 ("net: remove
gfp_mask from napi_alloc_skb()") (since v6.10?), while the report above was
from 6.6.
I wasn't aware they decided to silence the warnings. I thought the warning
was intentional, to make the admin aware they might need to increase the
atomic memory reserves for better performance (it's not trivial to tune
AFAIK, it depends on bursty arrivals of packets etc). Also, if there's
suboptimal behavior in the implementation like the one this commit fixed, it
won't be so easy to spot anymore. On the other hand, less need to explain the
occasional warning :)

> Anyway, you are right, I'll go reject this one, thanks for the review!

Thanks!
Vlastimil

>
> thanks,
>
> greg k-h