Re: [PATCH v2 06/23] mm: introduce BPF struct ops for OOM handling

From: Roman Gushchin

Date: Mon Jan 12 2026 - 12:20:25 EST

Matt Bobrowski <mattbobrowski@xxxxxxxxxx> writes:

> On Mon, Oct 27, 2025 at 04:17:09PM -0700, Roman Gushchin wrote:
>> Introduce a bpf struct ops for implementing custom OOM handling
>> policies.
>>
>> ...
>>
>> +#ifdef CONFIG_MEMCG
>> +	/* Find the nearest bpf_oom_ops traversing the cgroup tree upwards */
>> +	for (memcg = oc->memcg; memcg; memcg = parent_mem_cgroup(memcg)) {
>> +		bpf_oom_ops = READ_ONCE(memcg->bpf_oom);
>> +		if (!bpf_oom_ops)
>> +			continue;
>> +
>> +		/* Call BPF OOM handler */
>> +		ret = bpf_ops_handle_oom(bpf_oom_ops, memcg, oc);
>> +		if (ret && oc->bpf_memory_freed)
>> +			goto exit;
>
> I have a question about the semantics of oc->bpf_memory_freed.
>
> Currently, it seems this flag is used to indicate that a BPF OOM
> program has made forward progress by freeing some memory (i.e.,
> bpf_oom_kill_process()), but if it's not set, it falls back to the
> default in-kernel OOM killer.
>
> However, what if forward progress in some contexts means not freeing
> memory? For example, in some bespoke container environments, the
> policy might be to catch the OOM event and handle it gracefully by
> raising the memory.limit_in_bytes on the affected memcg. In this kind
> of resizing scenario, no memory would be freed, but the OOM event
> would effectively be resolved.

I'd say we need to introduce a special kfunc which increases the limit
and sets bpf_memory_freed. I think it's important to maintain the safety
guarantee, so that a faulty BPF program cannot leave the system
deadlocked on memory.
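
To illustrate the idea, here is a rough user-space model, not kernel
code. Every name in it (struct memcg_model, struct oom_control_model,
model_raise_limit(), model_handle_oom()) is hypothetical and only mirrors
the shape of the quoted hunk. It assumes the walk continues to the parent
when a handler does not report progress, and that the helper rejects a
request which does not actually raise the limit. The point is that the
progress flag is set only inside the trusted helper, so a handler that
merely returns success cannot suppress the fallback to the default OOM
killer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; these are not the kernel's definitions. */
struct memcg_model;

struct oom_control_model {
	struct memcg_model *memcg;	/* memcg where the OOM was raised */
	bool bpf_memory_freed;		/* set only by trusted helpers */
};

typedef int (*oom_handler_fn)(struct memcg_model *memcg,
			      struct oom_control_model *oc);

struct memcg_model {
	struct memcg_model *parent;
	unsigned long limit;
	unsigned long usage;
	oom_handler_fn handler;		/* stands in for memcg->bpf_oom */
};

/*
 * Hypothetical stand-in for the proposed kfunc: raise the limit and
 * record forward progress. Assumption: a request that does not actually
 * increase the limit is rejected and leaves the flag untouched.
 */
static int model_raise_limit(struct memcg_model *memcg,
			     struct oom_control_model *oc,
			     unsigned long new_limit)
{
	if (new_limit <= memcg->limit)
		return -1;

	memcg->limit = new_limit;
	oc->bpf_memory_freed = true;
	return 0;
}

/*
 * Walk the tree upwards, as in the quoted hunk. Returns true when a
 * handler resolved the OOM, false when the caller should fall back to
 * the default OOM killer.
 */
static bool model_handle_oom(struct oom_control_model *oc)
{
	struct memcg_model *memcg;

	assert(oc != NULL);

	for (memcg = oc->memcg; memcg; memcg = memcg->parent) {
		if (!memcg->handler)
			continue;

		/* A handler's word alone is not trusted; the flag must be set too. */
		if (memcg->handler(memcg, oc) && oc->bpf_memory_freed)
			return true;
	}

	return false;
}

/* Well-behaved policy: resolve the OOM by doubling the limit. */
static int handler_grow(struct memcg_model *memcg,
			struct oom_control_model *oc)
{
	return model_raise_limit(memcg, oc, memcg->limit * 2) == 0;
}

/* Faulty policy: claims success without doing anything. */
static int handler_faulty(struct memcg_model *memcg,
			  struct oom_control_model *oc)
{
	(void)memcg;
	(void)oc;
	return 1;
}
```

With handler_grow attached, model_handle_oom() reports the OOM as
resolved even though nothing was killed. With handler_faulty it returns
false, which is where the in-kernel killer would take over.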

Thanks!