Re: [PATCH net-next] net: Implement fault injection forcing skb reallocation

From: Pavel Begunkov
Date: Mon Oct 07 2024 - 14:00:12 EST


On 10/7/24 18:09, Breno Leitao wrote:
Hello Pavel,

On Mon, Oct 07, 2024 at 05:48:39PM +0100, Pavel Begunkov wrote:
On 10/7/24 17:20, Breno Leitao wrote:
On Sat, Oct 05, 2024 at 01:38:59PM +0900, Akinobu Mita wrote:
On Wed, Oct 2, 2024 at 20:37, Breno Leitao <leitao@xxxxxxxxxx> wrote:

Introduce a fault injection mechanism to force skb reallocation. The
primary goal is to catch bugs related to pointer invalidation after
potential skb reallocation.

The fault injection mechanism aims to identify scenarios where callers
retain pointers to various headers in the skb but fail to reload these
pointers after calling a function that may reallocate the data. This
type of bug can lead to memory corruption or crashes if the old,
now-invalid pointers are used.

By forcing reallocation through fault injection, we can stress-test code
paths and ensure proper pointer management after potential skb
reallocations.
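
To make the bug class above concrete, here is a minimal userspace sketch. It
is not the patch itself: toy_buf, toy_may_pull(), force_realloc and
realloc_count are hypothetical stand-ins for the skb, a "may pull" style
helper, and the fault-injection decision. The only point is that forcing the
data to move makes any pointer cached before the call stale, while reloading
from the buffer after the call stays correct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: a heap buffer plus its length. */
struct toy_buf {
	unsigned char *data;
	size_t len;
};

/* Hypothetical knob standing in for the fault-injection decision. */
static bool force_realloc;

/* Counts how many times the data was moved to a new allocation. */
static unsigned int realloc_count;

/* Allocate a zeroed toy buffer of the given length; returns 0 or -1. */
static int toy_init(struct toy_buf *b, size_t len)
{
	b->data = calloc(1, len);
	b->len = b->data ? len : 0;
	return b->data ? 0 : -1;
}

/*
 * Toy analogue of a "may pull" helper. Normally it is a no-op when enough
 * data is present. With force_realloc set it always copies the data into a
 * new allocation and frees the old one, so any pointer derived from the old
 * b->data is invalid afterwards.
 */
static int toy_may_pull(struct toy_buf *b, size_t need)
{
	unsigned char *fresh;

	assert(b && b->data);
	if (need > b->len)
		return -1;
	if (!force_realloc)
		return 0;

	fresh = malloc(b->len);
	if (!fresh)
		return -1;
	memcpy(fresh, b->data, b->len);
	free(b->data);
	b->data = fresh;
	realloc_count++;
	return 0;
}

/* Correct pattern: derive the header pointer only after the call. */
static int toy_read_byte(struct toy_buf *b, size_t off)
{
	if (toy_may_pull(b, off + 1))
		return -1;
	return b->data[off];	/* reloaded from b->data, never cached */
}
```

A caller that saved b->data + off before toy_may_pull() and dereferenced it
afterwards would read freed memory whenever force_realloc is set, which is
the kind of mistake the forced reallocation is meant to surface.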

Add a hook for fault injection in the following functions:

* pskb_trim_rcsum()
* pskb_may_pull_reason()
* pskb_trim()

Like the other fault injection mechanisms, protect it under a debug Kconfig
option called CONFIG_FAIL_SKB_FORCE_REALLOC.

This patch was *heavily* inspired by Jakub's proposal from:
https://lore.kernel.org/all/20240719174140.47a868e6@xxxxxxxxxx/

CC: Akinobu Mita <akinobu.mita@xxxxxxxxx>
Suggested-by: Jakub Kicinski <kuba@xxxxxxxxxx>
Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>

This new addition seems sensible. It might be more useful to have a filter
that allows you to specify things like protocol family.

I think it might make more sense to be network interface specific. For
instance, only fault inject in interface `ethx`.

Wasn't there some error injection infra that optionally allows running
bpf? That would cover the filtering problem. ALLOW_ERROR_INJECTION,
maybe?

Isn't ALLOW_ERROR_INJECTION focused on specifying which functions can
be faulted? I.e., you mark a function as eligible for fault injection?

In the case I have in mind, I want to specify the interface on which
errors would be injected. For instance, only inject errors on
interface eth1. In this case, I am not sure ALLOW_ERROR_INJECTION will
help.
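
A per-interface filter of the kind described above could look roughly like
the following userspace sketch. Everything in it is hypothetical and only
illustrates the idea: toy_dev and toy_skb stand in for the device and skb,
filter_devname stands in for whatever setting would carry the interface
name, and an empty filter is assumed to mean "inject everywhere".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_IFNAMSIZ 16

/* Hypothetical stand-ins for the network device and the skb. */
struct toy_dev {
	char name[TOY_IFNAMSIZ];
};

struct toy_skb {
	const struct toy_dev *dev;
};

/* Hypothetical filter setting; an empty string matches every device. */
static char filter_devname[TOY_IFNAMSIZ];

/* Store the interface name to filter on, always NUL-terminated. */
static void toy_set_filter(const char *name)
{
	strncpy(filter_devname, name, TOY_IFNAMSIZ - 1);
	filter_devname[TOY_IFNAMSIZ - 1] = '\0';
}

/*
 * Decide whether reallocation should be forced for this skb: always when no
 * filter is set, otherwise only when the skb has a device whose name equals
 * the configured one.
 */
static bool toy_should_force_realloc(const struct toy_skb *skb)
{
	assert(skb);
	if (filter_devname[0] == '\0')
		return true;
	if (!skb->dev)
		return false;
	return strcmp(skb->dev->name, filter_devname) == 0;
}
```

With the filter set to "eth1", an skb attached to eth0 or to no device at
all is left alone, and only eth1 traffic gets the forced reallocation.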

I've never looked into it and might be wrong, but I view
ALLOW_ERROR_INJECTION'ed functions as a yes/no (err code) switch on
steroids enabling debug code but not doing actual failing. E.g.

	if (should_fail_bio(bio)) {
		bio->bi_status = status;
		bio_endio(bio);
		return;
	}

Looking at your patch, in this case it'd be not failing a request but
pskb_expand_head(). Not exactly a perfect match as there are no "errors"
here, but if not usable directly maybe it's trivial to adapt.

That's assuming it supports bpf and lets it specify the result of
the function, from where bpf can dig into the skb argument and do
custom filtering.

--
Pavel Begunkov