Re: [PATCH net 6/6] net/ncsi: validate GP payload lengths before parsing

From: Jakub Kicinski

Date: Thu Apr 23 2026 - 15:13:41 EST

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net/ncsi: validate GP payload lengths before parsing

This commit bounds MAC and VLAN counts to software and GC-reported limits.
It adds validation to ensure the declared GP payload covers the fixed GP
prefix, consumed entries, and the checksum before parsing filter tables.

> diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> index 94354dca23eac..565d38fd4b92d 100644
> --- a/net/ncsi/ncsi-rsp.c
> +++ b/net/ncsi/ncsi-rsp.c
[ ... ]
> @@ -924,6 +926,14 @@ static int ncsi_rsp_handler_gp(struct ncsi_request *nr)
> 	if (rsp->mac_cnt > mac_nbits || rsp->vlan_cnt > ncvf->n_vids)
> 		return -ERANGE;

Does this code read rsp->mac_cnt and rsp->vlan_cnt before validating the
payload length?

If a malformed packet with a payload length of 0 is received,
ncsi_validate_rsp_pkt() guarantees only that the 20-byte common response
header has been pulled into the skb linear data area.

Since rsp->mac_cnt and rsp->vlan_cnt are at offsets 20 and 24 respectively,
could accessing them prior to the payload length check trigger an
out-of-bounds read?
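
To make the ordering concern concrete, here is a minimal userspace sketch of
a two-stage length check: the fixed prefix is validated before either count
is read, and the variable part is validated afterwards. The struct gp_model
layout, the function gp_check_len() and its parameters are hypothetical
stand-ins, not the kernel definitions; only the offsets 20 and 24 come from
the discussion above, and the remaining fields are an assumption about the
GP response layout.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6
#define RSP_HDR_LEN 20	/* common response header, as discussed above */

/* Hypothetical userspace mirror of the GP response; field order after
 * vlan_cnt is an assumption made for this sketch. */
struct gp_model {
	uint8_t  hdr[RSP_HDR_LEN];
	uint8_t  mac_cnt;		/* offset 20 */
	uint8_t  reserved[2];
	uint8_t  mac_enable;
	uint8_t  vlan_cnt;		/* offset 24 */
	uint8_t  reserved1;
	uint16_t vlan_enable;
	uint32_t link_mode;
	uint32_t bc_mode;
	uint32_t valid_modes;
	uint8_t  mac[ETH_ALEN];		/* first variable-length entry */
	uint16_t vlan;
	uint32_t checksum;
};

_Static_assert(offsetof(struct gp_model, mac_cnt) == 20, "mac_cnt offset");
_Static_assert(offsetof(struct gp_model, vlan_cnt) == 24, "vlan_cnt offset");

/* pkt:     start of the response, including the common header
 * avail:   bytes actually readable at pkt
 * payload: payload length declared by the packet
 * Returns 0 if both the declared and the available length cover the
 * fixed prefix, the counted entries and the checksum. */
int gp_check_len(const uint8_t *pkt, size_t avail, size_t payload)
{
	const size_t fixed = offsetof(struct gp_model, mac) - RSP_HDR_LEN;
	size_t needed;
	uint8_t mac_cnt, vlan_cnt;

	/* Stage 1: nothing past the header is read until the fixed
	 * prefix is known to be present. */
	if (payload < fixed || avail < RSP_HDR_LEN + fixed)
		return -EINVAL;

	mac_cnt = pkt[offsetof(struct gp_model, mac_cnt)];
	vlan_cnt = pkt[offsetof(struct gp_model, vlan_cnt)];

	/* Stage 2: the counts are now safe to use for the variable part. */
	needed = fixed;
	needed += (size_t)mac_cnt * ETH_ALEN;
	needed += (size_t)vlan_cnt * sizeof(uint16_t);
	needed += sizeof(uint32_t);	/* checksum */
	if (payload < needed || avail < RSP_HDR_LEN + needed)
		return -EINVAL;

	return 0;
}
```

In this model a header-only packet is rejected in stage 1, before mac_cnt or
vlan_cnt is touched.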

> +	payload = ncsi_rsp_payload(nr->rsp);
> +	needed = offsetof(struct ncsi_rsp_gp_pkt, mac) - sizeof(rsp->rsp);
> +	needed += mac_cnt * ETH_ALEN;
> +	needed += vlan_cnt * sizeof(__be16);
> +	needed += sizeof(rsp->checksum);
> +	if (payload < needed)
> +		return -EINVAL;

Are there missing timer cancellations during NCSI device unregistration
that could lead to a use-after-free regression?

ncsi_register_dev() initializes timers for 256 NCSI requests. If an NCSI
request is pending when the device is unregistered, and
ncsi_unregister_dev() calls kfree(ndp) without a prior timer_delete_sync(),
could the timer fire later and cause ncsi_request_timeout() to dereference
the freed memory?
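
As a rough illustration of the ordering being asked about, the toy model
below counts how many request timers are still armed at the moment the
device structure is freed. Everything here is hypothetical: struct
req_model, struct dev_model, model_timer_cancel() (a stand-in for
timer_delete_sync()) and model_unregister() are invented for this sketch
and do not reproduce the kernel code.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NREQ 256	/* number of request slots, as described above */

struct req_model {
	bool timer_armed;	/* true while a timeout is pending */
};

struct dev_model {
	struct req_model requests[NREQ];
};

/* Stand-in for timer_delete_sync(): after it returns, the timer can no
 * longer fire. */
void model_timer_cancel(struct req_model *r)
{
	r->timer_armed = false;
}

/* Frees the device and returns how many timers were still armed when
 * the memory was released. Any non-zero result corresponds to a timer
 * that could later run against freed memory. */
int model_unregister(struct dev_model *d, bool cancel_first)
{
	int armed_at_free = 0;
	size_t i;

	for (i = 0; i < NREQ; i++) {
		if (cancel_first)
			model_timer_cancel(&d->requests[i]);
		if (d->requests[i].timer_armed)
			armed_at_free++;
	}

	free(d);
	return armed_at_free;
}
```

With cancel_first set, the model always reports zero armed timers at free
time; without it, every pending request contributes one.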

Is there a synchronization issue between the workqueue teardown order and
device unregistration?

ncsi_unregister_dev() frees all packages and their channels via
ncsi_remove_package() before calling disable_work_sync(&ndp->work).

If the background task ncsi_dev_work() executes concurrently during this
teardown, could it attempt to suspend or configure the active channel
(e.g., in ncsi_suspend_channel()) and access ndp->active_package and
ndp->active_channel after they have already been freed?

Does the RCU synchronization in the package and channel removal paths properly
prevent use-after-free regressions during receive?

ncsi_remove_channel() and ncsi_remove_package() unlink elements from
RCU-protected lists (list_del_rcu(&nc->node)) and immediately call kfree()
without invoking synchronize_rcu() or using kfree_rcu().

Since the Rx path in ncsi_rcv_rsp() concurrently searches these lists using
list_for_each_entry_rcu(), could an incoming packet cause the Rx handler to
traverse and dereference recently freed package or channel memory?
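
The difference between freeing immediately and deferring the free can be
sketched with a single-threaded toy model. It is not RCU: the names
chan_model, list_model, model_remove_deferred() and
model_grace_period_end() are hypothetical, and the "grace period" is just
an explicit function call. It only shows the property that kfree_rcu() or
synchronize_rcu() would provide, namely that a pointer obtained before the
unlink stays valid until the grace period ends.

```c
#include <stddef.h>
#include <stdlib.h>

struct chan_model {
	int id;
	struct chan_model *next;	/* list linkage seen by readers */
	struct chan_model *pending;	/* linkage on the deferred-free list */
};

struct list_model {
	struct chan_model *head;
	struct chan_model *to_free;
};

struct chan_model *model_add(struct list_model *l, int id)
{
	struct chan_model *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->id = id;
	c->next = l->head;
	l->head = c;
	return c;
}

struct chan_model *model_find(const struct list_model *l, int id)
{
	struct chan_model *c;

	for (c = l->head; c; c = c->next)
		if (c->id == id)
			return c;
	return NULL;
}

/* Unlink now, free later. The victim keeps its next pointer so that a
 * reader already standing on it can still continue its traversal. */
void model_remove_deferred(struct list_model *l, struct chan_model *victim)
{
	struct chan_model **pp;

	for (pp = &l->head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->pending = l->to_free;
			l->to_free = victim;
			return;
		}
	}
}

/* Stand-in for the end of a grace period: no reader can still hold a
 * pointer to the unlinked entries, so they are freed here. Returns the
 * number of entries released. */
int model_grace_period_end(struct list_model *l)
{
	int freed = 0;

	while (l->to_free) {
		struct chan_model *c = l->to_free;

		l->to_free = c->pending;
		free(c);
		freed++;
	}
	return freed;
}
```

In this model, a lookup after removal no longer finds the entry, while a
pointer taken before removal can still be dereferenced until
model_grace_period_end() runs.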

Does ncsi_rsp_handler_gc() leak the MAC and VLAN filter arrays on repeated Get
Capabilities (GC) responses?

The handler unconditionally allocates memory for nc->mac_filter.addrs and
nc->vlan_filter.vids using kzalloc() and kcalloc(), but does not appear to
check for or free any previously allocated memory.

If a channel goes through a reset cycle via ncsi_reset_dev(), or if a GC
command is explicitly injected via Netlink, could the handler execute again
and overwrite the pointers, leaking the previous allocations?
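
One way to picture a handler that tolerates repeated responses is the
userspace sketch below, which releases the previous array before storing
the new one. The names filter_model, model_gc_handler(), n_entries and the
live_allocs counter are hypothetical and exist only to make the leak
observable; whether freeing in place is the right fix for the real handler
is an open question, not something this sketch settles.

```c
#include <errno.h>
#include <stdlib.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for a per-channel MAC filter table. */
struct filter_model {
	unsigned int n_entries;
	unsigned char *addrs;
};

/* Number of filter arrays currently allocated; more than one per
 * filter_model after repeated calls would indicate a leak. */
int live_allocs;

/* Models a capabilities handler that may run more than once for the
 * same channel. The new array is allocated first so that a failed
 * allocation leaves the old table untouched. */
int model_gc_handler(struct filter_model *f, unsigned int n_entries)
{
	unsigned char *addrs;

	addrs = calloc(n_entries ? n_entries : 1, ETH_ALEN);
	if (!addrs)
		return -ENOMEM;
	live_allocs++;

	/* Release the table from any earlier response before the
	 * pointer is overwritten. */
	if (f->addrs) {
		free(f->addrs);
		live_allocs--;
	}

	f->addrs = addrs;
	f->n_entries = n_entries;
	return 0;
}
```

Calling the handler twice on the same filter_model leaves exactly one live
allocation in this model; dropping the free() branch would leave two.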