Re: cpu_rmap maps CPUs to wrong interrupts after reprogramming affinities

From: Ben Hutchings
Date: Thu Dec 19 2024 - 17:25:41 EST


On Fri, 2024-12-13 at 10:18 -0800, Caleb Sander wrote:
> Hi netdev,
> While testing ARFS, we found set_rps_cpu() was calling
> ndo_rx_flow_steer() with an RX queue that was not affinitized to the
> desired CPU. The issue occurred only after modifying interrupt
> affinities. It looks to be a bug in cpu_rmap, where cpu_rmap_update()
> can leave CPUs mapped to interrupts which are no longer the most
> closely affinitized to them.
>
> Here is the simplest scenario:
> 1. A network device has 2 IRQs, 1 and 2. Initially only CPU A is
> available to process the network device. So both IRQs 1 and 2 are
> affinitized to CPU A.
[...]

This seems like a misconfiguration: there shouldn't be more RX queues
than CPUs to handle them. I probably never considered it when
implementing cpu_rmap. Still, I agree that this could happen as a
transitory state, and the reverse-map ought to become sensible once all
the RX queues are assigned to different CPUs.
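
To make the expectation concrete, here is a rough sketch in Python. It is a
toy model only, not the kernel's cpu_rmap code: rebuild_rmap() is a
hypothetical helper, distances are reduced to "in the affinity mask or not",
and the second affinity layout is an assumed end state rather than the elided
steps of the report. It shows what a map recomputed from scratch would look
like in the transitory state and once each IRQ has its own CPU.

```python
# Toy model only: NOT the kernel's cpu_rmap implementation.
# Distance is simplified to 0 (CPU is in the IRQ's affinity mask) or 1 (it is not).

def rebuild_rmap(cpus, irq_affinity):
    """Map each CPU to its closest IRQ, recomputed from scratch (hypothetical helper)."""
    rmap = {}
    for cpu in cpus:
        # Prefer an IRQ whose mask contains this CPU; break ties by lowest IRQ number.
        rmap[cpu] = min(
            irq_affinity,
            key=lambda irq: (cpu not in irq_affinity[irq], irq),
        )
    return rmap

cpus = ["A", "B"]

# Transitory state from the report: both IRQs are affinitized to CPU A.
transitory = rebuild_rmap(cpus, {1: {"A"}, 2: {"A"}})

# Assumed later state: each IRQ is affinitized to a different CPU.
settled = rebuild_rmap(cpus, {1: {"A"}, 2: {"B"}})

print(transitory)
print(settled)
```

In this model the settled map sends each CPU to the IRQ that is actually
affinitized to it, which is the "sensible" outcome I would expect the real
reverse-map to reach as well.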

But I haven't looked at that code for over a decade, so I'm probably
not the right person to address this now.

Ben.

--
Ben Hutchings
The world is coming to an end. Please log off.
