Re: [PATCH] net: enetc: fix sirq-storm by clearing IDR registers

From: Vladimir Oltean

Date: Thu Feb 26 2026 - 14:52:21 EST


On Thu, Feb 26, 2026 at 08:28:53PM +0100, Zefir Kurtisi wrote:
> Thank you for the feedback and clarifications.
>
> The reference manual does not state that the SITXIDR/SIRXIDR bits are
> directly linked to TBaIDR/RBaIDR. It says that the former are a summary of
> the latter, but not that W1C-ing a bit in SITXIDR also works in the other
> direction and clears TBaIDR. That clarifies quite a bit, thanks.
>
> As for your request to use the mainline branch: I depend on an OpenWRT
> build system, and switching Linux kernel versions is unfortunately not
> something I can do quickly. As for the potentially missing patch you
> pointed me to, I backported it, but it makes no difference.
>
> Luckily meanwhile I was able to narrow down the issue and can provide you a
> means to hopefully reproduce it. This is the tl;dr version:
>
> * enetc operates eth0
> * ath9k operates wlan0
> * both are bridged over OVS
> * device is AP with an active STA connected to it
> * STA regularly sends an L2 WNM keep-alive frame
> * that frame is 'buggy' as being tagged IPv4 but without payload
> * through the OVS bridge that frame makes it into eth0 TX path
> * enetc_start_xmit() enqueues it into TX-BD
> * HW processes that descriptor, sets IDR and issues interrupt
> * enetc_clean_tx_ring()
>   * gets a bds_to_clean=0 (tx_ring->tcir == tx_ring->next_to_clean)
>     * i.e. HW signals it completed the BD but did not advance TCIR
>   * skips the while() loop
>   * and hence never clears the corresponding SITXIDR bit
> * enetc_poll() after completion of ring processing
>   * re-enables interrupts
>   * but the one bit in SITXIDR is now sticky
>   * interrupt is re-asserted immediately
> * the affected core remains 100% SIRQing
> * it only recovers when the affected TX ring advances
>
> So in short, enetc breaks when sending 0-byte frames.
>
> The patch that I provided resolves the problem by force-cleaning all IDRs
> before interrupts are re-enabled. That is the sledge-hammer approach, since
> it also unmasks BDs that were just completed during
> execution of enetc_poll() or no_eof BDs. Hence it is not the final
> solution, but currently anything is better than a freezing box.
>
> Below is the tool I wrote to fire such a frame-of-death. If you can
> reproduce the observation, I'd prepare a v2 patch to recover once the
> issue happens. Preventing enetc_start_xmit() from sending such frames I'd
> leave to you, since handling that properly looks complex to me.
>
>
> Cheers,
> Zefir
>
> ---
>
> #include <stdio.h>
> #include <string.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/socket.h>
> #include <net/ethernet.h>
> #include <netpacket/packet.h>
> #include <net/if.h>
> #include <arpa/inet.h>
> #include <sys/ioctl.h>
>
> /*
>  * Enetc-Killer
>  *
>  * This is a PoC for fsl_enetc Ethernet driver to detect an
>  * issue the driver has when zero-payload IP packets are sent.
>  *
>  * It was detected when using an enetc Ethernet interface bridged
>  * with a wireless interface operating as AP. A connected client
>  * regularly sends L2 WNM keep-alive frames without IP payload.
>  * Through the bridge this 'buggy' packet makes it into the
>  * enetc TX path, which the driver enqueues for sending and
>  * the HW signals transmission done but without providing a
>  * completed TX-BD. This leads to a sticky interrupt detected
>  * flag causing a SIRQ-storm.
>  *
>  * This has been tested on a LS1028A based system under an
>  * OpenWRT derivative / linux 6.6.93
>  *
>  * To test:
>  *  * build and copy binary to device
>  *  * connect over serial, leave eth0 idle
>  *  * ensure device runs with multiple cores enabled (otherwise it freezes)
>  *  * run the program
>  *  * with top, observe that one core is fully loaded with SIRQ
>  *  * to recover, storm-ping eth0 from outside to
>  *    enforce TX-BD advance
>  */
>
> int main(void)
> {
> 	int sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
> 	if (sock < 0) {
> 		perror("socket");
> 		return 1;
> 	}
>
> 	struct ifreq ifr;
> 	memset(&ifr, 0, sizeof(ifr));
> 	// leave room for the terminating NUL
> 	strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);
> 	if (ioctl(sock, SIOCGIFINDEX, &ifr) < 0) {
> 		perror("ioctl");
> 		close(sock);
> 		return 1;
> 	}
>
> 	struct sockaddr_ll addr = { 0 };
> 	addr.sll_family = AF_PACKET;
> 	addr.sll_ifindex = ifr.ifr_ifindex;
> 	addr.sll_halen = ETH_ALEN;
> 	addr.sll_protocol = htons(ETH_P_IP);
> 	// Destination MAC (Broadcast)
> 	addr.sll_addr[0] = 0xff;
> 	addr.sll_addr[1] = 0xff;
> 	addr.sll_addr[2] = 0xff;
> 	addr.sll_addr[3] = 0xff;
> 	addr.sll_addr[4] = 0xff;
> 	addr.sll_addr[5] = 0xff;
>
> 	// "broken" packet: only Ethernet-header, no IP-payload
> 	// as sent by wpa_supplicant as L2 WNM keep-alive frame
> 	unsigned char buf[14] = {
> 		0xff, 0xff, 0xff, 0xff, 0xff, 0xff, // DST MAC
> 		0x00, 0x11, 0x22, 0x33, 0x44, 0x55, // SRC MAC
> 		0x08, 0x00                          // EtherType = IPv4
> 	};
>
> 	if (sendto(sock, buf, sizeof(buf), 0, (struct sockaddr*) &addr,
> 		   sizeof(addr)) < 0) {
> 		perror("sendto");
> 		close(sock);
> 		return 1;
> 	}
> 	close(sock);
> 	return 0;
> }
>
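
For reference, the sequence reported above can be restated as a small
user-space toy model. This is only a sketch under assumptions, not driver
code: every name in it (toy_tx_ring, toy_bds_to_clean, toy_clean_tx_ring,
toy_count_polls) is hypothetical, and the only property it takes from the
report is that the sticky "interrupt detected" bit is cleared inside the
cleaning loop and nowhere else.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not enetc symbols. */
struct toy_tx_ring {
	unsigned int size;          /* number of BDs in the ring */
	unsigned int next_to_clean; /* software clean index */
	unsigned int tcir;          /* consumer index reported by hardware */
	bool idr;                   /* sticky "interrupt detected" bit */
};

/* BDs the hardware reports as completed but software has not cleaned yet. */
static unsigned int toy_bds_to_clean(const struct toy_tx_ring *r)
{
	assert(r->size > 0);
	return (r->tcir + r->size - r->next_to_clean) % r->size;
}

/*
 * Assumption taken from the report: the IDR bit is written (W1C) only
 * while a BD is being processed, so an empty loop leaves it set.
 */
static unsigned int toy_clean_tx_ring(struct toy_tx_ring *r)
{
	unsigned int cleaned = 0;
	unsigned int todo = toy_bds_to_clean(r);

	while (todo > 0) {
		r->idr = false;
		r->next_to_clean = (r->next_to_clean + 1) % r->size;
		cleaned++;
		todo--;
	}
	return cleaned;
}

/*
 * Each iteration stands for one interrupt -> poll -> re-enable cycle.
 * Returns how many cycles ran before the bit cleared, capped at max_polls.
 */
static unsigned int toy_count_polls(struct toy_tx_ring *r,
				    unsigned int max_polls)
{
	unsigned int polls = 0;

	while (r->idr && polls < max_polls) {
		toy_clean_tx_ring(r);
		polls++;
	}
	return polls;
}
```

With tcir equal to next_to_clean and the bit set, toy_count_polls() runs
until it hits the cap, which corresponds to the interrupt being re-asserted
on every poll. Once tcir is one ahead, a single poll clears the bit.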

If I understand correctly, this patch should resolve your issue?