Re: [PATCH net 1/2] amd-xgbe: fix sleep while atomic on suspend/resume

From: Rangoju, Raju

Date: Thu Feb 26 2026 - 10:57:30 EST




On 2/26/2026 6:07 PM, Simon Horman wrote:


> On Wed, Feb 25, 2026 at 04:30:00PM +0530, Raju Rangoju wrote:
> > The xgbe_powerdown() and xgbe_powerup() functions use spinlocks
> > (spin_lock_irqsave) while calling functions that may sleep:
> > - napi_disable() can sleep waiting for NAPI polling to complete
> > - flush_workqueue() can sleep waiting for pending work items
> >
> > This causes a "BUG: scheduling while atomic" error during suspend/resume
> > cycles on systems using the AMD XGBE Ethernet controller.
> >
> > The spinlock protection in these functions is unnecessary because:
> > 1. The functions are called from suspend/resume paths which are already
> >    serialized by the PM core
> > 2. The caller parameter was used to differentiate contexts, but the
> >    only current usage is from the driver context (suspend/resume)
> > 3. The power_down flag provides sufficient synchronization
> >
> > Fix this by:
> > - Removing the spinlock from xgbe_powerdown() and xgbe_powerup()
> > - Simplifying the function signatures by removing the unused caller
> >   parameter
> > - Removing the unused XGMAC_DRIVER_CONTEXT and XGMAC_IOCTL_CONTEXT macros
> > - Reordering operations in xgbe_powerdown() to disable NAPI before
> >   stopping TX/RX (matching the order used in xgbe_stop())
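
For anyone following along, the failure mode described above can be
modelled in plain userspace C. This is only a rough sketch, not driver
code: every name below (model_spin_lock, model_may_sleep, model_pdata,
model_powerdown_locked, model_powerdown_unlocked) is hypothetical, the
"spinlock" is just a depth counter, and the -1 return value is an
arbitrary stand-in for an error code. It only illustrates why a call that
may sleep is a bug inside a spinlock-protected region and why dropping
the lock avoids it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: none of these are the real kernel APIs. */
static int atomic_depth;     /* > 0 while a modelled spinlock is held */
static int sleep_violations; /* sleeping calls made in atomic context */

static void model_spin_lock(void)   { atomic_depth++; }
static void model_spin_unlock(void) { atomic_depth--; }

/* Stand-in for a call that may sleep, such as napi_disable() or
 * flush_workqueue() in the real driver. */
static void model_may_sleep(const char *what)
{
	if (atomic_depth > 0) {
		sleep_violations++;
		printf("BUG: scheduling while atomic (%s)\n", what);
	}
}

/* Hypothetical, heavily reduced stand-in for the driver private data. */
struct model_pdata {
	bool power_down;
};

/* Shape of the problematic path: sleeping calls inside the locked region. */
static int model_powerdown_locked(struct model_pdata *p)
{
	if (p->power_down)
		return -1; /* already powered down */

	model_spin_lock();
	model_may_sleep("flush_workqueue");
	model_may_sleep("napi_disable");
	p->power_down = true;
	model_spin_unlock();
	return 0;
}

/* Shape of the fixed path: no lock held around the sleeping calls. The
 * model assumes the caller is already serialized, as the commit message
 * argues for the suspend/resume paths. */
static int model_powerdown_unlocked(struct model_pdata *p)
{
	if (p->power_down)
		return -1; /* already powered down */

	model_may_sleep("napi_disable");
	model_may_sleep("flush_workqueue");
	p->power_down = true;
	return 0;
}
```

Calling model_powerdown_locked() on a fresh model_pdata records two
violations, one per sleeping call, while model_powerdown_unlocked()
records none. A second call on an already powered-down instance is
rejected by the power_down check in both variants.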

> I don't think that all of these changes are necessary to fix the issue at hand.
> If so, please separate the fix(es) from other changes. And submit only
> the fixes to net - ideally one patch per fix if there is more than one
> discrete fix.
>
> OTOH, enhancements and clean-ups should be submitted to net-next.

Sure, I'll separate the fixes from the clean-ups and resubmit.
Since the clean-ups depend on the fixes, I'll let the fixes go in first.

> If there are dependencies on or conflicts with the fixes, then let
> them go into net first. net is merged into net-next each Thursday or Friday.
>
> > Fixes: c5aa9e3b8156 ("amd-xgbe: Initial AMD 10GbE platform driver")
> > Signed-off-by: Raju Rangoju <Raju.Rangoju@xxxxxxx>
>
> ...
>
> --
> pw-bot: changes-requested