Hi,
On Thu, Aug 13, 2020 at 3:09 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
This is how I've always been told that the API works and there are atSpecifically the problem we're trying to address is when an IRQ isirq_disable() has two operating modes:
marked as "disabled" (driver called disable_irq()) but also marked as
"wakeup" (driver called enable_irq_wake()). As per my understanding,
this means:
* Don't call the interrupt handler for this interrupt until I call
enable_irq() but keep tracking it (either in hardware or in software).
Specifically it's a requirement that if the interrupt fires one or
more times while masked the interrupt handler should be called as soon
as enable_irq() is called.
1) Immediately mask the interrupt at the irq chip level
2) Software disable it. If an interrupt is raised while disabled
then the flow handler observes disabled state, masks it, marks it
pending and returns without invoking any device handler.
On a subsequent irq_enable() the interrupt is unmasked if it was masked
and if the interrupt is marked pending and the interrupt is not level
type then it's attempted to retrigger it. Either in hardware or by a
software replay mechanism.
* If this interrupt fires while the system is suspended then pleaseWell, that's kinda contradicting itself. If the interrupt is masked then
wake the system up.
what is the point? I'm surely missing something subtle here.
least a handful of drivers in the kernel whose suspend routines both
enable wakeup and call disable_irq(). Isn't this also documented as
of commit f9f21cea3113 ("genirq: Clarify that irq wake state is
orthogonal to enable/disable")?
This sounds wonderful to me. Maulik: I think you could replace quiteOn some (many?) interrupt controllers a masked interrupt won't wakeSo far nobody told me about this until now, but why exactly do we need
the system up. Thus we need some point in time where the interrupt
controller can unmask interrupts in hardware so that they can act as
wakeups.
yet another unspecified callback instead of simply telling the core via
an irq chip flag that it should always unmask the interrupt if it is a
wakeup source?
Also: if an interrupt was masked lazily this could be a goodSetting IRQCHIP_MASK_ON_SUSPEND does exactly that. No need for a chip
time to ensure that these interrupts _won't_ wake the system up.
driver to do any magic. You just have to use it.
So the really obvious counterpart for this is to have:
IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND
and then do:
@@ -81,6 +81,8 @@ static bool suspend_device_irq(struct ir
* IRQD_WAKEUP_ARMED is visible before we return from
* suspend_device_irqs().
*/
+ if (chip->flags & IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND)
+ unmask_irq(desc);
return true;
}
plus the counterpart in the resume path. This also ensures that state is
consistent.
a few of the patches in the series and just use that.
The magic behind the back of the core code unmask brings core state andNot all of us have the big picture that you do to know how things
hardware state out of sync. So if for whatever reason the interrupt is
raised in the CPU before the resume path can mask it again, then the
flow handler will see disabled state, invoke mask_irq() which does
nothing because core state is masked and if that's a level irq it will
come back forever.
Thus the point of these callbacks is to provide a hook for IRQ chipsSee above.
to do this. Now that you understand the motivation perhaps you can
suggest a better way to accomplish this if the approach in this patch
is not OK.
I will note that a quick audit of existing users of the gernic-chip'sIf that's the main problem which is solved in these callbacks, then I
irq_suspend() show that they are doing exactly this. So the point of
my patch is to actually allow other IRQ chips (ones that aren't using
generic-chip) to do this type of thing. At the same time my patch
provides a way for current users of generic-chip to adapt their
routines so they work without syscore (which, I guess, isn't
compatible with s2idle).
really have to ask why this has not been raised years ago. Why can't
people talk?
ought to work, I guess. If nothing else someone looking at this
problem would think: "this must be a common problem, let's go see how
all the other places do it" and then they find how everyone else is
doing it and do it that way. It requires the grander picture that a
maintainer has in order to say: whoa, everyone's copying the same
hack--let's come up with a better solution.
IIRC back then when the callbacks for GC were added the reason was thatFunny to get yelled at for not providing a detailed enough changelog.
the affected chips needed a way to save and restore the full chip state
because the hardware lost it during suspend. S2idle did not exist back
then at least not in it's current form. Oh well...
But gust replacing them by something which is yet another sinkhole for
horrible hacks behind the core code is not making it any better.
I fear another sweep through the unpleasantries of chip drivers is due
sooner than later. Aside of finding time, I need to find my eyecancer
protection glasses and check my schnaps stock.
For what you are trying to achieve, no. IRQCHIP_MASK_ON_SUSPEND isSo what happens in this case:Ah, so I guess we need to move the call to suspend_one_irq() till
CPU0 CPU1
interrupt suspend_device_irq()
handle() chip->suspend_one()
action() ...
chip->fiddle();
????
after the (potential) call to synchronize_irq() in in
suspend_device_irqs()?
already safe.
If we add IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND then there is no sync
problem either.
Hopefully with the above explanation this makes more sense?At least the explanation helped to understand the problem, while the
changelog was pretty useless in that regard:
"These two callbacks are interesting because sometimes an irq chip
needs to know about suspend/resume."
Really valuable and precise technical information.
Usually people complain that my changelogs are too detailed. Sigh.
But aside of the confusion, even with your explanation of what you areClearly I am not coherent. ;-) My only goal was to help enable
trying to solve, I really want a coherent explanation why this should be
done for any of those:
1) an interrupt which has no action, i.e. an interrupt which has no
active users and is in the worst case completely deactivated or was
never activated to begin with.
In the inactive case it might be in a state where unmask issues an
invalid vector, causes hardware malfunction or hits undefined
software state in the chip drivers in the hierarchy.
If you want to be woken up by irq X, then request irq X which
ensures that irq X is in a usable state at all levels of the
stack. If you call disable_irq() or mark the interrupt with
IRQ_NOAUTOEN, fine, it's still consistent state.
2) interrupts which have no_suspend_depth > 0 which means that
there is an action requested which explicitely says: don't touch me
on suspend.
If that driver invokes disable_irq() then it can keep the pieces.
3) chained interrupts
They are never disabled and never masked. So why would anything
need to be done here?
Side note: they should not exist at all, but that's a different
story.
If you don't have coherent explanations, then please just don't touch
that condition at all.
Hint: "Sometimes a chip needs to know" does not qualify :)
interrupts that were disabled / marked as wakeup (as per above,
documented to be OK) to work on Qualcomm chips. This specifically
affects me because a driver that I need to work (cros_ec) does this.
If IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND is good to add then it sounds like
a great plan to me.
-Doug