Re: [PATCH v4 09/10] drivers: qcom: rpmh-rsc: Caller handles tcs_invalidate() exclusivity
From: Stephen Boyd
Date: Mon Apr 13 2020 - 19:03:12 EST
Quoting Douglas Anderson (2020-04-13 10:04:14)
> Auditing tcs_invalidate() made me worried. Specifically I saw that it
> used spin_lock(), not spin_lock_irqsave(). That always worries me
> unless I can trace for sure that I'm in the interrupt handler or that
> someone else already disabled interrupts.
>
> Looking more at it, there is actually no reason for these locks
> anyway. Specifically the only reason you'd ever call
> rpmh_rsc_invalidate() is if you cared that the sleep/wake TCSes were
> empty. That means that they need to continue to be empty even after
> rpmh_rsc_invalidate() returns. The only way that can happen is if the
> caller already has done something to keep all other RPMH users out.
> It should be noted that even though the caller is only worried about
> making sleep/wake TCSes empty, they also need to worry about stopping
> active-only transfers if they need to handle the case where
> active-only transfers might borrow the wake TCS.
>
> At the moment rpmh_rsc_invalidate() is only called in PM code from the
> last CPU. If that later changes the caller will still need to solve
> the above problems themselves, so these locks will never be useful.
>
> Continuing to audit tcs_invalidate(), I found a bug. The function
> didn't properly check for a borrowed TCS if we hadn't recently written
> anything into the TCS. Specifically, if we've never written to the
> WAKE_TCS (or we've flushed it recently) then tcs->slots is empty.
> We'll early-out and we'll never call tcs_is_free().
>
> I thought about fixing this bug by either deleting the early check for
> bitmap_empty() or possibly only doing it if we knew we weren't on a
> TCS that could be borrowed. However, I think it's better to just
> delete the checks.
>
> As argued above it's up to the caller to make sure that all other
> users of RPMH are quiet before tcs_invalidate() is called. Since
> callers need to handle the zero-active-TCS case anyway that means they
> need to make sure that the active-only transfers are quiet before
> calling too. The one way tcs_invalidate() gets called today is
> through rpmh_rsc_cpu_pm_callback() which calls
> rpmh_rsc_ctrlr_is_busy() to handle this. When we have another path to
> get to tcs_invalidate() it will also need to come up with something
> similar and it won't need this extra check either. If we later find
> some code path that actually needs this check back in (and somehow
> manages to be race free) we can always add it back in.
>
> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> Reviewed-by: Maulik Shah <mkshah@xxxxxxxxxxxxxx>
> Tested-by: Maulik Shah <mkshah@xxxxxxxxxxxxxx>
> ---
Reviewed-by: Stephen Boyd <swboyd@xxxxxxxxxxxx>