Introduce a new clock gating lock to serialize access to some of the
clock gating members instead of the host_lock.
While at it, simplify the code with the guard() macro and friends for
automatic cleanup of the new lock. Some explicit
spin_lock_irqsave()/spin_unlock_irqrestore() sequences that snake
through the code are left behind because I couldn't make heads or tails
of them.
Additionally, move the trace_ufshcd_clk_gating() call out of the
region protected by the lock as it doesn't need protection.
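
For readers unfamiliar with scope-based locking, the pattern can be
approximated in userspace as below. This is only a sketch, not the
driver code: it swaps the spinlock for a pthread mutex, emulates
guard() with the GCC/Clang cleanup attribute, and every name in it
(clk_gating_sketch, SKETCH_GUARD, sketch_hold, sketch_release) is
hypothetical. It shows the lock being dropped at scope exit, including
on an early return, and the trace-like call made after the lock is
released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for the clock gating state; not the real ufshcd struct. */
struct clk_gating_sketch {
	pthread_mutex_t lock;	/* plays the role of the new clock gating lock */
	int active_reqs;
	int state;		/* 0 = off, 1 = on (illustrative values only) */
};

/* Cleanup handler: runs when the guard variable goes out of scope. */
static void sketch_unlock(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Userspace approximation of guard(): lock now, unlock at end of scope. */
#define SKETCH_GUARD(m) \
	pthread_mutex_t *sketch_guard_ \
		__attribute__((cleanup(sketch_unlock), unused)) = \
		(pthread_mutex_lock(m), (m))

static int sketch_hold(struct clk_gating_sketch *g)
{
	int state, reqs;

	{
		SKETCH_GUARD(&g->lock);
		reqs = ++g->active_reqs;
		g->state = 1;
		state = g->state;
	}	/* lock released here by the cleanup handler */

	/* Stand-in for the trace call: it runs outside the locked region. */
	printf("clk_gating state=%d\n", state);
	return reqs;
}

static int sketch_release(struct clk_gating_sketch *g)
{
	SKETCH_GUARD(&g->lock);

	assert(g->active_reqs > 0);
	if (--g->active_reqs > 0)
		return g->active_reqs;	/* early return still drops the lock */

	g->state = 0;
	return 0;
}
```

The inner block in sketch_hold() bounds the critical section, so no
explicit unlock call or goto-based unwind path is needed.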