Re: [PATCH PREEMPT_RT] kcov: fix locking splat from kcov_remote_start()
From: Sebastian Andrzej Siewior
Date: Wed Aug 11 2021 - 05:00:39 EST
On 2021-08-10 22:38:30 [+0200], Thomas Gleixner wrote:
> On Tue, Aug 10 2021 at 11:50, Sebastian Andrzej Siewior wrote:
> > On 2021-08-09 15:59:09 [-0500], Clark Williams wrote:
> >> Saw the following splat on 5.14-rc4-rt5 with:
> > …
> >> Change kcov_remote_lock from regular spinlock_t to raw_spinlock_t so that
> >> we don't get "sleeping function called from invalid context" on PREEMPT_RT kernel.
> >
> > I'm not entirely happy with that:
> > - kcov_remote_start() decouples spin_lock_irq() and does local_irq_save()
> > + spin_lock() which shouldn't be done as per
> > Documentation/locking/locktypes.rst
> > I would prefer to see the local_irq_save() replaced by
> > local_lock_irqsave() so we get a context on what is going on.
>
> Which does not make it raw unless we create a raw_local_lock.
But why raw? I was thinking about local_lock_irqsave() instead of
local_irq_save() and keeping the spinlock_t.
> > - kcov_remote_reset() has a kfree() with that irq-off lock acquired.
>
> That free needs to move out obviously
>
> > - kcov_remote_add() has a kmalloc() and is invoked with that irq-off
> > lock acquired.
>
> So does the kmalloc.
>
> > - kcov_remote_area_put() uses INIT_LIST_HEAD() for no reason (just
> > happened to notice).
> >
> > - kcov_remote_stop() does local_irq_save() + spin_lock(&kcov->lock);.
> > This should also create a splat.
> >
> > - With kcov_remote_lock acquired there is a possible
> > hash_for_each_safe() and list_for_each() iteration. I don't know what
> > the limits are here but with a raw_spinlock_t it will contribute to
> > the maximum latency.
>
> And that matters because? kcov has a massive overhead and with that
> enabled you care as much about latencies as you do when running with
> lockdep enabled.
I wasn't aware of that. However, with the local_irq_save() ->
local_lock_irqsave() swap applied and the first C example from
Documentation/dev-tools/kcov.rst running, I don't see any spike in
cyclictest's results. Maybe I'm not using it right…
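
On the kfree()/kmalloc() points above: the usual shape is to allocate
before taking the lock, and to detach under the lock but free only after
dropping it. A minimal userspace sketch of that pattern follows, with a
pthread mutex standing in for the kernel lock. struct node, demo_add() and
demo_reset() are made-up names for illustration, not the kcov code:

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for an entry on a lock-protected list. */
struct node {
	struct node *next;
	int handle;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct node *list_head;

/* Allocate before taking the lock; only the list update is locked. */
static int demo_add(int handle)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->handle = handle;

	pthread_mutex_lock(&list_lock);
	n->next = list_head;
	list_head = n;
	pthread_mutex_unlock(&list_lock);
	return 0;
}

/*
 * Detach the whole list under the lock, then free the nodes after
 * the lock has been dropped. Returns the number of nodes freed.
 */
static int demo_reset(void)
{
	struct node *n, *next;
	int freed = 0;

	pthread_mutex_lock(&list_lock);
	n = list_head;
	list_head = NULL;
	pthread_mutex_unlock(&list_lock);

	for (; n; n = next) {
		next = n->next;
		free(n);
		freed++;
	}
	return freed;
}
```

Whether kcov_remote_add() can allocate up front like this depends on what
it has to check under the lock first, which the sketch does not model.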
> Thanks,
>
> tglx
Sebastian