Re: [PATCH] clocksource: sh_tmu: Set cpu_possible_mask to fix SMP broadcast

From: Laurent Pinchart
Date: Tue Dec 16 2014 - 19:44:55 EST


Hi Daniel,

On Tuesday 16 December 2014 12:54:56 Daniel Lezcano wrote:
> On 12/16/2014 12:46 PM, Magnus Damm wrote:
> > On Tue, Dec 16, 2014 at 8:20 PM, Laurent Pinchart wrote:
> >> On Tuesday 16 December 2014 12:14:40 Daniel Lezcano wrote:
> >>> On 12/16/2014 10:48 AM, Magnus Damm wrote:
> >>>> From: Magnus Damm <damm+renesas@xxxxxxxxxxxxx>
> >>>>
> >>>> Update the TMU driver to use cpu_possible_mask as cpumask to make
> >>>> r8a7779 SMP work as expected with or without the ARM TWD timer.
> >>>>
> >>>> Signed-off-by: Magnus Damm <damm+renesas@xxxxxxxxxxxxx>
> >>>
> >>> Applied as a 3.18 fix.
> >>
> >> You're a bit too fast, I haven't had time to review the patch yet.
> >>
> >>> ps: May I suggest using the CLOCK_EVT_FEAT_DYNIRQ flag for this driver?
> >>>
> >>>> ---
> >>>> drivers/clocksource/sh_tmu.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> --- 0001/drivers/clocksource/sh_tmu.c
> >>>> +++ work/drivers/clocksource/sh_tmu.c 2014-12-16 17:49:49.000000000 +0900
> >>>> @@ -428,7 +428,7 @@ static void sh_tmu_register_clockevent(s
> >>>> ced->features = CLOCK_EVT_FEAT_PERIODIC;
> >>>> ced->features |= CLOCK_EVT_FEAT_ONESHOT;
> >>>> ced->rating = 200;
> >>>> - ced->cpumask = cpumask_of(0);
> >>>> + ced->cpumask = cpu_possible_mask;
> >>
> >> Magnus, how thoroughly have you tested this ? The TMU is indeed usable by
> >> all CPUs, so setting the CPU mask to cpu_possible_mask makes sense, but
> >> last time I tried that it broke the broadcast timer due to the
> >> heuristics used by the clock events core code.
> >
> > Uhm, so I've tested this particular patch on r8a7779 but I do agree
> > that the TMU is used on a bunch of SoCs if that's what you mean. I
> > don't see how it is different from any other of our timers though, and
> > those got fixed like this earlier.
> >
> > I wonder if you may recall an earlier issue with incorrect clock event
> > priorities and code somehow working-by-accident without the mask set
> > as expected?
>
> Could it have been fixed by this commit?
>
> commit 70e5975d3a04be5479a28eec4a2fb10f98ad2785
> Author: Stephen Boyd <sboyd@xxxxxxxxxxxxxx>
> Date: Thu Jun 13 11:39:50 2013 -0700
>
> clockevents: Prefer CPU local devices over global devices
>
> On an SMP system with only one global clockevent and a dummy
> clockevent per CPU we run into problems. We want the dummy
> clockevents to be registered as the per CPU tick devices, but
> we can only achieve that if we register the dummy clockevents
> before the global clockevent or if we artificially inflate the
> rating of the dummy clockevents to be higher than the rating
> of the global clockevent. Failure to do so leads to boot
> hangs when the dummy timers are registered on all other CPUs
> besides the CPU that accepted the global clockevent as its tick
> device and there is no broadcast timer to poke the dummy
> devices.
>
> If we're registering multiple clockevents and one clockevent is
> global and the other is local to a particular CPU we should
> choose to use the local clockevent regardless of the rating of
> the device. This way, if the clockevent is a dummy it will take
> the tick device duty as long as there isn't a higher rated tick
> device and any global clockevent will be bumped out into
> broadcast mode, fixing the problem described above.

I think I had that patch in my tree when I noticed the breakage, since my own
tmu patch dates from February 2014. I'll retest it though and will let you
know.
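
For reference, the preference rule that commit message describes can be
sketched roughly as below. This is a simplified userspace illustration, not
the kernel's actual code: struct evt_dev, is_cpu_local() and prefer_new() are
hypothetical names, and the cpumask is reduced to a plain bitmask.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct clock_event_device. */
struct evt_dev {
	const char *name;
	int rating;
	unsigned int cpumask;	/* bit n set: device can serve CPU n */
};

/* True if the device serves only the given CPU. */
static bool is_cpu_local(const struct evt_dev *dev, int cpu)
{
	return dev->cpumask == (1u << cpu);
}

/*
 * Decide whether newdev should replace curdev as the tick device of cpu.
 * A CPU-local device wins over a global one regardless of rating;
 * the rating only decides between two devices of the same kind.
 */
static bool prefer_new(const struct evt_dev *curdev,
		       const struct evt_dev *newdev, int cpu)
{
	if (!(newdev->cpumask & (1u << cpu)))
		return false;	/* newdev cannot serve this CPU at all */
	if (!curdev)
		return true;	/* nothing installed yet */
	if (is_cpu_local(newdev, cpu) != is_cpu_local(curdev, cpu))
		return is_cpu_local(newdev, cpu);
	return newdev->rating > curdev->rating;
}
```

Under this rule, a device registered with a mask covering all CPUs counts as
global, so a lower-rated per-CPU dummy device still takes the tick device role
and the global device is left for broadcast duty.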

> >> Could you please confirm that you've tested both CONFIG_PREEMPT_NONE and
> >> CONFIG_PREEMPT with and without the ARM TWD timer, and that you've booted
> >> to userspace and tested timer broadcast on all CPUs ?
> >
> > No I have not. I've booted to user space in initramfs with DT-based
> > TWD on Multiplatform for r8a7779. Without this fix (and other r8a7779
> > TWD bits) I see a lot of breakage. For instance, TWD and SMP boot is
> > broken on r8a7779 - both legacy and non-legacy. I have not gotten to
> > sh73a0 yet, but I assume it is busted too.
> >
> > Can you please explain to me how the TMU is any different compared to
> > the CMT, MTU2 or STI? =)
> >
> > And no, I don't have any r8a7740 board anymore. Can anyone else test?

--
Regards,

Laurent Pinchart
