Re: [PATCH] irq_bcm2836: Send event when onlining sleeping cores

From: Marc Zyngier
Date: Wed May 10 2017 - 14:23:26 EST


On Wed, May 10 2017 at 7:02:44 pm BST, Eric Anholt <eric@xxxxxxxxxx> wrote:
> Florian Fainelli <f.fainelli@xxxxxxxxx> writes:
>
>> On 05/10/2017 03:31 AM, Phil Elwell wrote:
>>> On 10/05/2017 11:09, Marc Zyngier wrote:
>>>> On 10/05/17 10:05, Phil Elwell wrote:
>>>>> On 10/05/2017 09:55, Marc Zyngier wrote:
>>>>>> On Wed, May 10 2017 at 9:27:10 am BST, Phil Elwell
>>>>>> <phil@xxxxxxxxxxxxxxx> wrote:
>>>>>>> On 10/05/2017 08:42, Marc Zyngier wrote:
>>>>>>>> On 09/05/17 20:02, Phil Elwell wrote:
>>>>>>>>> On 09/05/2017 19:53, Marc Zyngier wrote:
>>>>>>>>>> On 09/05/17 19:52, Phil Elwell wrote:
>>>>>>>>>>> On 09/05/2017 19:14, Marc Zyngier wrote:
>>>>>>>>>>>> On 09/05/17 19:08, Eric Anholt wrote:
>>>>>>>>>>>>> Marc Zyngier <marc.zyngier@xxxxxxx> writes:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 09/05/17 17:59, Eric Anholt wrote:
>>>>>>>>>>>>>>> Phil Elwell <phil@xxxxxxxxxxxxxxx> writes:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In order to reduce power consumption and bus traffic, it is sensible
>>>>>>>>>>>>>>>> for secondary cores to enter a low-power idle state when waiting to
>>>>>>>>>>>>>>>> be started. The wfe instruction causes a core to wait until an event
>>>>>>>>>>>>>>>> or interrupt arrives before continuing to the next instruction.
>>>>>>>>>>>>>>>> The sev instruction sends a wakeup event to the other cores, so call
>>>>>>>>>>>>>>>> it from bcm2836_smp_boot_secondary, the function that wakes up the
>>>>>>>>>>>>>>>> waiting cores during booting.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It is harmless to use this patch without the corresponding change
>>>>>>>>>>>>>>>> adding wfe to the ARMv7/ARMv8-32 stubs, but if the stubs are updated
>>>>>>>>>>>>>>>> and this patch is not applied then the other cores will sleep forever.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> See: https://github.com/raspberrypi/linux/issues/1989
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Signed-off-by: Phil Elwell <phil@xxxxxxxxxxxxxxx>
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>> drivers/irqchip/irq-bcm2836.c | 3 +++
>>>>>>>>>>>>>>>> 1 file changed, 3 insertions(+)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> diff --git a/drivers/irqchip/irq-bcm2836.c b/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>>>>>>> index e10597c..6dccdf9 100644
>>>>>>>>>>>>>>>> --- a/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>>>>>>> +++ b/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>>>>>>> @@ -248,6 +248,9 @@ static int __init bcm2836_smp_boot_secondary(unsigned int cpu,
>>>>>>>>>>>>>>>> writel(secondary_startup_phys,
>>>>>>>>>>>>>>>> intc.base + LOCAL_MAILBOX3_SET0 + 16 * cpu);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> + dsb(sy); /* Ensure write has completed before waking the other CPUs */
>>>>>>>>>>>>>>>> + sev();
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> return 0;
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is also the behavior that the standard arm64 spin-table
>>>>>>>>>>>>>>> method has,
>>>>>>>>>>>>>>> which we unfortunately can't quite use.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And why is that so? Why do you have to reinvent the
>>>>>>>>>>>>>> wheel (and hide the
>>>>>>>>>>>>>> cloned wheel in an interrupt controller driver)?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That doesn't seem right to me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The armv8 stubs (firmware-supplied code in the low page that do the
>>>>>>>>>>>>> spinning) do actually implement arm64's spin-table
>>>>>>>>>>>>> method. It's the
>>>>>>>>>>>>> armv7 stubs that use these registers in the irqchip
>>>>>>>>>>>>> instead of plain
>>>>>>>>>>>>> addresses in system memory.
>>>>>>>>>>>>
>>>>>>>>>>>> Let's put ARMv7 aside for the time being. If your firmware already
>>>>>>>>>>>> implements spin-tables, why don't you simply use that at
>>>>>>>>>>>> least on arm64?
>>>>>>>>>>>
>>>>>>>>>>> We do.
>>>>>>>>>>
>>>>>>>>>> Obviously not the way it is intended if you have to duplicate the core
>>>>>>>>>> architectural code in the interrupt controller driver, which couldn't
>>>>>>>>>> care less.
>>>>>>>>>
>>>>>>>>> If we were using this method on arm64 then the other cores
>>>>>>>>> would not start up
>>>>>>>>> because armstub8.S has always included a wfe. Nothing in the
>>>>>>>>> commit mentions
>>>>>>>>> arm64 - this is an ARCH=arm fix.
>>>>>>>>
>>>>>>>> Thanks for the clarification, which you could have added to the commit
>>>>>>>> message.
>>>>>>>>
>>>>>>>> The question still remains: why do we have CPU bring-up code in an
>>>>>>>> interrupt controller, instead of having it in the architecture code?
>>>>>>>>
>>>>>>>> The RPi-2 is the *only* platform to have its SMP bringup code outside of
>>>>>>>> arch/arm, so the first course of action would be to move that code where
>>>>>>>> it belongs.
>>>>>>>
>>>>>>> You were CC'd on the commit
>>>>>>> (41f4988cc287e5f836d3f6620c9f900bc9b560e9) that
>>>>>>> introduced bcm2836_smp_boot_secondary - it seems strange to
>>>>>>> start objecting
>>>>>>> now.
>>>>>>
>>>>>> Well, I'm far from being perfect. If I had noticed it, I'd have NACKed
>>>>>> it.
>>>>>>
>>>>>>> Yes, I think it is odd that it didn't go into arch/arm/mach-bcm, but in
>>>>>>> the interests of making changes in small, independent steps, do
>>>>>>> you have a
>>>>>>> problem with this commit?
>>>>>>
>>>>>> On its own, no. I'm just not keen on adding more unrelated stuff to this
>>>>>> file, so let's start with dealing with the original bug, and you can
>>>>>> then add this fix on top.
>>>>>
>>>>> That's an interesting use of the word "bug". From Wikipedia:
>>>>>
>>>>> "A software bug is an error, flaw, failure or fault in a computer
>>>>> program or
>>>>> system that causes it to produce an incorrect or unexpected result, or to
>>>>> behave in unintended ways."
>>>>
>>>> Whatever. Should I call it "pile of crap dumped in unsuitable locations"
>>>> instead? What does Wikipedia says about it?
>>>>
>>>>> Although your concerns are valid, the faults you are objecting to
>>>>> are not causing
>>>>> a malfunction of any kind. If we were to update the RPi firmware
>>>>> before this
>>>>> patch was merged then upstream users would be left with one wheel
>>>>> on their wagon.
>>>>
>>>> And that'd be your problem, not mine. Look, you can argue around this
>>>> all day, or you can fix this mess. Your choice.
>>>
>>> Is that the opinion of all here?
>>
>> The choice of word here got largely out of the original topic and I
>> surely did eat a ton of popcorn here. There are two things that need
>> fixing, and the time line and process for fixing these is clear:
>>
>> - your bugfix (Phil) is something that should be applied now, and
>> backported to -stable trees once the fix hits the irqchip tree (or Linus')
>>
>> - relocating the code that does the secondary boot out of
>> drivers/irqchip/ into arch/arm/mach-bcm/ needs to happen (Marc), and
>> this is 4.13 material, there is no urgency in doing this *right now*,
>> but it needs to happen
>>
>> Does that work for everyone?
>
> Agreed. This patch, which we'll want to go to -stable, should clearly
> go in first. Marc's patch can go in after, since it's not a -stable
> candidate.
>
> Thomas, could you add the cc to stable when picking this patch?

So, for the record, I'm clearly NAKing this patch. Thomas or Jason can
pick it if they want to, I definitely won't.

M.
--
Jazz is not dead, it just smell funny.