Re: [PATCH] irq_bcm2836: Send event when onlining sleeping cores

From: Phil Elwell
Date: Wed May 10 2017 - 05:05:35 EST


On 10/05/2017 09:55, Marc Zyngier wrote:
> On Wed, May 10 2017 at 9:27:10 am BST, Phil Elwell <phil@xxxxxxxxxxxxxxx> wrote:
>> On 10/05/2017 08:42, Marc Zyngier wrote:
>>> On 09/05/17 20:02, Phil Elwell wrote:
>>>> On 09/05/2017 19:53, Marc Zyngier wrote:
>>>>> On 09/05/17 19:52, Phil Elwell wrote:
>>>>>> On 09/05/2017 19:14, Marc Zyngier wrote:
>>>>>>> On 09/05/17 19:08, Eric Anholt wrote:
>>>>>>>> Marc Zyngier <marc.zyngier@xxxxxxx> writes:
>>>>>>>>
>>>>>>>>> On 09/05/17 17:59, Eric Anholt wrote:
>>>>>>>>>> Phil Elwell <phil@xxxxxxxxxxxxxxx> writes:
>>>>>>>>>>
>>>>>>>>>>> In order to reduce power consumption and bus traffic, it is sensible
>>>>>>>>>>> for secondary cores to enter a low-power idle state when waiting to
>>>>>>>>>>> be started. The wfe instruction causes a core to wait until an event
>>>>>>>>>>> or interrupt arrives before continuing to the next instruction.
>>>>>>>>>>> The sev instruction sends a wakeup event to the other cores, so call
>>>>>>>>>>> it from bcm2836_smp_boot_secondary, the function that wakes up the
>>>>>>>>>>> waiting cores during booting.
>>>>>>>>>>>
>>>>>>>>>>> It is harmless to use this patch without the corresponding change
>>>>>>>>>>> adding wfe to the ARMv7/ARMv8-32 stubs, but if the stubs are updated
>>>>>>>>>>> and this patch is not applied then the other cores will sleep forever.
>>>>>>>>>>>
>>>>>>>>>>> See: https://github.com/raspberrypi/linux/issues/1989
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Phil Elwell <phil@xxxxxxxxxxxxxxx>
>>>>>>>>>>> ---
>>>>>>>>>>> drivers/irqchip/irq-bcm2836.c | 3 +++
>>>>>>>>>>> 1 file changed, 3 insertions(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/irqchip/irq-bcm2836.c b/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>> index e10597c..6dccdf9 100644
>>>>>>>>>>> --- a/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>> +++ b/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>> @@ -248,6 +248,9 @@ static int __init bcm2836_smp_boot_secondary(unsigned int cpu,
>>>>>>>>>>> writel(secondary_startup_phys,
>>>>>>>>>>> intc.base + LOCAL_MAILBOX3_SET0 + 16 * cpu);
>>>>>>>>>>>
>>>>>>>>>>> + dsb(sy); /* Ensure write has completed before waking the other CPUs */
>>>>>>>>>>> + sev();
>>>>>>>>>>> +
>>>>>>>>>>> return 0;
>>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> This is also the behavior that the standard arm64 spin-table
>>>>>>>>>> method has,
>>>>>>>>>> which we unfortunately can't quite use.
>>>>>>>>>
>>>>>>>>> And why is that so? Why do you have to reinvent the wheel (and hide the
>>>>>>>>> cloned wheel in an interrupt controller driver)?
>>>>>>>>>
>>>>>>>>> That doesn't seem right to me.
>>>>>>>>
>>>>>>>> The armv8 stubs (firmware-supplied code in the low page that do the
>>>>>>>> spinning) do actually implement arm64's spin-table method. It's the
>>>>>>>> armv7 stubs that use these registers in the irqchip instead of plain
>>>>>>>> addresses in system memory.
>>>>>>>
>>>>>>> Let's put ARMv7 aside for the time being. If your firmware already
>>>>>>> implements spin-tables, why don't you simply use that at least on arm64?
>>>>>>
>>>>>> We do.
>>>>>
>>>>> Obviously not the way it is intended if you have to duplicate the core
>>>>> architectural code in the interrupt controller driver, which couldn't
>>>>> care less.
>>>>
>>>> If we were using this method on arm64 then the other cores would not start up
>>>> because armstub8.S has always included a wfe. Nothing in the commit mentions
>>>> arm64 - this is an ARCH=arm fix.
>>>
>>> Thanks for the clarification, which you could have added to the commit
>>> message.
>>>
>>> The question still remains: why do we have CPU bring-up code in an
>>> interrupt controller, instead of having it in the architecture code?
>>>
>>> The RPi-2 is the *only* platform to have its SMP bringup code outside of
>>> arch/arm, so the first course of action would be to move that code where
>>> it belongs.
>>
>> You were CC'd on the commit (41f4988cc287e5f836d3f6620c9f900bc9b560e9) that
>> introduced bcm2836_smp_boot_secondary - it seems strange to start objecting
>> now.
>
> Well, I'm far from being perfect. If I had noticed it, I'd have NACKed
> it.
>
>> Yes, I think it is odd that it didn't go into arch/arm/mach-bcm, but in
>> the interests of making changes in small, independent steps, do you have a
>> problem with this commit?
>
> On its own, no. I'm just not keen on adding more unrelated stuff to this
> file, so let's start with dealing with the original bug, and you can
> then add this fix on top.

That's an interesting use of the word "bug". From Wikipedia:

"A software bug is an error, flaw, failure or fault in a computer program or
system that causes it to produce an incorrect or unexpected result, or to
behave in unintended ways."

Although your concerns are valid, the faults you are objecting to are not causing
a malfunction of any kind. If we were to update the RPi firmware before this
patch was merged then upstream users would be left with one wheel on their wagon.

Phil