Re: [PATCH] irq_bcm2836: Send event when onlining sleeping cores
From: Phil Elwell
Date: Wed May 10 2017 - 06:31:52 EST
On 10/05/2017 11:09, Marc Zyngier wrote:
> On 10/05/17 10:05, Phil Elwell wrote:
>> On 10/05/2017 09:55, Marc Zyngier wrote:
>>> On Wed, May 10 2017 at 9:27:10 am BST, Phil Elwell <phil@xxxxxxxxxxxxxxx> wrote:
>>>> On 10/05/2017 08:42, Marc Zyngier wrote:
>>>>> On 09/05/17 20:02, Phil Elwell wrote:
>>>>>> On 09/05/2017 19:53, Marc Zyngier wrote:
>>>>>>> On 09/05/17 19:52, Phil Elwell wrote:
>>>>>>>> On 09/05/2017 19:14, Marc Zyngier wrote:
>>>>>>>>> On 09/05/17 19:08, Eric Anholt wrote:
>>>>>>>>>> Marc Zyngier <marc.zyngier@xxxxxxx> writes:
>>>>>>>>>>
>>>>>>>>>>> On 09/05/17 17:59, Eric Anholt wrote:
>>>>>>>>>>>> Phil Elwell <phil@xxxxxxxxxxxxxxx> writes:
>>>>>>>>>>>>
>>>>>>>>>>>>> In order to reduce power consumption and bus traffic, it is sensible
>>>>>>>>>>>>> for secondary cores to enter a low-power idle state when waiting to
>>>>>>>>>>>>> be started. The wfe instruction causes a core to wait until an event
>>>>>>>>>>>>> or interrupt arrives before continuing to the next instruction.
>>>>>>>>>>>>> The sev instruction sends a wakeup event to the other cores, so call
>>>>>>>>>>>>> it from bcm2836_smp_boot_secondary, the function that wakes up the
>>>>>>>>>>>>> waiting cores during booting.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is harmless to use this patch without the corresponding change
>>>>>>>>>>>>> adding wfe to the ARMv7/ARMv8-32 stubs, but if the stubs are updated
>>>>>>>>>>>>> and this patch is not applied then the other cores will sleep forever.
>>>>>>>>>>>>>
>>>>>>>>>>>>> See: https://github.com/raspberrypi/linux/issues/1989
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Phil Elwell <phil@xxxxxxxxxxxxxxx>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> drivers/irqchip/irq-bcm2836.c | 3 +++
>>>>>>>>>>>>> 1 file changed, 3 insertions(+)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/irqchip/irq-bcm2836.c b/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>>>> index e10597c..6dccdf9 100644
>>>>>>>>>>>>> --- a/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>>>> +++ b/drivers/irqchip/irq-bcm2836.c
>>>>>>>>>>>>> @@ -248,6 +248,9 @@ static int __init bcm2836_smp_boot_secondary(unsigned int cpu,
>>>>>>>>>>>>> writel(secondary_startup_phys,
>>>>>>>>>>>>> intc.base + LOCAL_MAILBOX3_SET0 + 16 * cpu);
>>>>>>>>>>>>>
>>>>>>>>>>>>> + dsb(sy); /* Ensure write has completed before waking the other CPUs */
>>>>>>>>>>>>> + sev();
>>>>>>>>>>>>> +
>>>>>>>>>>>>> return 0;
>>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> This is also the behavior that the standard arm64 spin-table
>>>>>>>>>>>> method has,
>>>>>>>>>>>> which we unfortunately can't quite use.
>>>>>>>>>>>
>>>>>>>>>>> And why is that so? Why do you have to reinvent the wheel (and hide the
>>>>>>>>>>> cloned wheel in an interrupt controller driver)?
>>>>>>>>>>>
>>>>>>>>>>> That doesn't seem right to me.
>>>>>>>>>>
>>>>>>>>>> The armv8 stubs (firmware-supplied code in the low page that do the
>>>>>>>>>> spinning) do actually implement arm64's spin-table method. It's the
>>>>>>>>>> armv7 stubs that use these registers in the irqchip instead of plain
>>>>>>>>>> addresses in system memory.
>>>>>>>>>
>>>>>>>>> Let's put ARMv7 aside for the time being. If your firmware already
>>>>>>>>> implements spin-tables, why don't you simply use that at least on arm64?
>>>>>>>>
>>>>>>>> We do.
>>>>>>>
>>>>>>> Obviously not the way it is intended if you have to duplicate the core
>>>>>>> architectural code in the interrupt controller driver, which couldn't
>>>>>>> care less.
>>>>>>
>>>>>> If we were using this method on arm64 then the other cores would not start up
>>>>>> because armstub8.S has always included a wfe. Nothing in the commit mentions
>>>>>> arm64 - this is an ARCH=arm fix.
>>>>>
>>>>> Thanks for the clarification, which you could have added to the commit
>>>>> message.
>>>>>
>>>>> The question still remains: why do we have CPU bring-up code in an
>>>>> interrupt controller, instead of having it in the architecture code?
>>>>>
>>>>> The RPi-2 is the *only* platform to have its SMP bringup code outside of
>>>>> arch/arm, so the first course of action would be to move that code where
>>>>> it belongs.
>>>>
>>>> You were CC'd on the commit (41f4988cc287e5f836d3f6620c9f900bc9b560e9) that
>>>> introduced bcm2836_smp_boot_secondary - it seems strange to start objecting
>>>> now.
>>>
>>> Well, I'm far from being perfect. If I had noticed it, I'd have NACKed
>>> it.
>>>
>>>> Yes, I think it is odd that it didn't go into arch/arm/mach-bcm, but in
>>>> the interests of making changes in small, independent steps, do you have a
>>>> problem with this commit?
>>>
>>> On its own, no. I'm just not keen on adding more unrelated stuff to this
>>> file, so let's start with dealing with the original bug, and you can
>>> then add this fix on top.
>>
>> That's an interesting use of the word "bug". From Wikipedia:
>>
>> "A software bug is an error, flaw, failure or fault in a computer program or
>> system that causes it to produce an incorrect or unexpected result, or to
>> behave in unintended ways."
>
> Whatever. Should I call it "pile of crap dumped in unsuitable locations"
> instead? What does Wikipedia says about it?
>
>> Although your concerns are valid, the faults you are objecting to are not causing
>> a malfunction of any kind. If we were to update the RPi firmware before this
>> patch was merged then upstream users would be left with one wheel on their wagon.
>
> And that'd be your problem, not mine. Look, you can argue around this
> all day, or you can fix this mess. Your choice.
Is that the opinion of all here?
Phil