Re: [PATCH 1/2] mux: add mux_chip_resume() function

From: Thomas Richard
Date: Fri Sep 06 2024 - 12:17:39 EST


On 9/5/24 10:28, Peter Rosin wrote:
> Hi!
>
> 2024-09-04 at 18:07, Thomas Richard wrote:
>> On 9/4/24 14:52, Peter Rosin wrote:
>>> Hi!
>>>
>>> 2024-09-04 at 13:29, Thomas Richard wrote:
>>>> On 9/4/24 11:32, Peter Rosin wrote:
>>>>> Hi!
>>>>>
>>>>> 2024-09-04 at 11:18, Thomas Richard wrote:
>>>>>> On 9/3/24 15:22, Peter Rosin wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> Sorry for being unresponsive. And for first writing this in the older v4
>>>>>>> thread instead of here.
>>>>>>>
>>>>>>> 2024-06-13 at 15:07, Thomas Richard wrote:
>>>>>>>> The mux_chip_resume() function restores a mux_chip using the cached state
>>>>>>>> of each mux.
>>>>>>>>
>>>>>>>> Signed-off-by: Thomas Richard <thomas.richard@xxxxxxxxxxx>
>>>>>>>> ---
>>>>>>>> drivers/mux/core.c | 29 +++++++++++++++++++++++++++++
>>>>>>>> include/linux/mux/driver.h | 1 +
>>>>>>>> 2 files changed, 30 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/mux/core.c b/drivers/mux/core.c
>>>>>>>> index 78c0022697ec..0858cacae845 100644
>>>>>>>> --- a/drivers/mux/core.c
>>>>>>>> +++ b/drivers/mux/core.c
>>>>>>>> @@ -215,6 +215,35 @@ void mux_chip_free(struct mux_chip *mux_chip)
>>>>>>>> }
>>>>>>>> EXPORT_SYMBOL_GPL(mux_chip_free);
>>>>>>>>
>>>>>>>> +/**
>>>>>>>> + * mux_chip_resume() - restores the mux-chip state
>>>>>>>> + * @mux_chip: The mux-chip to resume.
>>>>>>>> + *
>>>>>>>> + * Restores the mux-chip state.
>>>>>>>> + *
>>>>>>>> + * Return: Zero on success or a negative errno on error.
>>>>>>>> + */
>>>>>>>> +int mux_chip_resume(struct mux_chip *mux_chip)
>>>>>>>> +{
>>>>>>>> + int ret, i;
>>>>>>>> +
>>>>>>>> + for (i = 0; i < mux_chip->controllers; ++i) {
>>>>>>>> + struct mux_control *mux = &mux_chip->mux[i];
>>>>>>>> +
>>>>>>>> + if (mux->cached_state == MUX_CACHE_UNKNOWN)
>>>>>>>> + continue;
>>>>>>>> +
>>>>>>>> + ret = mux_control_set(mux, mux->cached_state);
>>>>>>>
>>>>>>> mux_control_set() is an internal helper. It is called from
>>>>>>> __mux_control_select() and mux_control_deselect() (and on init...)
>>>>>>>
>>>>>>> In all those cases, there is no race to reach the mux_control_set()
>>>>>>> function, by means of the mux->lock semaphore (or the mux not being
>>>>>>> "published" yet).
>>>>>>>
>>>>>>> I fail to see how resume is safe when mux->lock is ignored?
>>>>>>
>>>>>> I think I should use mux_control_select() to use the lock.
>>>>>> If I ignore the lock, I could have a cache coherence issue.
>>>>>>
>>>>>> I'll send a new version which use mux_control_select().
>>>>>> But if I use mux_control_select(), I have to clean the cache before to
>>>>>> call it, if not nothing happen [1].
>>>>>>
>>>>>> [1]
>>>>>> https://elixir.bootlin.com/linux/v6.11-rc6/source/drivers/mux/core.c#L319
>>>>>
>>>>> No, calling mux_control_select() in resume context is not an
>>>>> option IIUC. That would dead-lock if there is a long-time client
>>>>> who has locked the mux in some desired state.
>>>>
>>>> Yes, I didn't thought about it.
>>>>
>>>>>
>>>>> I see no trivial solution to integrate suspend/resume, and do
>>>>> not have enough time to think about what a working solutions
>>>>> would look like. Sorry.
>>>>>
>>>>
>>>> We maybe have a solution.
>>>> Please let me know if it's relevant or not for you:
>>>>
>>>> - Add a get operation in struct mux_control_ops.
>>>>
>>>> - Implement mux_chip_suspend() which reads the state of each mux using
>>>> the get operation, and store it in a hardware_state variable (stored in
>>>> the mux_control struct).
>>>>
>>>> - The mux_chip_resume uses the hardware_state value to restore all muxes
>>>> using mux_control_set().
>>>> So if a mux is locked with a long delay, there is no dead-lock.
>>>>
>>>> - If the get operation is not defined, mux_chip_suspend() and
>>>> mux_chip_resume() do nothing (maybe a warning or info message could be
>>>> useful).
>>>
>>> What if a mux control is used to mux e.g. an SPI bus, and on that bus
>>> sits some device that needs to be accessed during suspend/resume. What
>>> part of the system ensures that the mux is supended late and resumed
>>> early? Other things probably also want to be suspended late and resumed
>>> early. But everything can be latest/earliest, there has to be some kind
>>> of order, and I'm not sure that ordering is guaranteed to "fit".
>>
>> I experimented that it's ordered correctly for each stage, using
>> dependencies defined in the DT (I guess).
>> Of course if we suspend at suspend stage and resume at resume stage
>> whereas an SPI access is done at suspend_late (for example), it doesn't
>> work.
>> We could suspend/resume at noirq stages. It's the last suspend stage,
>> and the first resume stage, so we are sure to save and restore the right
>> states.
>
> And what if the mux in turn sits on I2C? Is the ordering still guaranteed
> to be correct? I.e. I'm not really intrested in just one case, but in the
> general problem. I am resisting the attempt to add generic support for
> something that, AFAICT, has no clear general solution.
>
> Maybe you should simply implement resume locally in the driver itself and
> have it reprogram the register, perhaps still based on mux->cached_state,
> but "behind the back" of the mux core?

Ok, it's seems to be the best solution for now.
I'll send a patch.

Just a small comment, I think I should not use the cached_state.
I should implement a mux_mmio_get(), which is called during suspend, to
get the "real" state. Then use it during resume.
Because the cache is not coherent during is a very small period [1].

What do you think ?

[1]
https://elixir.bootlin.com/linux/v6.11-rc6/source/drivers/mux/core.c#L144

Thomas