Re: [PATCH 1/2] mux: add mux_chip_resume() function

From: Peter Rosin
Date: Thu Sep 05 2024 - 04:29:15 EST


Hi!

2024-09-04 at 18:07, Thomas Richard wrote:
> On 9/4/24 14:52, Peter Rosin wrote:
>> Hi!
>>
>> 2024-09-04 at 13:29, Thomas Richard wrote:
>>> On 9/4/24 11:32, Peter Rosin wrote:
>>>> Hi!
>>>>
>>>> 2024-09-04 at 11:18, Thomas Richard wrote:
>>>>> On 9/3/24 15:22, Peter Rosin wrote:
>>>>>> Hi!
>>>>>>
>>>>>> Sorry for being unresponsive. And for first writing this in the older v4
>>>>>> thread instead of here.
>>>>>>
>>>>>> 2024-06-13 at 15:07, Thomas Richard wrote:
>>>>>>> The mux_chip_resume() function restores a mux_chip using the cached state
>>>>>>> of each mux.
>>>>>>>
>>>>>>> Signed-off-by: Thomas Richard <thomas.richard@xxxxxxxxxxx>
>>>>>>> ---
>>>>>>> drivers/mux/core.c | 29 +++++++++++++++++++++++++++++
>>>>>>> include/linux/mux/driver.h | 1 +
>>>>>>> 2 files changed, 30 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/mux/core.c b/drivers/mux/core.c
>>>>>>> index 78c0022697ec..0858cacae845 100644
>>>>>>> --- a/drivers/mux/core.c
>>>>>>> +++ b/drivers/mux/core.c
>>>>>>> @@ -215,6 +215,35 @@ void mux_chip_free(struct mux_chip *mux_chip)
>>>>>>> }
>>>>>>> EXPORT_SYMBOL_GPL(mux_chip_free);
>>>>>>>
>>>>>>> +/**
>>>>>>> + * mux_chip_resume() - restores the mux-chip state
>>>>>>> + * @mux_chip: The mux-chip to resume.
>>>>>>> + *
>>>>>>> + * Restores the mux-chip state.
>>>>>>> + *
>>>>>>> + * Return: Zero on success or a negative errno on error.
>>>>>>> + */
>>>>>>> +int mux_chip_resume(struct mux_chip *mux_chip)
>>>>>>> +{
>>>>>>> + int ret, i;
>>>>>>> +
>>>>>>> + for (i = 0; i < mux_chip->controllers; ++i) {
>>>>>>> + struct mux_control *mux = &mux_chip->mux[i];
>>>>>>> +
>>>>>>> + if (mux->cached_state == MUX_CACHE_UNKNOWN)
>>>>>>> + continue;
>>>>>>> +
>>>>>>> + ret = mux_control_set(mux, mux->cached_state);
>>>>>>
>>>>>> mux_control_set() is an internal helper. It is called from
>>>>>> __mux_control_select() and mux_control_deselect() (and on init...)
>>>>>>
>>>>>> In all those cases, there is no race to reach the mux_control_set()
>>>>>> function, by means of the mux->lock semaphore (or the mux not being
>>>>>> "published" yet).
>>>>>>
>>>>>> I fail to see how resume is safe when mux->lock is ignored?
>>>>>
>>>>> I think I should use mux_control_select() to use the lock.
>>>>> If I ignore the lock, I could have a cache coherence issue.
>>>>>
>>>>> I'll send a new version which use mux_control_select().
>>>>> But if I use mux_control_select(), I have to clean the cache before to
>>>>> call it, if not nothing happen [1].
>>>>>
>>>>> [1]
>>>>> https://elixir.bootlin.com/linux/v6.11-rc6/source/drivers/mux/core.c#L319
>>>>
>>>> No, calling mux_control_select() in resume context is not an
>>>> option IIUC. That would dead-lock if there is a long-time client
>>>> who has locked the mux in some desired state.
>>>
>>> Yes, I didn't thought about it.
>>>
>>>>
>>>> I see no trivial solution to integrate suspend/resume, and do
>>>> not have enough time to think about what a working solutions
>>>> would look like. Sorry.
>>>>
>>>
>>> We maybe have a solution.
>>> Please let me know if it's relevant or not for you:
>>>
>>> - Add a get operation in struct mux_control_ops.
>>>
>>> - Implement mux_chip_suspend() which reads the state of each mux using
>>> the get operation, and store it in a hardware_state variable (stored in
>>> the mux_control struct).
>>>
>>> - The mux_chip_resume uses the hardware_state value to restore all muxes
>>> using mux_control_set().
>>> So if a mux is locked with a long delay, there is no dead-lock.
>>>
>>> - If the get operation is not defined, mux_chip_suspend() and
>>> mux_chip_resume() do nothing (maybe a warning or info message could be
>>> useful).
>>
>> What if a mux control is used to mux e.g. an SPI bus, and on that bus
>> sits some device that needs to be accessed during suspend/resume. What
>> part of the system ensures that the mux is supended late and resumed
>> early? Other things probably also want to be suspended late and resumed
>> early. But everything can be latest/earliest, there has to be some kind
>> of order, and I'm not sure that ordering is guaranteed to "fit".
>
> I experimented that it's ordered correctly for each stage, using
> dependencies defined in the DT (I guess).
> Of course if we suspend at suspend stage and resume at resume stage
> whereas an SPI access is done at suspend_late (for example), it doesn't
> work.
> We could suspend/resume at noirq stages. It's the last suspend stage,
> and the first resume stage, so we are sure to save and restore the right
> states.

And what if the mux in turn sits on I2C? Is the ordering still guaranteed
to be correct? I.e. I'm not really intrested in just one case, but in the
general problem. I am resisting the attempt to add generic support for
something that, AFAICT, has no clear general solution.

Maybe you should simply implement resume locally in the driver itself and
have it reprogram the register, perhaps still based on mux->cached_state,
but "behind the back" of the mux core?

Cheers,
Peter