Re: [PATCH] soundwire: bus: conditionally recheck UNATTACHED status

From: Pierre-Louis Bossart
Date: Mon Sep 12 2022 - 08:01:34 EST


[top posting]
I reverted this patch in the SOF tree to use Richard Fitzgerald's
series, see

https://github.com/thesofproject/linux/pull/3836

I don't think we want this patch upstream, do we?


On 8/30/22 09:42, Bard Liao wrote:
> From: Pierre-Louis Bossart <pierre-louis.bossart@xxxxxxxxxxxxxxx>
>
> In configurations with two amplifiers on the same link, the Intel CI
> reports occasional enumeration/initialization timeouts during
> suspend-resume stress tests, where one of the two amplifiers becomes
> UNATTACHED immediately after being enumerated. This problem was
> reported both with Maxim and Realtek codecs, which pointed initially
> to a problem with status handling on the Intel side.
>
> The Cadence IP integrated on Intel platforms throws an interrupt when
> the status changes, and the information is kept with sticky bits until
> cleared. We initially added more checks to make sure the edge
> detection did not miss any transition, but that did not improve the
> results significantly.
>
> With the recent addition of the read_ping_status() callback, we were
> able to show that the status in sticky bits is shown as UNATTACHED
> even though the PING frames show the problematic device as
> ATTACHED. That completely breaks the entire logic where we assumed
> that a peripheral would always re-attach as device0. The resume
> timeouts make sense in that in those cases, the
> enumeration/initialization never happens a second time.
>
> One possible explanation is that this problem typically happens when a
> bus clash is reported, so it could very well be that the detection is
> fooled by a transient electrical issue or conflict between two
> peripherals.
>
> This patch conditionally double-checks the status reported in the
> sticky bits with the actual PING frame status. If the peripheral
> reports as attached in PING frames, the early detection based on
> sticky bits is discarded.
>
> Note that this patch only corrects issues of false positives on the
> manager side.
>
> If the peripheral lost and regain sync, then it would report as
> attached on Device0. A peripheral that would not reset its dev_num
> would not be compliant with the MIPI specification.
>
> BugLink: https://github.com/thesofproject/linux/issues/3638
> BugLink: https://github.com/thesofproject/linux/issues/3325
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@xxxxxxxxxxxxxxx>
> Reviewed-by: Rander Wang <rander.wang@xxxxxxxxx>
> Signed-off-by: Bard Liao <yung-chuan.liao@xxxxxxxxxxxxxxx>
> ---
> Hi Vinod,
>
> You will need the "ASoC/soundwire: log actual PING status on resume issues"
> series which is applied at ASoC tree before appling this patch.
>
> ---
> drivers/soundwire/bus.c | 19 +++++++++++++++++++
> drivers/soundwire/intel.c | 1 +
> include/linux/soundwire/sdw.h | 3 +++
> 3 files changed, 23 insertions(+)
>
> diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
> index 2772973eebb1..d0d486f07673 100644
> --- a/drivers/soundwire/bus.c
> +++ b/drivers/soundwire/bus.c
> @@ -1767,6 +1767,25 @@ int sdw_handle_slave_status(struct sdw_bus *bus,
> slave->status != SDW_SLAVE_UNATTACHED) {
> dev_warn(&slave->dev, "Slave %d state check1: UNATTACHED, status was %d\n",
> i, slave->status);
> +
> + if (bus->recheck_unattached && bus->ops->read_ping_status) {
> + u32 ping_status;
> +
> + mutex_lock(&bus->msg_lock);
> +
> + ping_status = bus->ops->read_ping_status(bus);
> +
> + mutex_unlock(&bus->msg_lock);
> +
> + ping_status >>= (i * 2);
> + ping_status &= 0x3;
> +
> + if (ping_status != 0) {
> + dev_warn(&slave->dev, "Slave %d state in PING frame is %d, ignoring earlier detection\n",
> + i, ping_status);
> + continue;
> + }
> + }
> sdw_modify_slave_status(slave, SDW_SLAVE_UNATTACHED);
> }
> }
> diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
> index 25ec9c272239..0c6e674dbf85 100644
> --- a/drivers/soundwire/intel.c
> +++ b/drivers/soundwire/intel.c
> @@ -1311,6 +1311,7 @@ static int intel_link_probe(struct auxiliary_device *auxdev,
>
> bus->link_id = auxdev->id;
> bus->dev_num_ida_min = INTEL_DEV_NUM_IDA_MIN;
> + bus->recheck_unattached = true;
>
> sdw_cdns_probe(cdns);
>
> diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
> index a2b31d25ea27..51ac71984260 100644
> --- a/include/linux/soundwire/sdw.h
> +++ b/include/linux/soundwire/sdw.h
> @@ -892,6 +892,8 @@ struct sdw_master_ops {
> * @dev_num_ida_min: if set, defines the minimum values for the IDA
> * used to allocate system-unique device numbers. This value needs to be
> * identical across all SoundWire bus in the system.
> + * @recheck_unattached: if set, double-check UNATTACHED status changes
> + * by reading PING frame status.
> */
> struct sdw_bus {
> struct device *dev;
> @@ -917,6 +919,7 @@ struct sdw_bus {
> bool multi_link;
> int hw_sync_min_links;
> int dev_num_ida_min;
> + bool recheck_unattached;
> };
>
> int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent,