Re: [PATCH] soc: qcom: pd-mapper: defer probing on sdm845
From: Steev Klimaszewski
Date: Thu Feb 06 2025 - 13:33:33 EST
Hi Frank,
On Thu, Feb 6, 2025 at 12:45 AM Frank Oltmanns <frank@xxxxxxxxxxxx> wrote:
>
> Hi again,
>
> On 2025-02-06 at 06:57:46 +0100, Frank Oltmanns <frank@xxxxxxxxxxxx> wrote:
> > On 2025-02-05 at 20:54:53 -0600, Bjorn Andersson <andersson@xxxxxxxxxx> wrote:
> >> On Wed, Feb 05, 2025 at 10:57:11PM +0100, Frank Oltmanns wrote:
> >>> On xiaomi-beryllium and oneplus-enchilada audio does not work reliably
> >>> with the in-kernel pd-mapper. Deferring the probe solves these issues.
> >>> Specifically, audio only works reliably with the in-kernel pd-mapper, if
> >>> the probe succeeds when remoteproc3 triggers the first successful probe.
> >>> I.e., probes from remoteproc0, 1, and 2 need to be deferred until
> >>> remoteproc3 has been probed.
> >>>
> >>> Introduce a device specific quirk that lists the first auxdev for which
> >>> the probe must be executed. Until then, defer probes from other auxdevs.
> >>>
> >>> Fixes: 1ebcde047c54 ("soc: qcom: add pd-mapper implementation")
> >>> Cc: stable@xxxxxxxxxxxxxxx
> >>> Signed-off-by: Frank Oltmanns <frank@xxxxxxxxxxxx>
> >>> ---
> >>> The in-kernel pd-mapper has been causing audio issues on sdm845
> >>> devices (specifically, xiaomi-beryllium and oneplus-enchilada). I
> >>> observed that Stephan’s approach [1] - which defers module probing by
> >>> blocklisting the module and triggering a later probe - works reliably.
> >>>
> >>> Inspired by this, I experimented with delaying the probe within the
> >>> module itself by returning -EPROBE_DEFER in qcom_pdm_probe() until a
> >>> certain time (13.9 seconds after boot, based on ktime_get()) had
> >>> elapsed. This method also restored audio functionality.
> >>>
> >>> Further logging of auxdev->id in qcom_pdm_probe() led to an interesting
> >>> discovery: audio only works reliably with the in-kernel pd-mapper when
> >>> the first successful probe is triggered by remoteproc3. In other words,
> >>> probes from remoteproc0, 1, and 2 must be deferred until remoteproc3 has
> >>> been probed.
> >>>
> >>
> >> The remoteproc numbering is assigned at the time of registering each
> >> remoteproc driver, and does not necessarily relate to the order in which
> >> they are launched. That said, it sounds like what you're saying is that
> >> is that audio only works if we launch the pd-mapper after the
> >> remoteprocs has started?
> >
> > Almost, but not quite. You are right, that remoteproc3 in my setup is
> > always the last one that probes the pd-mapper.
> >
> > However, when experimenting with different timings I saw that the
> > pd-mapper really do has to respond to the probe from remoteproc3 (I'm
> > not sure I'm using the right terminology here, but I hope my intent
> > comes across). If the pd-mapper responds to remoteproc3's probe with
> > -EPROBE_DEFER there will again be subsequent probes from the other
> > remoteprocs. If we act on those probes, there is a chance that audio
> > (mic in my case) does not work. So, my conclusion was that remoteproc3's
> > probe has to be answered first before responding to the other probes.
> >
> > Further note that in my experiments remoteproc1 was always the first to
> > do the probing, followed by either remoteproc0 or remoteproc2.
> > remoteproc3 was always the last.
> >
> >> Can you please confirm which remoteproc is which in your numbering? (In
> >> particular, which remoteproc instance is the audio DSP?)
> >
> > remoteproc0: adsp
> > remoteproc1: cdsp
> > remoteproc2: slpi
> > remoteproc3: 4080000.remoteproc
>
> I'm sorry but there's one additional thing that I really should have
> mentioned earlier: My issue is specifically with *call* audio.
>
> Call audio is only available using out-of-tree patches. The ones I'm
> currently using are here:
> https://gitlab.com/sdm845-mainline/linux/-/commits/sdm845-6.13-rc2-r2?ref_type=tags
>
> Best regards,
> Frank
>
> >
> > (I took them from the kernel messages "remoteproc remoteproc<X>: <xyz>
> > is available".)
> >
> >>> To address this, I propose introducing a quirk table (which currently
> >>> only contains sdm845) to defer probing until the correct auxiliary
> >>> device (remoteproc3) initiates the probe.
> >>>
> >>> I look forward to your feedback.
> >>>
> >>
> >> I don't think the proposed workaround is our path forward, but I very
> >> much appreciate your initiative and the insights it provides!
> >
> > Thank you! I was hoping that somebody with more experience in the QCOM
> > universe can draw further conclusions from this.
> >
> >> Seems to
> >> me that we have a race-condition in the pdr helper.
> >
> > If you need further experimenting or can give me rough guidance on where
> > to look next, I'll be glad to help.
> >
> > Thanks again,
> > Frank
> >
> >>
> >> Regards,
> >> Bjorn
> >>
> >>> Thanks,
> >>> Frank
> >>>
> >>> [1]: https://lore.kernel.org/linux-arm-msm/Zwj3jDhc9fRoCCn6@xxxxxxxxxx/
> >>> ---
> >>> drivers/soc/qcom/qcom_pd_mapper.c | 43 +++++++++++++++++++++++++++++++++++++++
> >>> 1 file changed, 43 insertions(+)
> >>>
> >>> diff --git a/drivers/soc/qcom/qcom_pd_mapper.c b/drivers/soc/qcom/qcom_pd_mapper.c
> >>> index 154ca5beb47160cc404a46a27840818fe3187420..34b26df665a888ac4872f56e948e73b561ae3b6b 100644
> >>> --- a/drivers/soc/qcom/qcom_pd_mapper.c
> >>> +++ b/drivers/soc/qcom/qcom_pd_mapper.c
> >>> @@ -46,6 +46,11 @@ struct qcom_pdm_data {
> >>> struct list_head services;
> >>> };
> >>>
> >>> +struct qcom_pdm_probe_first_dev_quirk {
> >>> + const char *name;
> >>> + u32 id;
> >>> +};
> >>> +
> >>> static DEFINE_MUTEX(qcom_pdm_mutex); /* protects __qcom_pdm_data */
> >>> static struct qcom_pdm_data *__qcom_pdm_data;
> >>>
> >>> @@ -526,6 +531,11 @@ static const struct qcom_pdm_domain_data *x1e80100_domains[] = {
> >>> NULL,
> >>> };
> >>>
> >>> +static const struct qcom_pdm_probe_first_dev_quirk first_dev_remoteproc3 = {
> >>> + .id = 3,
> >>> + .name = "pd-mapper"
> >>> +};
> >>> +
> >>> static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
> >>> { .compatible = "qcom,apq8016", .data = NULL, },
> >>> { .compatible = "qcom,apq8064", .data = NULL, },
> >>> @@ -566,6 +576,10 @@ static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
> >>> {},
> >>> };
> >>>
> >>> +static const struct of_device_id qcom_pdm_defer[] __maybe_unused = {
> >>> + { .compatible = "qcom,sdm845", .data = &first_dev_remoteproc3, },
> >>> + {},
> >>> +};
> >>> static void qcom_pdm_stop(struct qcom_pdm_data *data)
> >>> {
> >>> qcom_pdm_free_domains(data);
> >>> @@ -637,6 +651,25 @@ static struct qcom_pdm_data *qcom_pdm_start(void)
> >>> return ERR_PTR(ret);
> >>> }
> >>>
> >>> +static bool qcom_pdm_ready(struct auxiliary_device *auxdev)
> >>> +{
> >>> + const struct of_device_id *match;
> >>> + struct device_node *root;
> >>> + struct qcom_pdm_probe_first_dev_quirk *first_dev;
> >>> +
> >>> + root = of_find_node_by_path("/");
> >>> + if (!root)
> >>> + return true;
> >>> +
> >>> + match = of_match_node(qcom_pdm_defer, root);
> >>> + of_node_put(root);
> >>> + if (!match)
> >>> + return true;
> >>> +
> >>> + first_dev = (struct qcom_pdm_probe_first_dev_quirk *) match->data;
> >>> + return (auxdev->id == first_dev->id) && !strcmp(auxdev->name, first_dev->name);
> >>> +}
> >>> +
> >>> static int qcom_pdm_probe(struct auxiliary_device *auxdev,
> >>> const struct auxiliary_device_id *id)
> >>>
> >>> @@ -647,6 +680,15 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
> >>> mutex_lock(&qcom_pdm_mutex);
> >>>
> >>> if (!__qcom_pdm_data) {
> >>> + if (!qcom_pdm_ready(auxdev)) {
> >>> + pr_debug("%s: Deferring probe for device %s (id: %u)\n",
> >>> + __func__, auxdev->name, auxdev->id);
> >>> + ret = -EPROBE_DEFER;
> >>> + goto probe_stop;
> >>> + }
> >>> + pr_debug("%s: Probing for device %s (id: %u), starting pdm\n",
> >>> + __func__, auxdev->name, auxdev->id);
> >>> +
> >>> data = qcom_pdm_start();
> >>>
> >>> if (IS_ERR(data))
> >>> @@ -659,6 +701,7 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
> >>>
> >>> auxiliary_set_drvdata(auxdev, __qcom_pdm_data);
> >>>
> >>> +probe_stop:
> >>> mutex_unlock(&qcom_pdm_mutex);
> >>>
> >>> return ret;
> >>>
> >>> ---
> >>> base-commit: 7f048b202333b967782a98aa21bb3354dc379bbf
> >>> change-id: 20250205-qcom_pdm_defer-3dc1271d74d9
> >>>
> >>> Best regards,
> >>> --
> >>> Frank Oltmanns <frank@xxxxxxxxxxxx>
> >>>
>
I know that Bjorn already said this change is a no, but I wanted to
mention that I have a Lenovo Yoga C630 (WOS) device here that is also
an sdm845, and with this patch applied, it has the opposite effect and
breaks audio on it.
-- steev