Re: [PATCH] Revert "firmware: qcom: qseecom: convert to using the TZ allocator"

From: Bartosz Golaszewski
Date: Mon Jul 29 2024 - 08:36:10 EST


On Mon, Jul 29, 2024 at 12:28 PM Johan Hovold <johan@xxxxxxxxxx> wrote:
>
> On Mon, Jul 29, 2024 at 12:03:55PM +0200, Bartosz Golaszewski wrote:
> > On Mon, Jul 29, 2024 at 11:58 AM Johan Hovold <johan+linaro@xxxxxxxxxx> wrote:
> > >
> > > This reverts commit 6612103ec35af6058bb85ab24dae28e119b3c055.
> > >
> > > Using the "TZ allocator" for qseecom breaks efivars on machines like
> > > the Lenovo ThinkPad X13s and x1e80100 CRD:
> > >
> > > qcom_scm firmware:scm: qseecom: scm call failed with error -22
> > >
> > > Reverting to the 6.10 state makes qseecom work again.
> > >
> > > Fixes: 6612103ec35a ("firmware: qcom: qseecom: convert to using the TZ allocator")
> > > Cc: Bartosz Golaszewski <bartosz.golaszewski@xxxxxxxxxx>
> > > Signed-off-by: Johan Hovold <johan+linaro@xxxxxxxxxx>
> > > ---
> > > Cc: regressions@xxxxxxxxxxxxxxx
> > >
> > > #regzbot introduced: 6612103ec35a
> >
> > How about at least giving me the chance to react to the report and fix
> > it instead of reverting it right away?
>
> Lots of folks have been running linux-next on Qualcomm machines for a
> month without reporting or fixing the issue. And v10 of the offending
> patch was apparently never even tested before being merged.
>
> I'm sure you'll have a few days to look at this before we revert.
>
> I'll be on holiday for a few weeks, but you have an X13s so you should
> be able to reproduce this yourself.
>
> > Are there any other messages about SHM bridge/SCM calls in the kernel log?
>
> I've also seen this combo:
>
> [ 3.219296] qcom_scm firmware:scm: qseecom: scm call failed with error -22
> [ 3.227153] efivars: get_next_variable: status=8000000000000007
>
> But usually the first message is the only hint why efivars is completely
> broken.
>
> > Do you have QCOM_TZMEM_MODE_GENERIC=y or QCOM_TZMEM_MODE_SHM_BRIDGE=y
> > in your config? If the latter: can you try changing it to the former
> > and retest?
>
> I have the former in my config but have tested both, made no difference.
>
> > > It's a little frustrating to find that no-one tested this properly or
> > > even noticed the regression for the past month that this has been
> > > sitting in linux-next.
> >
> > I have tested many platforms and others have done the same, but
> > unfortunately we cannot possibly test every single use-case on every
> > platform. This is what next is for after all.
>
> I doubt this is specific to sc8280xp and x1e80100. Which platforms did
> you test qseecom and efivars on?
>
> > > Looks like Maximilian may have hit this with v9 too:
> > >
> > > https://lore.kernel.org/lkml/CAMRc=Mf_pvrh2VMfTVE-ZTypyO010p=to-cd8Q745DzSDXLGFw@xxxxxxxxxxxxxx/
> > >
> > > even if there were further issues with that revision.
> >
> > This is a different issue that was fixed in a later iteration.
>
> The symptoms appear to be the same once you get past the locking splats:
>
> [ 2.507347] qcom_scm firmware:scm: qseecom: scm call failed with error -22
> [ 2.507813] efivars: get_next_variable: status=8000000000000007
>
> So it's possible that this never worked.
>
> Johan

How do you reproduce this on x1e?

Bart
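
For anyone reading the two log lines quoted above, here is a minimal sketch of how the numbers can be decoded. It assumes the usual Linux errno numbering and the 64-bit UEFI status encoding, where the high bit marks an error. The helper names and the abbreviated lookup table are illustrative only and are not taken from the kernel sources.

```python
import errno

# On 64-bit UEFI, the high bit of a status value marks it as an error.
EFI_ERROR_BIT = 1 << 63

# Abbreviated subset of UEFI error codes (low bits of the status value).
EFI_ERROR_NAMES = {
    1: "EFI_LOAD_ERROR",
    2: "EFI_INVALID_PARAMETER",
    3: "EFI_UNSUPPORTED",
    7: "EFI_DEVICE_ERROR",
    14: "EFI_NOT_FOUND",
}


def decode_errno(value):
    """Map a negative kernel-style error (e.g. -22) to its errno name."""
    return errno.errorcode.get(abs(value), "unknown")


def decode_efi_status(status_hex):
    """Map the hex status printed by efivars to a UEFI status name."""
    status = int(status_hex, 16)
    if status == 0:
        return "EFI_SUCCESS"
    code = status & ~EFI_ERROR_BIT
    if status & EFI_ERROR_BIT:
        return EFI_ERROR_NAMES.get(code, f"unknown EFI error {code}")
    return f"EFI warning {code}"


print(decode_errno(-22))
print(decode_efi_status("8000000000000007"))
```

Under those assumptions, the SCM call is failing with EINVAL and efivars is reporting EFI_DEVICE_ERROR, which fits efivars breaking as a consequence of the failed qseecom call.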