Re: [REGRESSION 5.19] NULL dereference by ucsi_acpi driver

From: Thorsten Leemhuis
Date: Wed Aug 24 2022 - 05:51:44 EST


[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

Hi, this is your Linux kernel regression tracker.

On 19.08.22 18:32, Takashi Iwai wrote:
> Hi,
>
> we've got multiple reports about the 5.19 kernel starting to crash after
> some time, and this turned out to be triggered by the ucsi_acpi driver.
> The details are found in:
> https://bugzilla.suse.com/show_bug.cgi?id=1202386
>
> The culprit seems to be the commit 87d0e2f41b8c
> usb: typec: ucsi: add a common function ucsi_unregister_connectors()
>
> This commit looks as if it were a harmless cleanup, but this failed in
> a subtle way. Namely, in the error scenario, the driver gets an error
> at ucsi_register_altmodes(), and goes to the error handling to release
> the resources. Through this refactoring, the release part was unified
> to a function ucsi_unregister_connectors(). And there, it has a NULL
> check of con->wq, and it bails out of the loop if it's NULL.
> Meanwhile, ucsi_register_port() itself still calls destroy_workqueue()
> and clears con->wq at its error path. This ended up in the leftover
> power supply device with the uninitialized / cleared device.
>
> It was confirmed that the problem could be avoided by a simple
> revert.
>
> I guess another fix could be removing the part clearing con->wq, i.e.
>
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -1192,11 +1192,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
> out_unlock:
> mutex_unlock(&con->lock);
>
> - if (ret && con->wq) {
> - destroy_workqueue(con->wq);
> - con->wq = NULL;
> - }
> -
> return ret;
> }
>
>
> ... but it's totally untested and I'm not entirely sure whether it's
> better.
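
For readers who want to see the failure mode in isolation, here is a
small userspace sketch of the interaction described above. It is not
kernel code: struct connector, register_port(), unregister_connectors()
and simulate_leftover() are hypothetical stand-ins, malloc()/free()
stand in for the workqueue, and the has_psy flag stands in for the
registered power supply device. It only models the description in the
report, so treat it as an illustration under those assumptions.

```c
#include <stdlib.h>

#define NUM_CONNECTORS 3

/* Hypothetical stand-in for the per-connector state; not the kernel struct. */
struct connector {
	void *wq;     /* models con->wq */
	int has_psy;  /* models a registered power supply device */
};

/*
 * Models ucsi_register_port() as described in the report: the power
 * supply is registered first, then a later step may fail. If
 * clear_wq_on_error is set, the error path frees and clears wq, which
 * is what the hunk removed by the proposed patch does.
 */
static int register_port(struct connector *con, int fail, int clear_wq_on_error)
{
	con->wq = malloc(1);
	if (!con->wq)
		return -1;
	con->has_psy = 1;

	if (fail) {
		if (clear_wq_on_error) {
			free(con->wq);
			con->wq = NULL;
		}
		return -1;
	}
	return 0;
}

/* Models the unified cleanup: the loop stops at the first NULL wq. */
static void unregister_connectors(struct connector *cons, int n)
{
	for (int i = 0; i < n; i++) {
		if (!cons[i].wq)
			break;
		cons[i].has_psy = 0;
		free(cons[i].wq);
		cons[i].wq = NULL;
	}
}

/*
 * Registers connectors in order, letting the one at failing_index fail
 * (pass -1 for no failure), then runs the cleanup. Returns the number
 * of power supplies the cleanup left behind.
 */
static int simulate_leftover(int failing_index, int clear_wq_on_error)
{
	struct connector cons[NUM_CONNECTORS] = { { NULL, 0 } };
	int leftover = 0;

	for (int i = 0; i < NUM_CONNECTORS; i++) {
		if (register_port(&cons[i], i == failing_index, clear_wq_on_error))
			break;
	}

	unregister_connectors(cons, NUM_CONNECTORS);

	for (int i = 0; i < NUM_CONNECTORS; i++)
		leftover += cons[i].has_psy;
	return leftover;
}
```

In this model, simulate_leftover(1, 1) leaves one power supply behind
because the cleanup loop stops at the connector whose wq was cleared,
while simulate_leftover(1, 0) leaves none. That matches the reasoning
behind the patch quoted above, but it says nothing about whether that
patch is the right fix for the real driver.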

Thanks for the report. To make sure this issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot introduced 87d0e2f41b8c ^
https://bugzilla.suse.com/show_bug.cgi?id=1202386
#regzbot title NULL dereference by ucsi_acpi driver
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it is already being
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply, ideally while also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained
in the Linux kernel's documentation; the above webpage explains why this
is important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply; it's in everyone's interest to set the public record straight.