Re: 5.14-rc failure to resume
From: Linus Torvalds
Date: Sat Jul 24 2021 - 16:48:36 EST
On Sat, Jul 24, 2021 at 12:44 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> This does appear to be the culprit. With it reverted on top of current
> master (and with the block and io_uring changes pulled in too), the
> kernel survives many resumes without issue.

That commit seems fundamentally buggy.

It makes "acpi_dev_get_next_match_dev()" always do

    acpi_dev_put(adev);

to put the previous device, but "adev" is perfectly valid as NULL, and
acpi_dev_get_next_match_dev() even tests for it:

    struct device *start = adev ? &adev->dev : NULL;

so it can - and will - do

    acpi_dev_put(NULL);

which does

    put_device(&adev->dev);

and passes in an invalid pointer to put_device().
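
To make the failure mode concrete, here is a minimal userspace sketch. The struct layouts, the put_calls counter, and the names acpi_dev_put_checked() and run_demo() are all hypothetical stand-ins, not the kernel's definitions. The sketch assumes only that "dev" is not the first member of the containing struct, so "&adev->dev" computed from a NULL "adev" is a small non-NULL address that a NULL check inside put_device() would not catch. Guarding in the wrapper is one possible way around that, not necessarily the fix that was applied upstream.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct device {
	int refcount;
};

struct acpi_device {
	int handle;		/* placed first so that offsetof(dev) != 0 */
	struct device dev;
};

/* Counts how many times the mock put_device() actually dropped a reference. */
static int put_calls;

/* Mock: like the real helper, it only protects against a NULL 'dev'. */
static void put_device(struct device *dev)
{
	if (dev) {
		dev->refcount--;
		put_calls++;
	}
}

/*
 * Guarded wrapper (hypothetical name): check 'adev' itself before
 * forming &adev->dev, since that expression is not NULL for a NULL adev
 * when 'dev' sits at a non-zero offset.
 */
static void acpi_dev_put_checked(struct acpi_device *adev)
{
	if (adev)
		put_device(&adev->dev);
}

/* Small driver for the sketch; returns the number of real puts performed. */
static int run_demo(void)
{
	struct acpi_device a = { .handle = 1, .dev = { .refcount = 1 } };

	/* The member offset is what makes the unguarded version dangerous. */
	assert(offsetof(struct acpi_device, dev) != 0);

	acpi_dev_put_checked(NULL);	/* no-op instead of a bogus pointer */
	acpi_dev_put_checked(&a);	/* drops the single reference */

	assert(a.dev.refcount == 0);
	return put_calls;
}
```

Calling run_demo() from a main() exercises both paths: the NULL call does nothing, and the call with a real object drops exactly one reference.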
And yes, that adev very much can be NULL, with drivers/acpi/utils.c
even passing it in explicitly:

    struct acpi_device *
    acpi_dev_get_first_match_dev(const char *hid, const char *uid, s64 hrv)
    {
            return acpi_dev_get_next_match_dev(NULL, hid, uid, hrv);
    }
    EXPORT_SYMBOL(acpi_dev_get_first_match_dev);

Am I missing something? How does that code work at all for anybody?

I probably _am_ missing something.

             Linus