Re: acpi_get_devices() crash when acpi_disabled==true (was [PATCH v2] drm/privacy-screen: honor acpi=off in detect_thinkpad_privacy_screen)

From: Hans de Goede
Date: Thu Jan 27 2022 - 08:39:53 EST


Hi,

On 1/27/22 14:33, Rafael J. Wysocki wrote:
> On Thu, Jan 27, 2022 at 2:05 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> On 1/26/22 18:11, Rafael J. Wysocki wrote:
>>> On Wed, Jan 26, 2022 at 5:41 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 1/26/22 16:54, Rafael J. Wysocki wrote:
>>>>> On Wed, Jan 26, 2022 at 2:47 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> On 1/23/22 10:10, Tong Zhang wrote:
>>>>>>> When acpi=off is passed on the kernel command line, the kernel crashes with:
>>>>>>>
>>>>>>> [ 1.252739] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>>>>>> [ 1.258308] Call Trace:
>>>>>>> [ 1.258490] ? acpi_walk_namespace+0x147/0x147
>>>>>>> [ 1.258770] acpi_get_devices+0xe4/0x137
>>>>>>> [ 1.258921] ? drm_core_init+0xc0/0xc0 [drm]
>>>>>>> [ 1.259108] detect_thinkpad_privacy_screen+0x5e/0xa8 [drm]
>>>>>>> [ 1.259337] drm_privacy_screen_lookup_init+0xe/0xe85 [drm]
>>>>>>>
>>>>>>> The reason is that acpi_walk_namespace() expects the ACPI subsystem to
>>>>>>> be initialized, which is not the case when booting with acpi=off. So
>>>>>>> detect_thinkpad_privacy_screen() should honor acpi=off.
>>>>>>>
>>>>>>> Signed-off-by: Tong Zhang <ztong0001@xxxxxxxxx>
>>>>>>
>>>>>> Thank you for catching this and thank you for your patch. I was about to merge
>>>>>> this, but then I realized that this might not be the best way to fix this.
>>>>>>
>>>>>> A quick grep shows 10 acpi_get_devices() calls outside of drivers/acpi,
>>>>>> and at a first glance about half of those are missing an acpi_disabled
>>>>>> check. IMHO it would be better to simply add an acpi_disabled check to
>>>>>> acpi_get_devices() itself.
>>>>>>
>>>>>> Rafael, do you agree?
>>>>>
>>>>> Yes, I do.
>>>>
>>>> Did you see my follow-up saying that this is not going to work because
>>>> acpi_get_devices() is an ACPICA function?
>>>
>>> No, I didn't, but it is possible to add a wrapper doing the check
>>> around it and convert all of the users.
>>
>> Yes I did think about that. Note that I've gone ahead and pushed
>> the fix which started this to drm-misc-fixes, to resolve the crash
>> for now.
>
> OK
>
>> If we add such a wrapper we can remove a bunch of acpi_disabled checks
>> from various callers.
>>
>>> Alternatively, the ACPICA function can check acpi_gbl_root_node
>>> against NULL, like in the attached (untested) patch.
>>
>> That is probably an even better idea, as that avoids the need
>> for a wrapper altogether. So I believe that that is the best
>> solution.
>
> All right, let me cut an analogous patch for the upstream ACPICA, then.

Great, thank you.

I have added a note to my own TODO list to check when this lands in
Linus' tree, with the goal of then doing a cleanup series removing the
no longer needed acpi_disabled checks in a bunch of places.
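
For illustration, the two options discussed above (a wrapper that checks
acpi_disabled, or a NULL check on the root node inside the walker itself)
can be sketched in userspace C. This is only a sketch under stated
assumptions: every mock_* name is a hypothetical stand-in, not kernel or
ACPICA code, and it does not reproduce the untested patch attached
earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's status codes. */
typedef int mock_status;
#define MOCK_OK           0
#define MOCK_NO_NAMESPACE 1

/* Hypothetical, flattened stand-in for a namespace node. */
struct mock_node {
	const char *hid;
	struct mock_node *next;
};

/* Stand-in for acpi_gbl_root_node: stays NULL if ACPI was never set up. */
static struct mock_node *mock_root_node;

/* Stand-in for acpi_disabled: true models booting with acpi=off. */
static bool mock_acpi_disabled = true;

typedef void (*mock_walk_cb)(struct mock_node *node, void *ctx);

/*
 * Option B: the walker itself refuses to run without a namespace,
 * so no caller has to remember an extra check.
 */
static mock_status mock_get_devices(const char *hid, mock_walk_cb cb, void *ctx)
{
	struct mock_node *node;

	assert(cb != NULL);

	if (!mock_root_node)
		return MOCK_NO_NAMESPACE;

	for (node = mock_root_node; node; node = node->next) {
		/* A NULL hid matches every node in this mock. */
		if (!hid || strcmp(node->hid, hid) == 0)
			cb(node, ctx);
	}
	return MOCK_OK;
}

/*
 * Option A: a wrapper that checks the "disabled" flag first; every
 * caller would have to be converted to use it.
 */
static mock_status mock_get_devices_checked(const char *hid, mock_walk_cb cb,
					    void *ctx)
{
	if (mock_acpi_disabled)
		return MOCK_NO_NAMESPACE;
	return mock_get_devices(hid, cb, ctx);
}

/* Example callback: counts the nodes it is called for. */
static void mock_count_cb(struct mock_node *node, void *ctx)
{
	(void)node;
	++*(int *)ctx;
}
```

With option B a caller gets an error status instead of a NULL pointer
dereference even when it forgets its own check, which is why it avoids
the need to convert all users to a wrapper.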

Regards,

Hans