Re: [PATCH drivers/perf: hisi:] drivers/perf: hisi: fix NULL pointer issue when uninstall hns3 pmu driver
From: Yicong Yang
Date: Thu Oct 12 2023 - 08:39:01 EST
Hi Robin,
On 2023/10/12 0:03, Robin Murphy wrote:
> On 11/10/2023 9:37 am, Jijie Shao wrote:
>>
>> on 2023/10/10 17:32, Yicong Yang wrote:
>>> Hi Jijie,
>>>
>>> On 2023/10/9 18:50, Jijie Shao wrote:
>>>> From: Hao Chen <chenhao418@xxxxxxxxxx>
>>>>
>>>> When uninstalling the hns3 pmu driver, cpuhp_state_remove_instance() is
>>>> called and then the callback hns3_pmu_offline_cpu() runs, which may cause
>>>> a NULL pointer call trace when another driver is being installed or
>>>> uninstalled concurrently.
>>>>
>>> Could you share more information about the call trace you hit and how to reproduce it?
>>> I'm not sure why other drivers are involved.
>>>
>>>> In John Garry's opinion, cpuhp_state_remove_instance() is used for shared
>>>> interrupts, and using cpuhp_state_remove_instance_nocalls() is fine for the
>>>> PCIe or HNS3 pmu.
>>>>
>>> I'm a bit confused here. We need to update the CPU in use and migrate the perf
>>> context as well as the interrupt affinity in the cpuhp::teardown() callback, so
>>> it makes sense not to call this on driver detachment. But I cannot figure
>>> out why this is related to the shared interrupt. Could you give more details?
>>>
>> OK, I will send v2 with more details.
>
> This shouldn't have anything to do with concurrency or shared interrupts or anything else. It's simply that we should clearly not attempt to migrate a PMU context (via invoking the hotplug callbacks) *after* the relevant PMU has already been unregistered, since that's liable to lead to some kind of use-after-free, and at best it's just a pointless waste of time anyway - if we've got to the point of unbinding the driver (or failing to probe at all), there should definitely not be any active events or other PMU state that needs updating.
>
Thanks for the clarification. I think this is the root cause
of the problem seen on the hns3 PMU.
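To make the ordering concrete, here is a minimal userspace sketch in C. It is only a model under stated assumptions: struct model_pmu and every model_* function are hypothetical stand-ins invented for illustration, not kernel APIs and not the hns3 pmu driver code. It shows that removing the hotplug instance with callbacks after the PMU is unregistered runs the offline callback against dead state, while the nocalls style does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these types or functions are kernel APIs. */
struct model_pmu {
	bool registered;      /* true between register and unregister */
	int stale_callbacks;  /* offline callbacks that ran after unregister */
};

/* Hypothetical stand-in for a driver's CPU-offline (teardown) callback. */
static void model_offline_cpu(struct model_pmu *pmu)
{
	assert(pmu != NULL);
	if (!pmu->registered) {
		/* A real driver could touch freed PMU state at this point. */
		pmu->stale_callbacks++;
		return;
	}
	/* A live PMU would pick a new CPU and migrate its context here. */
}

/* Models removing a hotplug instance with the teardown callback invoked. */
static void model_remove_instance(struct model_pmu *pmu)
{
	model_offline_cpu(pmu);
}

/* Models removing a hotplug instance without invoking any callback. */
static void model_remove_instance_nocalls(struct model_pmu *pmu)
{
	(void)pmu; /* nothing to migrate: the PMU is already gone */
}

/* Returns how many callbacks ran against an already-unregistered PMU. */
int model_unbind(bool nocalls)
{
	struct model_pmu pmu = { .registered = true, .stale_callbacks = 0 };

	pmu.registered = false; /* models unregistering the PMU first */
	if (nocalls)
		model_remove_instance_nocalls(&pmu);
	else
		model_remove_instance(&pmu);
	return pmu.stale_callbacks;
}
```

In this model, model_unbind(false) counts one stale callback and model_unbind(true) counts none, which is the behaviour we want on driver unbind or probe failure.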
Thanks.