The MSM DRM is currently broken in multiple ways with respect to probe
deferral. Not only does the driver currently fail to probe again after a
late deferral, but due to a related use-after-free bug this also
triggers NULL-pointer dereferences.
These bugs are not new but have become critical with the release of
5.19 where probe is deferred in case the aux-bus EP panel driver has not
yet been loaded.
The underlying problem is lifetime issues due to careless use of
device-managed resources.
Specifically, device-managed resources allocated post component bind
must be tied to the lifetime of the aggregate DRM device or they will
not necessarily be released when binding of the aggregate device is
deferred.
The following call chain and pseudo code serves as an illustration of
the problem:
- platform_probe(pdev1)
- dp_display_probe()
- component_add()
- platform_probe(pdev2) // last component
- dp_display_probe() // d0
- component_add()
- try_to_bring_up_aggregate_device()
- devres_open_group(adev->parent) // d1
- msm_drm_bind()
- msm_drm_init()
- component_bind_all()
- for_each_component()
- component_bind()
- devres_open_group(&pdev->dev) // d2
- dp_display_bind()
- devm_kzalloc(&pdev->dev) // a1, OK
- devres_close_group(&pdev->dev) // d3
- dpu_kms_hw_init()
- for_each_panel()
- msm_dp_modeset_init()
- dp_display_request_irq()
- devm_request_irq(&pdev->dev) // a2, BUG
- if (pdev == pdev2 && condition)
- return -EPROBE_DEFER;
- if (error)
- component_unbind_all()
- for_each_component()
- component_unbind()
- dp_display_unbind()
- devres_release_group(&pdev->dev) // d4, only a1 is freed
- if (error)
- devres_release_group(adev->parent) // d5
The device-managed allocation a2 is buggy as its lifetime is tied to the
component platform device and will not be released when the aggregate
device bind fails (e.g. due to a probe deferral).
When pdev2 is later probed again, the attempt to allocate the IRQ a
second time will fail for pdev1 (which is still bound to its platform
driver).
This series fixes the lifetime issues by tying the lifetime of a2 (and
similar allocations) to the lifetime of the aggregate device so that a2
is released at d5.
In some cases, such has for the DP IRQ, the above situation can also be
avoided by moving the allocation in question to the platform driver
probe (d0) or component bind (between d2 and d3). But as doing so is not
a general fix, this can be done later as a cleanup/optimisation.
Johan
Johan Hovold (7):
drm/msm: fix use-after-free on probe deferral
drm/msm: fix memory corruption with too many bridges
drm/msm/dp: fix IRQ lifetime
drm/msm/dp: fix aux-bus EP lifetime
drm/msm/dp: fix bridge lifetime
drm/msm/hdmi: fix IRQ lifetime
drm/msm: drop modeset sanity checks
drivers/gpu/drm/bridge/parade-ps8640.c | 2 +-
drivers/gpu/drm/display/drm_dp_aux_bus.c | 5 +++--
drivers/gpu/drm/msm/dp/dp_display.c | 16 +++++++++-------
drivers/gpu/drm/msm/dp/dp_parser.c | 6 +++---
drivers/gpu/drm/msm/dp/dp_parser.h | 5 +++--
drivers/gpu/drm/msm/dsi/dsi.c | 9 +++++----
drivers/gpu/drm/msm/hdmi/hdmi.c | 7 ++++++-
drivers/gpu/drm/msm/msm_drv.c | 1 +
include/drm/display/drm_dp_aux_bus.h | 6 +++---
9 files changed, 34 insertions(+), 23 deletions(-)