[PATCH v2 00/10] drm/msm: probe deferral fixes

From: Johan Hovold
Date: Tue Sep 13 2022 - 04:59:04 EST


The MSM DRM driver is currently broken in multiple ways with respect to
probe deferral. Not only does the driver fail to probe again after a
late deferral, but due to a related use-after-free bug the reprobe also
triggers NULL-pointer dereferences.

These bugs are not new but have become critical with the release of
5.19, where probe is deferred if the aux-bus EP panel driver has not yet
been loaded.

The underlying problem is a set of lifetime issues caused by careless
use of device-managed resources.

Specifically, device-managed resources allocated post component bind
must be tied to the lifetime of the aggregate DRM device or they will
not necessarily be released when binding of the aggregate device is
deferred.

The following call chain and pseudo code serve as an illustration of
the problem:

- platform_probe(pdev1)
  - dp_display_probe()
    - component_add()

- platform_probe(pdev2)                            // last component
  - dp_display_probe()                             // d0
    - component_add()
      - try_to_bring_up_aggregate_device()
        - devres_open_group(adev->parent)          // d1

        - msm_drm_bind()
          - msm_drm_init()
            - component_bind_all()
              - for_each_component()
                - component_bind()
                  - devres_open_group(&pdev->dev)  // d2
                  - dp_display_bind()
                    - devm_kzalloc(&pdev->dev)     // a1, OK
                  - devres_close_group(&pdev->dev) // d3

            - dpu_kms_hw_init()
              - for_each_panel()
                - msm_dp_modeset_init()
                  - dp_display_request_irq()
                    - devm_request_irq(&pdev->dev) // a2, BUG
                  - if (pdev == pdev2 && condition)
                    - return -EPROBE_DEFER;

            - if (error)
              - component_unbind_all()
                - for_each_component()
                  - component_unbind()
                    - dp_display_unbind()
                    - devres_release_group(&pdev->dev) // d4, only a1 is freed

        - if (error)
          - devres_release_group(adev->parent)     // d5

The device-managed allocation a2 is buggy as its lifetime is tied to the
component platform device and will not be released when the aggregate
device bind fails (e.g. due to a probe deferral).

When pdev2 is later probed again, the attempt to allocate the IRQ a
second time will fail for pdev1 (which is still bound to its platform
driver).

This series fixes the lifetime issues by tying the lifetime of a2 (and
similar allocations) to the lifetime of the aggregate device so that a2
is released at d5.
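
For readers who want to poke at the lifetime rule outside the kernel,
the following is a minimal user-space sketch of it, not driver code.
Everything prefixed toy_ or TOY_ is hypothetical: toy_dev stands in for
struct device, toy_devm_request_irq() for devm_request_irq(), and
toy_devres_release() for the release done at d5. The model assumes that
a second request for an already claimed line fails with -EBUSY and that
pdev1 stays bound to its platform driver across the deferral.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NR_IRQS		16
#define TOY_MAX_RES		8
#define TOY_EPROBE_DEFER	517	/* kernel-internal errno, redefined here */

/* Toy stand-in for struct device: only tracks the IRQ lines it manages. */
struct toy_dev {
	int irqs[TOY_MAX_RES];
	int nr_irqs;
};

static bool toy_irq_busy[TOY_NR_IRQS];

/* Rough model of devm_request_irq(): claim @irq and record it on @owner. */
static int toy_devm_request_irq(struct toy_dev *owner, int irq)
{
	assert(irq >= 0 && irq < TOY_NR_IRQS);
	if (toy_irq_busy[irq])
		return -EBUSY;
	if (owner->nr_irqs == TOY_MAX_RES)
		return -ENOMEM;
	toy_irq_busy[irq] = true;
	owner->irqs[owner->nr_irqs++] = irq;
	return 0;
}

/* Rough model of a devres release: free every IRQ recorded on @dev. */
static void toy_devres_release(struct toy_dev *dev)
{
	while (dev->nr_irqs > 0)
		toy_irq_busy[dev->irqs[--dev->nr_irqs]] = false;
}

/* One aggregate bind attempt: request the IRQ (a2), then optionally defer. */
static int toy_bind(struct toy_dev *adev, struct toy_dev *owner, bool defer)
{
	int ret = toy_devm_request_irq(owner, 5);

	if (!ret && defer)
		ret = -TOY_EPROBE_DEFER;
	if (ret)
		toy_devres_release(adev);	/* d5: only adev is cleaned up */
	return ret;
}

/* Result of the second bind attempt after the first one was deferred. */
int toy_retry_result(bool tie_to_aggregate)
{
	struct toy_dev adev = { .nr_irqs = 0 };
	struct toy_dev pdev1 = { .nr_irqs = 0 };	/* stays bound throughout */
	struct toy_dev *owner = tie_to_aggregate ? &adev : &pdev1;

	memset(toy_irq_busy, 0, sizeof(toy_irq_busy));
	(void)toy_bind(&adev, owner, true);	/* first attempt is deferred */
	return toy_bind(&adev, owner, false);	/* reprobe */
}
```

In this model toy_retry_result(false) returns -EBUSY because the line
claimed on pdev1 is never handed back, while toy_retry_result(true)
returns 0 because the line is released together with the aggregate
device.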

In some cases, such as for the DP IRQ, the above situation can also be
avoided by moving the allocation in question to the platform driver
probe (d0) or component bind (between d2 and d3). But as doing so is not
a general fix, this can be done later as a cleanup/optimisation.
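
The component-bind alternative relies on the devres group opened at d2
and closed at d3. The sketch below is a simplified user-space model of
that mechanism, assuming a flat list of entries with open/close markers
and no nested groups. All toy_ names are hypothetical and do not
correspond to kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_ENTRIES	16
#define TOY_BIND_GROUP	100	/* hypothetical id of the d2/d3 group */

enum toy_kind { TOY_RES, TOY_GROUP_OPEN, TOY_GROUP_CLOSE };

struct toy_entry {
	enum toy_kind kind;
	int id;
};

/* Toy stand-in for a device's ordered list of managed entries. */
struct toy_dev {
	struct toy_entry e[TOY_MAX_ENTRIES];
	int nr;
};

static void toy_add(struct toy_dev *dev, enum toy_kind kind, int id)
{
	assert(dev->nr < TOY_MAX_ENTRIES);
	dev->e[dev->nr].kind = kind;
	dev->e[dev->nr].id = id;
	dev->nr++;
}

/*
 * Drop everything from the group's open marker up to its close marker,
 * or up to the end of the list if the group was never closed.
 */
static void toy_release_group(struct toy_dev *dev, int id)
{
	int start = -1, end = dev->nr - 1, i;

	for (i = 0; i < dev->nr; i++) {
		if (dev->e[i].id != id)
			continue;
		if (dev->e[i].kind == TOY_GROUP_OPEN)
			start = i;
		else if (dev->e[i].kind == TOY_GROUP_CLOSE)
			end = i;
	}
	if (start < 0)
		return;
	memmove(&dev->e[start], &dev->e[end + 1],
		(dev->nr - end - 1) * sizeof(dev->e[0]));
	dev->nr -= end - start + 1;
}

/* Number of resources still on the device after the unbind at d4. */
int toy_leftover_after_unbind(bool alloc_inside_bind)
{
	struct toy_dev pdev = { .nr = 0 };
	int i, left = 0;

	toy_add(&pdev, TOY_GROUP_OPEN, TOY_BIND_GROUP);		/* d2 */
	toy_add(&pdev, TOY_RES, 1);				/* a1 */
	if (alloc_inside_bind)
		toy_add(&pdev, TOY_RES, 2);	/* a2 moved into bind */
	toy_add(&pdev, TOY_GROUP_CLOSE, TOY_BIND_GROUP);	/* d3 */
	if (!alloc_inside_bind)
		toy_add(&pdev, TOY_RES, 2);	/* a2 after bind, as today */
	toy_release_group(&pdev, TOY_BIND_GROUP);		/* d4 */

	for (i = 0; i < pdev.nr; i++)
		left += pdev.e[i].kind == TOY_RES;
	return left;
}
```

With the allocation made between d2 and d3 nothing is left behind at
d4, whereas an allocation made after d3 survives the unbind, which is
the a2 case from the call chain above.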

Johan

Changes in v2
- use a custom devres action instead of amending the AUX bus interface
  (Doug)
- split sanity check fixes and cleanups per bridge type (Dmitry)
- add another Fixes tag for the missing bridge counter reset (Dmitry)


Johan Hovold (10):
  drm/msm: fix use-after-free on probe deferral
  drm/msm/dp: fix memory corruption with too many bridges
  drm/msm/dsi: fix memory corruption with too many bridges
  drm/msm/hdmi: fix memory corruption with too many bridges
  drm/msm/dp: fix IRQ lifetime
  drm/msm/dp: fix aux-bus EP lifetime
  drm/msm/dp: fix bridge lifetime
  drm/msm/hdmi: fix IRQ lifetime
  drm/msm/dp: drop modeset sanity checks
  drm/msm/dsi: drop modeset sanity checks

 drivers/gpu/drm/msm/dp/dp_display.c | 26 +++++++++++++++++++-------
 drivers/gpu/drm/msm/dp/dp_parser.c  |  6 +++---
 drivers/gpu/drm/msm/dp/dp_parser.h  |  5 +++--
 drivers/gpu/drm/msm/dsi/dsi.c       |  9 +++++----
 drivers/gpu/drm/msm/hdmi/hdmi.c     |  7 ++++++-
 drivers/gpu/drm/msm/msm_drv.c       |  1 +
 6 files changed, 37 insertions(+), 17 deletions(-)

--
2.35.1