Re: [PATCH v3 13/13] coresight: Fix CTI module refcount leak by making it a helper device

From: Suzuki K Poulose
Date: Tue Apr 04 2023 - 10:01:59 EST


On 04/04/2023 14:04, James Clark wrote:


On 04/04/2023 13:55, James Clark wrote:


On 04/04/2023 10:21, Suzuki K Poulose wrote:
On 29/03/2023 12:53, James Clark wrote:
The CTI module has some hard coded refcounting code that has a leak.
For example running perf and then trying to unload it fails:

   perf record -e cs_etm// -a -- ls
   rmmod coresight_cti

   rmmod: ERROR: Module coresight_cti is in use

The coresight core already handles references of devices in use, so by
making CTI a normal helper device, we get working refcounting for free.

Signed-off-by: James Clark <james.clark@xxxxxxx>
---
  drivers/hwtracing/coresight/coresight-core.c  | 99 ++++++-------------
  .../hwtracing/coresight/coresight-cti-core.c  | 52 +++++-----
  .../hwtracing/coresight/coresight-cti-sysfs.c |  4 +-
  drivers/hwtracing/coresight/coresight-cti.h   |  4 +-
  drivers/hwtracing/coresight/coresight-priv.h  |  4 +-
  drivers/hwtracing/coresight/coresight-sysfs.c |  4 +
  include/linux/coresight.h                     | 30 +-----
  7 files changed, 70 insertions(+), 127 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index 65f5bd8516d8..458d91b4e23f 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -254,60 +254,39 @@ void coresight_disclaim_device(struct coresight_device *csdev)
  }
  EXPORT_SYMBOL_GPL(coresight_disclaim_device);
-/* enable or disable an associated CTI device of the supplied CS device */
-static int
-coresight_control_assoc_ectdev(struct coresight_device *csdev, bool enable)
-{
-    int ect_ret = 0;
-    struct coresight_device *ect_csdev = csdev->ect_dev;
-    struct module *mod;
-
-    if (!ect_csdev)
-        return 0;
-    if ((!ect_ops(ect_csdev)->enable) || (!ect_ops(ect_csdev)->disable))
-        return 0;
-
-    mod = ect_csdev->dev.parent->driver->owner;
-    if (enable) {
-        if (try_module_get(mod)) {
-            ect_ret = ect_ops(ect_csdev)->enable(ect_csdev);
-            if (ect_ret) {
-                module_put(mod);
-            } else {
-                get_device(ect_csdev->dev.parent);
-                csdev->ect_enabled = true;
-            }
-        } else
-            ect_ret = -ENODEV;
-    } else {
-        if (csdev->ect_enabled) {
-            ect_ret = ect_ops(ect_csdev)->disable(ect_csdev);
-            put_device(ect_csdev->dev.parent);
-            module_put(mod);
-            csdev->ect_enabled = false;
-        }
-    }
-
-    /* output warning if ECT enable is preventing trace operation */
-    if (ect_ret)
-        dev_info(&csdev->dev, "Associated ECT device (%s) %s failed\n",
-             dev_name(&ect_csdev->dev),
-             enable ? "enable" : "disable");
-    return ect_ret;
-}
-
  /*
- * Set the associated ect / cti device while holding the coresight_mutex
+ * Add a helper as an output device while holding the coresight_mutex
   * to avoid a race with coresight_enable that may try to use this value.
   */
-void coresight_set_assoc_ectdev_mutex(struct coresight_device *csdev,
-                      struct coresight_device *ect_csdev)
+void coresight_add_helper_mutex(struct coresight_device *csdev,
+                struct coresight_device *helper)

Minor nit: it may be a good idea to rename this, in line with the
kernel naming convention:

    coresight_add_helper_unlocked()

Or, if this is the only variant, it is OK to leave it as:
    coresight_add_helper()
with a big fat comment in the function description to indicate
that it takes the mutex, and maybe even add a:

There is already a bit of a comment in the description but I can expand
on it more.

might_sleep() and lockdep_assert_not_held(&coresight_mutex);

in the function.


I'm not sure if lockdep_assert_not_held() would be right because
sometimes it could be held if another device is being created at the
same time? Or something like a session is started at the same time a CTI
device is added.


Oh I see, it's not for any task, it's just for the current one. That
makes sense, so I can add it.

Although it looks like it only warns when lockdep is enabled, but don't
you get a warning anyway if you try to take the lock twice with lockdep
enabled?

That's true, you could ignore the lockdep check.

So I'm not sure why we would add lockdep_assert_not_held() here
and not on all the mutex_lock() calls?

Ah. I double checked this and the coresight_mutex is static and local to
coresight-core.c. So there is no point in talking about locking for
external users. So I would just leave out any suffixes and simply use
the lockdep check implicit from mutex_lock().

Suzuki
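
For readers following the thread, the conclusion above (a file-local lock, no
name suffix, and relying on the check that comes with taking the lock) can be
illustrated with a minimal userspace sketch. This is an analogy only, not the
coresight code: it uses a POSIX error-checking mutex as a rough stand-in for
the kernel mutex with lockdep enabled, and the names demo_device, demo_mutex,
demo_lock_init() and demo_add_helper() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with one helper; not the kernel struct. */
struct demo_device {
	struct demo_device *helper;
};

/* File-local lock, playing the role of the static coresight_mutex. */
static pthread_mutex_t demo_mutex;

/*
 * Set up demo_mutex as an error-checking mutex so that a relock by the
 * owning thread is reported instead of deadlocking.
 * Returns 0 on success or a positive errno value.
 */
static int demo_lock_init(void)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;

	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&demo_mutex, &attr);

	pthread_mutexattr_destroy(&attr);
	return ret;
}

/*
 * Attach a helper to dev. Takes demo_mutex internally, so callers must
 * not hold it. No separate assertion is needed: the lock call itself
 * returns EDEADLK if the calling thread already owns the mutex.
 */
static int demo_add_helper(struct demo_device *dev, struct demo_device *helper)
{
	int ret;

	assert(dev != NULL);

	ret = pthread_mutex_lock(&demo_mutex);
	if (ret)
		return ret;

	dev->helper = helper;

	pthread_mutex_unlock(&demo_mutex);
	return 0;
}
```

Because demo_mutex is static, code outside this file cannot hold it when
calling demo_add_helper(), which is why a suffix describing the locking state
adds little for external callers. Inside the file, a call made with the lock
already held returns EDEADLK and leaves the device unchanged.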