[PATCH v2 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

From: Herve Codina
Date: Thu Feb 29 2024 - 03:41:52 EST


In the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()

During step 1, devices are destroyed and devlinks are removed.
During step 2, OF nodes are destroyed, but
__of_changeset_entry_destroy() can raise warnings related to missing
of_node_put() calls:
  ERROR: memory leak, expected refcount 1 instead of 2 ...

Indeed, the devlink removals performed at step 1 do not release the
device (and its attached of_node) directly. The release is done by a
job queued in a workqueue, so it runs asynchronously with respect to
the function calls above.
When the warning is present, of_node_put() is still called, but too
late, from the workqueue job.

To make sure that any ongoing devlink removals have completed before
the of_node destruction, synchronize of_overlay_remove() with the
devlink removals.

Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Herve Codina <herve.codina@xxxxxxxxxxx>
---
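Note (not part of the commit): the race described above can be modelled
in userspace. The sketch below is only an illustration under stated
assumptions. It uses pthreads in place of the kernel workqueue, and
every name in it (struct node, queue_release(), wait_removal(),
refcount_at_destroy()) is hypothetical and does not exist in the
kernel. wait_removal() stands in for device_link_wait_removal(), and
the initial refcount of 2 models one reference held by the changeset
and one held by the device.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for an OF node: just a lock-protected refcount. */
struct node {
	pthread_mutex_t lock;
	int refcount;
};

static void node_put(struct node *n)
{
	pthread_mutex_lock(&n->lock);
	n->refcount--;
	pthread_mutex_unlock(&n->lock);
}

static int node_refcount(struct node *n)
{
	int val;

	pthread_mutex_lock(&n->lock);
	val = n->refcount;
	pthread_mutex_unlock(&n->lock);
	return val;
}

/* Deferred job: models the queued work that releases the device's reference. */
static void *release_job(void *arg)
{
	node_put(arg);
	return NULL;
}

/* Models step 1: queue the release and return without waiting for it. */
static pthread_t queue_release(struct node *n)
{
	pthread_t worker;
	int ret = pthread_create(&worker, NULL, release_job, n);

	assert(ret == 0);
	return worker;
}

/* Models the synchronization point: block until the queued job has run. */
static void wait_removal(pthread_t worker)
{
	pthread_join(worker, NULL);
}

/*
 * Models step 2 and returns the refcount seen at "destroy" time.
 * With synchronize != 0 the result is always 1. Without it, the
 * result is 1 or 2 depending on scheduling, which is the race.
 */
int refcount_at_destroy(int synchronize)
{
	struct node n = { PTHREAD_MUTEX_INITIALIZER, 2 };
	pthread_t worker = queue_release(&n);
	int seen;

	if (synchronize)
		wait_removal(worker);

	seen = node_refcount(&n);

	if (!synchronize)
		wait_removal(worker);	/* reap the thread after sampling */

	return seen;
}
```

Calling refcount_at_destroy(1) always observes a refcount of 1, the
value the destroy path expects. Calling refcount_at_destroy(0) may
observe 2, which corresponds to the "expected refcount 1 instead of 2"
warning.
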
drivers/of/overlay.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
index 2ae7e9d24a64..99659ae9fb28 100644
--- a/drivers/of/overlay.c
+++ b/drivers/of/overlay.c
@@ -853,6 +853,14 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
 {
 	int i;

+	/*
+	 * Wait for any ongoing device link removals before removing some of
+	 * nodes. Drop the global lock while waiting
+	 */
+	mutex_unlock(&of_mutex);
+	device_link_wait_removal();
+	mutex_lock(&of_mutex);
+
 	if (ovcs->cset.entries.next)
 		of_changeset_destroy(&ovcs->cset);

@@ -862,7 +870,6 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
 		ovcs->id = 0;
 	}

-
 	for (i = 0; i < ovcs->count; i++) {
 		of_node_put(ovcs->fragments[i].target);
 		of_node_put(ovcs->fragments[i].overlay);
--
2.43.0