Re: [WIP PATCH 03/15] drm/dp_mst: Introduce new refcounting scheme for mstbs and ports
From: Lyude Paul
Date: Tue Dec 18 2018 - 16:28:08 EST
On Fri, 2018-12-14 at 10:29 +0100, Daniel Vetter wrote:
> On Thu, Dec 13, 2018 at 08:25:32PM -0500, Lyude Paul wrote:
> > The current way of handling refcounting in the DP MST helpers is really
> > confusing and probably just plain wrong because it's been hacked up many
> > times over the years without anyone actually going over the code and
> > seeing if things could be simplified.
> >
> > To the best of my understanding, the current scheme works like this:
> > drm_dp_mst_port and drm_dp_mst_branch both have a single refcount. When
> > this refcount hits 0 for either of the two, they're removed from the
> > topology state, but not immediately freed. Both ports and branch devices
> > will reinitialize their kref once it's hit 0 before actually destroying
> > themselves. The intended purpose behind this is so that we can avoid
> > problems like not being able to free a remote payload that might still
> > be active, due to us having removed all of the port/branch device
> > structures in memory, as per:
> >
> > 91a25e463130 ("drm/dp/mst: deallocate payload on port destruction")
> >
> > Which may have worked, but then it caused use-after-free errors. Being
> > > new to MST at the time, I tried fixing it:
> >
> > 263efde31f97 ("drm/dp/mst: Get validated port ref in
> > drm_dp_update_payload_part1()")
> >
> > But, that was broken: both drm_dp_mst_port and drm_dp_mst_branch structs
> > are validated in almost every DP MST helper function. Simply put, this
> > means we go through the topology and try to see if the given
> > drm_dp_mst_branch or drm_dp_mst_port is still attached to something
> > before trying to use it in order to avoid dereferencing freed memory
> > (something that has happened a LOT in the past with this library).
> > > Because of this it doesn't actually matter whether or not we keep
> > the ports and branches around in memory as that's not enough, because
> > any function that validates the branches and ports passed to it will
> > still reject them anyway since they're no longer in the topology
> > structure. So, use-after-free errors were fixed but payload deallocation
> > was completely broken.
> >
> > Two years later, AMD informed me about this issue and I attempted to
> > come up with a temporary fix, pending a long-overdue cleanup of this
> > library:
> >
> > c54c7374ff44 ("drm/dp_mst: Skip validating ports during destruction, just
> > ref")
> >
> > But then that introduced use-after-free errors, so I quickly reverted
> > it:
> >
> > 9765635b3075 ("Revert "drm/dp_mst: Skip validating ports during
> > destruction, just ref"")
> >
> > And in the process, learned that there is just no simple fix for this:
> > > the design is just broken. Unfortunately, the usage of these helpers is
> > quite broken as well. Some drivers like i915 have been smart enough to
> > avoid accessing any kind of information from MST port structures, but
> > others like nouveau have assumed, understandably so, that
> > drm_dp_mst_port structures are normal and can just be accessed at any
> > time without worrying about use-after-free errors.
> >
> > > After a lot of discussion, Daniel Vetter and I came up with a better
> > idea to replace all of this.
> >
> > > To summarize, since this is documented far more in depth in the
> > documentation this patch introduces, we make it so that drm_dp_mst_port
> > and drm_dp_mst_branch structures have two different classes of
> > refcounts: topology_kref, and malloc_kref. topology_kref corresponds to
> > > the lifetime of the given drm_dp_mst_port or drm_dp_mst_branch in its
> > given topology. Once it hits zero, any associated connectors are removed
> > and the branch or port can no longer be validated. malloc_kref
> > corresponds to the lifetime of the memory allocation for the actual
> > structure, and will always be non-zero so long as the topology_kref is
> > non-zero. This gives us a way to allow callers to hold onto port and
> > branch device structures past their topology lifetime, and dramatically
> > simplifies the lifetimes of both structures. This also finally fixes the
> > port deallocation problem, properly.
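
A minimal userspace sketch of the two-refcount idea described above, for
readers following along. Everything named model_* is made up for
illustration and is not part of the patch; plain ints stand in for krefs,
so this is single-threaded only. It assumes, as the commit message says,
that the topology itself holds one malloc reference for as long as the
topology refcount is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a port; not the real struct drm_dp_mst_port. */
struct model_port {
	int topology_refs;	/* lifetime of the port in the topology */
	int malloc_refs;	/* lifetime of the memory allocation */
	bool in_topology;
};

static int model_ports_freed;	/* counts frees, for illustration only */

static struct model_port *model_port_create(void)
{
	struct model_port *port = calloc(1, sizeof(*port));

	if (!port)
		return NULL;
	/* Both counts start at 1, mirroring the two kref_init() calls. */
	port->topology_refs = 1;
	port->malloc_refs = 1;
	port->in_topology = true;
	return port;
}

static void model_port_get_malloc(struct model_port *port)
{
	port->malloc_refs++;
}

static void model_port_put_malloc(struct model_port *port)
{
	if (--port->malloc_refs == 0) {
		model_ports_freed++;
		free(port);
	}
}

static void model_port_put_topology(struct model_port *port)
{
	if (--port->topology_refs == 0) {
		/* Leave the topology, then drop the topology's malloc ref. */
		port->in_topology = false;
		model_port_put_malloc(port);
	}
}
```

A caller that took its own malloc reference before the last topology
reference went away can still read the port's last known state
afterwards; the memory is only released once that caller drops its
malloc reference as well.
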
> >
> > Additionally: since this now means that we can keep ports and branch
> > devices allocated in memory for however long we need, we no longer need
> > a significant amount of the port validation that we currently do.
> >
> > Additionally, there is one last scenario that this fixes, which couldn't
> > have been fixed properly beforehand:
> >
> > - CPU1 unrefs port from topology (refcount 1->0)
> > > - CPU2 refs port in topology (refcount 0->1)
> >
> > Since we now can guarantee memory safety for ports and branches
> > as-needed, we also can make our main reference counting functions fix
> > this problem by using kref_get_unless_zero() internally so that topology
> > refcounts can only ever reach 0 once.
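
The "can only ever reach 0 once" property can be sketched in userspace
with C11 atomics. This is an illustrative model of the get-unless-zero
idea, not the kernel's kref implementation; struct model_kref and the
model_kref_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct kref. */
struct model_kref {
	atomic_int count;
};

static void model_kref_init(struct model_kref *k)
{
	atomic_init(&k->count, 1);
}

/* Take a reference only if the count is still non-zero. */
static bool model_kref_get_unless_zero(struct model_kref *k)
{
	int old = atomic_load(&k->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&k->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool model_kref_put(struct model_kref *k)
{
	return atomic_fetch_sub(&k->count, 1) == 1;
}
```

Once the count has dropped to 0, every later get-unless-zero attempt
fails, so the CPU2 step in the scenario above can no longer resurrect
the port after CPU1 has released the last reference.
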
> >
> > Signed-off-by: Lyude Paul <lyude@xxxxxxxxxx>
> > Cc: Daniel Vetter <daniel@xxxxxxxx>
> > Cc: David Airlie <airlied@xxxxxxxxxx>
> > Cc: Jerry Zuo <Jerry.Zuo@xxxxxxx>
> > Cc: Harry Wentland <harry.wentland@xxxxxxx>
> > Cc: Juston Li <juston.li@xxxxxxxxx>
> > ---
> > .../gpu/dp-mst/topology-figure-1.dot | 31 ++
> > .../gpu/dp-mst/topology-figure-2.dot | 37 ++
> > .../gpu/dp-mst/topology-figure-3.dot | 40 ++
> > Documentation/gpu/drm-kms-helpers.rst | 125 ++++-
> > drivers/gpu/drm/drm_dp_mst_topology.c | 512 +++++++++++++-----
> > include/drm/drm_dp_mst_helper.h | 19 +-
> > 6 files changed, 637 insertions(+), 127 deletions(-)
> > create mode 100644 Documentation/gpu/dp-mst/topology-figure-1.dot
> > create mode 100644 Documentation/gpu/dp-mst/topology-figure-2.dot
> > create mode 100644 Documentation/gpu/dp-mst/topology-figure-3.dot
>
> Yay, docs, and pretty ones at that! Awesome stuff :-)
>
> > diff --git a/Documentation/gpu/dp-mst/topology-figure-1.dot
> > b/Documentation/gpu/dp-mst/topology-figure-1.dot
> > new file mode 100644
> > index 000000000000..fb83789e0a3e
> > --- /dev/null
> > +++ b/Documentation/gpu/dp-mst/topology-figure-1.dot
> > @@ -0,0 +1,31 @@
> > +digraph T {
> > + /* Topology references */
> > + node [shape=oval];
> > + mstb1 -> {port1, port2};
> > + port1 -> mstb2;
> > + port2 -> mstb3 -> {port3, port4};
> > + port3 -> mstb4;
> > +
> > + /* Malloc references */
> > + edge [style=dashed];
> > + mstb4 -> port3;
> > + {port4, port3} -> mstb3;
> > + mstb3 -> port2;
> > + mstb2 -> port1;
> > + {port1, port2} -> mstb1;
> > +
> > + edge [dir=back];
> > + node [style=filled;shape=box;fillcolor=lightblue];
> > + port1 -> "Payload #1";
> > + port3 -> "Payload #2";
> > +
> > + mstb1 [label="MSTB #1";style=filled;fillcolor=palegreen];
> > + mstb2 [label="MSTB #2";style=filled;fillcolor=palegreen];
> > + mstb3 [label="MSTB #3";style=filled;fillcolor=palegreen];
> > + mstb4 [label="MSTB #4";style=filled;fillcolor=palegreen];
> > +
> > + port1 [label="Port #1"];
> > + port2 [label="Port #2"];
> > + port3 [label="Port #3"];
> > + port4 [label="Port #4"];
> > +}
> > diff --git a/Documentation/gpu/dp-mst/topology-figure-2.dot
> > b/Documentation/gpu/dp-mst/topology-figure-2.dot
> > new file mode 100644
> > index 000000000000..eebce560be40
> > --- /dev/null
> > +++ b/Documentation/gpu/dp-mst/topology-figure-2.dot
> > @@ -0,0 +1,37 @@
> > +digraph T {
> > + /* Topology references */
> > + node [shape=oval];
> > +
> > + mstb1 -> {port1, port2};
> > + port1 -> mstb2;
> > + edge [color=red];
> > + port2 -> mstb3 -> {port3, port4};
> > + port3 -> mstb4;
> > + edge [color=""];
> > +
> > + /* Malloc references */
> > + edge [style=dashed];
> > + port3 -> mstb3;
> > + mstb3 -> port2;
> > + mstb2 -> port1;
> > + {port1, port2} -> mstb1;
> > + edge [color=red];
> > + mstb4 -> port3;
> > + port4 -> mstb3;
> > + edge [color=""];
> > +
> > + edge [dir=back];
> > + node [style=filled;shape=box;fillcolor=lightblue];
> > + port1 -> "Payload #1";
> > + port3 -> "Payload #2";
> > +
> > + mstb1 [label="MSTB #1";style=filled;fillcolor=palegreen];
> > + mstb2 [label="MSTB #2";style=filled;fillcolor=palegreen];
> > + mstb3 [label="MSTB #3";style=filled;fillcolor=palegreen];
> > + mstb4 [label="MSTB #4";style=filled;fillcolor=grey];
> > +
> > + port1 [label="Port #1"];
> > + port2 [label="Port #2"];
> > + port3 [label="Port #3"];
> > + port4 [label="Port #4";style=filled;fillcolor=grey];
> > +}
> > diff --git a/Documentation/gpu/dp-mst/topology-figure-3.dot
> > b/Documentation/gpu/dp-mst/topology-figure-3.dot
> > new file mode 100644
> > index 000000000000..9bf28d87144c
> > --- /dev/null
> > +++ b/Documentation/gpu/dp-mst/topology-figure-3.dot
> > @@ -0,0 +1,40 @@
> > +digraph T {
> > + /* Topology references */
> > + node [shape=oval];
> > +
> > + mstb1 -> {port1, port2};
> > + port1 -> mstb2;
> > + edge [color=grey];
> > + port2 -> mstb3 -> {port3, port4};
> > + port3 -> mstb4;
> > + edge [color=""];
> > +
> > + /* Malloc references */
> > + edge [style=dashed];
> > + port3 -> mstb3 [penwidth=3];
> > + mstb3 -> port2 [penwidth=3];
> > + mstb2 -> port1;
> > + {port1, port2} -> mstb1;
> > + edge [color=grey];
> > + mstb4 -> port3;
> > + port4 -> mstb3;
> > + edge [color=""];
> > +
> > + edge [dir=back];
> > + node [style=filled;shape=box;fillcolor=lightblue];
> > + port1 -> payload1;
> > + port3 -> payload2 [penwidth=3];
> > +
> > + mstb1 [label="MSTB #1";style=filled;fillcolor=palegreen];
> > + mstb2 [label="MSTB #2";style=filled;fillcolor=palegreen];
> > + mstb3 [label="MSTB #3";penwidth=3;style=filled;fillcolor=palegreen];
> > + mstb4 [label="MSTB #4";style=filled;fillcolor=grey];
> > +
> > + port1 [label="Port #1"];
> > + port2 [label="Port #2";penwidth=3];
> > + port3 [label="Port #3";penwidth=3];
> > + port4 [label="Port #4";style=filled;fillcolor=grey];
> > +
> > + payload1 [label="Payload #1"];
> > + payload2 [label="Payload #2";penwidth=3];
> > +}
> > diff --git a/Documentation/gpu/drm-kms-helpers.rst
> > b/Documentation/gpu/drm-kms-helpers.rst
> > index b422eb8edf16..c0f994c2c72f 100644
> > --- a/Documentation/gpu/drm-kms-helpers.rst
> > +++ b/Documentation/gpu/drm-kms-helpers.rst
> > @@ -208,8 +208,11 @@ Display Port Dual Mode Adaptor Helper Functions
> > Reference
> > .. kernel-doc:: drivers/gpu/drm/drm_dp_dual_mode_helper.c
> > :export:
> >
> > -Display Port MST Helper Functions Reference
> > -===========================================
> > +Display Port MST Helpers
> > +========================
> > +
> > +Functions Reference
> > +-------------------
> >
> > .. kernel-doc:: drivers/gpu/drm/drm_dp_mst_topology.c
> > :doc: dp mst helper
> > @@ -220,6 +223,124 @@ Display Port MST Helper Functions Reference
> > .. kernel-doc:: drivers/gpu/drm/drm_dp_mst_topology.c
> > :export:
> >
> > +Branch device and port refcounting
> > +----------------------------------
>
> I generally try to put the long-form explanations before the function
> references. Since usually the references completely drown out everything
> else and make it harder to spot the important overview stuff.
>
>
> > +
> > +Overview
> > +~~~~~~~~
> > +
> > +The refcounting schemes for :c:type:`struct drm_dp_mst_branch` and
> > +:c:type:`struct drm_dp_mst_port` are somewhat unusual. Both ports and
> > branch
> > +devices have two different kinds of refcounts: topology refcounts, and
> > malloc
> > +refcounts.
> > +
> > +Topology refcount overview
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Topology refcounts are not exposed to drivers, and are handled internally
> > by the
> > +DP MST helpers. The helpers use them in order to prevent the in-memory
> > topology
> > +state from being changed in the middle of critical operations like
> > changing the
> > +internal state of payload allocations. This means each branch and port
> > will be
> > > +considered to be connected to the rest of the topology until its
> > topology
> > +refcount reaches zero. Additionally, for ports this means that their
> > associated
> > +:c:type:`struct drm_connector` will stay registered with userspace until
> > the
> > +port's refcount reaches 0.
> > +
> > +
> > +Topology refcount functions
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +The DP MST helpers use the following functions to manage topology
> > refcounts:
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_dp_mst_topology.c
> > + :functions: drm_dp_mst_topology_get_port drm_dp_mst_topology_put_port
> > + drm_dp_mst_topology_ref_port drm_dp_mst_topology_get_mstb
> > + drm_dp_mst_topology_put_mstb drm_dp_mst_topology_ref_mstb
> > +
> > +Malloc refcount overview
> > +~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Malloc references are used to keep a :c:type:`struct drm_dp_mst_port` or
> > +:c:type:`struct drm_dp_mst_branch` allocated even after all of its
> > topology
> > +references have been dropped, so that the driver or MST helpers can
> > safely
> > +access each branch's last known state before it was disconnected from the
> > +topology. When the malloc refcount of a port or branch reaches 0, the
> > memory
> > +allocation containing the :c:type:`struct drm_dp_mst_branch` or
> > :c:type:`struct
> > +drm_dp_mst_port` respectively will be freed.
> > +
> > +Malloc refcounts for ports
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +For :c:type:`struct drm_dp_mst_port`, malloc refcounts are exposed to
> > drivers
> > +through the following functions:
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_dp_mst_topology.c
> > + :functions: drm_dp_mst_get_port_malloc drm_dp_mst_put_port_malloc
> > +
> > +Malloc refcounts for branch devices
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +For :c:type:`struct drm_dp_mst_branch`, malloc refcounts are not
> > currently
> > +exposed to drivers. As of writing this documentation, there are no
> > drivers that
> > +have a usecase for accessing :c:type:`struct drm_dp_mst_branch` outside
> > of the
> > +MST helpers. Exposing this API to drivers in a race-free manner would
> > take more
> > +tweaking of the refcounting scheme, however patches are welcome provided
> > there
> > +is a legitimate driver usecase for this.
> > +
> > +Internally, malloc refcounts for :c:type:`struct drm_dp_mst_branch` are
> > managed
> > +by the DP MST core through the following functions:
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_dp_mst_topology.c
> > + :functions: drm_dp_mst_get_mstb_malloc drm_dp_mst_put_mstb_malloc
> > +
> > +Refcount relationships in a topology
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Let's take a look at why the relationship between topology and malloc
> > refcounts
> > +is designed the way it is.
> > +
> > +.. kernel-figure:: dp-mst/topology-figure-1.dot
> > +
> > + An example of topology and malloc refs in a DP MST topology with two
> > active
> > + payloads. Topology refcount increments are indicated by solid lines,
> > and
> > + malloc refcount increments are indicated by dashed lines. Each starts
> > from
> > + the branch which incremented the refcount, and ends at the branch to
> > which
> > > + the refcount belongs.
> > +
> > +As you can see in figure 1, every branch increments the topology
> > > +refcount of its children, and increments the malloc refcount of its
> > > parent.
> > > +Additionally, every payload increments the malloc refcount of its
> > assigned port
> > +by 1.
> > +
> > +So, what would happen if MSTB #3 from the above figure was unplugged from
> > the
> > +system, but the driver hadn't yet removed payload #2 from port #3? The
> > topology
> > +would start to look like figure 2.
> > +
> > +.. kernel-figure:: dp-mst/topology-figure-2.dot
> > +
> > + Ports and branch devices which have been released from memory are
> > colored
> > + grey, and references which have been removed are colored red.
> > +
> > +Whenever a port or branch device's topology refcount reaches zero, it
> > will
> > +decrement the topology refcounts of all its children, the malloc refcount
> > of its
> > +parent, and finally its own malloc refcount. For MSTB #4 and port #4,
> > this means
> > +they both have been disconnected from the topology and freed from memory.
> > But,
> > +because payload #2 is still holding a reference to port #3, port #3 is
> > removed
> > > +from the topology but its :c:type:`struct drm_dp_mst_port` is still
> > accessible
> > +from memory. This also means port #3 has not yet decremented the malloc
> > refcount
> > > +of MSTB #3, so its :c:type:`struct drm_dp_mst_branch` will also stay
> > allocated
> > +in memory until port #3's malloc refcount reaches 0.
> > +
> > +This relationship is necessary because in order to release payload #2, we
> > +need to be able to figure out the last relative of port #3 that's still
> > +connected to the topology. In this case, we would travel up the topology
> > as
> > +shown in figure 3.
> > +
> > +.. kernel-figure:: dp-mst/topology-figure-3.dot
> > +
> > +And finally, remove payload #2 by communicating with port #2 through
> > sideband
> > +transactions.
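
The upward walk from figure 3 can be sketched as follows. This is a
simplified userspace model, not the helper code from the patch: the
model_port and model_mstb structures and model_last_connected_port() are
hypothetical, and an in_topology flag stands in for "topology refcount is
still non-zero".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_mstb;

/* Hypothetical, stripped-down stand-ins for the real structures. */
struct model_port {
	struct model_mstb *parent;	/* branch device this port hangs off */
	bool in_topology;		/* false once the topology refcount hit 0 */
};

struct model_mstb {
	struct model_port *port_parent;	/* NULL for the root branch device */
	bool in_topology;
};

/*
 * Walk upwards from @port until a port that is still connected is found.
 * The walk only touches memory that the malloc references keep alive.
 */
static struct model_port *model_last_connected_port(struct model_port *port)
{
	while (port && !port->in_topology) {
		struct model_mstb *mstb = port->parent;

		port = mstb ? mstb->port_parent : NULL;
	}
	return port;
}
```

With the layout from the figures, starting at port #3 passes through
MSTB #3 and stops at port #2, which is the port the text above says the
sideband transaction goes through. The walk is only safe because each
step follows a pointer whose target is pinned by a malloc reference.
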
>
> (Blind guess, I haven't looked ahead in the series yet)
>
> I assume that drivers also want to hold a malloc reference from their
> connector, until that connector is destroyed completely (and we hence know
> it released all its vcpi and other stuff and really doesn't need the port
> anymore). Could we integrate that into these neat graphs too? Answering
> the "so how does this integrate into my driver?" question is imo the most
> important part for core api docs.
>
> Another one: Any reason for not putting this right into the code as a DOC:
> section? Ime moving docs as close as possible to the code improves the
> odds it's kept up-to-date. The only overview texts I've left in the .rst
> is the stuff that describes overall concepts (e.g. how all the kms objects
> fit together).
>
> All the sphinx/rst syntax should carry over 1:1, except in kerneldoc you
> also can benefit from the abbreviated reference syntax from kerneldoc.
>
> Anyway, really great stuff.
>
> > +
> > MIPI DSI Helper Functions Reference
> > ===================================
> >
> > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c
> > b/drivers/gpu/drm/drm_dp_mst_topology.c
> > index 2ab16c9e6243..c196fb580beb 100644
> > --- a/drivers/gpu/drm/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c
> > @@ -46,7 +46,7 @@ static bool dump_dp_payload_table(struct
> > drm_dp_mst_topology_mgr *mgr,
> > char *buf);
> > static int test_calc_pbn_mode(void);
> >
> > -static void drm_dp_put_port(struct drm_dp_mst_port *port);
> > +static void drm_dp_mst_topology_put_port(struct drm_dp_mst_port *port);
> >
> > static int drm_dp_dpcd_write_payload(struct drm_dp_mst_topology_mgr *mgr,
> > int id,
> > @@ -850,46 +850,120 @@ static struct drm_dp_mst_branch
> > *drm_dp_add_mst_branch_device(u8 lct, u8 *rad)
> > if (lct > 1)
> > memcpy(mstb->rad, rad, lct / 2);
> > INIT_LIST_HEAD(&mstb->ports);
> > - kref_init(&mstb->kref);
> > + kref_init(&mstb->topology_kref);
> > + kref_init(&mstb->malloc_kref);
> > return mstb;
> > }
> >
> > static void drm_dp_free_mst_port(struct kref *kref);
> > +static void drm_dp_free_mst_branch_device(struct kref *kref);
>
> I'd move the functions around, forward declarations for static functions
> are a bit silly
>
> > +
> > +/**
> > + * drm_dp_mst_get_mstb_malloc() - Increment the malloc refcount of a
> > branch
> > + * device
> > + * @mstb: The &struct drm_dp_mst_branch to increment the malloc refcount
> > of
> > + *
> > + * Increments @mstb.malloc_kref. When @mstb.malloc_kref reaches 0, the
> > memory
>
> s/@/&/ for structure member references. @ references to parameters/members
> in the same kerneldoc type only. With & you'll get a nice link, @ is just
> markup (and yes & with a member unfortunately doesn't link to the member,
> only the overall structure).
>
> Similarly below.
>
> > + * allocation for @mstb will be released and @mstb may no longer be used.
> > + *
> > + * Any malloc references acquired with this function must be released
> > when
> > + * they are no longer being used by calling drm_dp_mst_put_mstb_malloc().
>
> I'd drop "when they are no longer being used", and the line below too.
> Short docs are better generally because attention span of readers.
>
> > + *
> > + * See also: drm_dp_mst_put_mstb_malloc()
> > + */
> > +static void
> > +drm_dp_mst_get_mstb_malloc(struct drm_dp_mst_branch *mstb)
> > +{
> > + kref_get(&mstb->malloc_kref);
> > + DRM_DEBUG("mstb %p (%d)\n", mstb, kref_read(&mstb->malloc_kref));
> > +}
> > +
> > +/**
> > + * drm_dp_mst_put_mstb_malloc() - Decrement the malloc refcount of a
> > branch
> > + * device
> > + * @mstb: The &struct drm_dp_mst_branch to decrement the malloc refcount
> > of
> > + *
> > + * Decrements @mstb.malloc_kref. When @mstb.malloc_kref reaches 0, the
> > memory
> > + * allocation for @mstb will be released and @mstb may no longer be used.
> > + *
> > + * See also: drm_dp_mst_get_mstb_malloc()
> > + */
> > +static void
> > +drm_dp_mst_put_mstb_malloc(struct drm_dp_mst_branch *mstb)
> > +{
> > + DRM_DEBUG("mstb %p (%d)\n", mstb, kref_read(&mstb->malloc_kref)-1);
> > + kref_put(&mstb->malloc_kref, drm_dp_free_mst_branch_device);
> > +}
> > +
> > +/**
> > + * drm_dp_mst_get_port_malloc() - Increment the malloc refcount of an MST
> > port
> > + * @port: The &struct drm_dp_mst_port to increment the malloc refcount of
> > + *
> > + * Increments @port.malloc_kref. When @port.malloc_kref reaches 0, the
> > memory
> > + * allocation for @port will be released and @port may no longer be used.
> > + *
> > + * Because @port could potentially be freed at any time by the DP MST
> > helpers
> > + * if @port.malloc_kref reaches 0, including during a call to this
> > function,
> > > + * drivers that wish to make use of &struct drm_dp_mst_port should
> > ensure
> > + * that they grab at least one main malloc reference to their MST ports
> > in
> > + * &drm_dp_mst_topology_cbs.add_connector. This callback is called before
> > + * there is any chance for @port.malloc_kref to reach 0.
> > + *
> > + * Any malloc references acquired with this function must be released
> > when
> > + * they are no longer being used by calling drm_dp_mst_put_port_malloc().
> > + *
> > + * See also: drm_dp_mst_put_port_malloc()
>
> Same reduction as with mstb_malloc version.
>
> > + */
> > +void
> > +drm_dp_mst_get_port_malloc(struct drm_dp_mst_port *port)
> > +{
> > + kref_get(&port->malloc_kref);
> > + DRM_DEBUG("port %p (%d)\n", port, kref_read(&port->malloc_kref));
> > +}
> > +EXPORT_SYMBOL(drm_dp_mst_get_port_malloc);
> > +
> > +/**
> > + * drm_dp_mst_put_port_malloc() - Decrement the malloc refcount of an MST
> > port
> > + * @port: The &struct drm_dp_mst_port to decrement the malloc refcount of
> > + *
> > + * Decrements @port.malloc_kref. When @port.malloc_kref reaches 0, the
> > memory
> > + * allocation for @port will be released and @port may no longer be used.
> > + *
> > + * See also: drm_dp_mst_get_port_malloc()
> > + */
> > +void
> > +drm_dp_mst_put_port_malloc(struct drm_dp_mst_port *port)
> > +{
> > + DRM_DEBUG("port %p (%d)\n", port, kref_read(&port->malloc_kref)-1);
> > + kref_put(&port->malloc_kref, drm_dp_free_mst_port);
> > +}
> > +EXPORT_SYMBOL(drm_dp_mst_put_port_malloc);
> >
> > static void drm_dp_free_mst_branch_device(struct kref *kref)
> > {
> > - struct drm_dp_mst_branch *mstb = container_of(kref, struct
> > drm_dp_mst_branch, kref);
> > - if (mstb->port_parent) {
> > - if (list_empty(&mstb->port_parent->next))
> > - kref_put(&mstb->port_parent->kref,
> > drm_dp_free_mst_port);
> > - }
> > + struct drm_dp_mst_branch *mstb =
> > + container_of(kref, struct drm_dp_mst_branch, malloc_kref);
> > +
> > + if (mstb->port_parent)
> > + drm_dp_mst_put_port_malloc(mstb->port_parent);
> > +
> > kfree(mstb);
> > }
> >
> > static void drm_dp_destroy_mst_branch_device(struct kref *kref)
> > {
> > - struct drm_dp_mst_branch *mstb = container_of(kref, struct
> > drm_dp_mst_branch, kref);
> > + struct drm_dp_mst_branch *mstb =
> > + container_of(kref, struct drm_dp_mst_branch, topology_kref);
> > + struct drm_dp_mst_topology_mgr *mgr = mstb->mgr;
> > struct drm_dp_mst_port *port, *tmp;
> > bool wake_tx = false;
> >
> > - /*
> > - * init kref again to be used by ports to remove mst branch when it is
> > - * not needed anymore
> > - */
> > - kref_init(kref);
> > -
> > - if (mstb->port_parent && list_empty(&mstb->port_parent->next))
> > - kref_get(&mstb->port_parent->kref);
> > -
> > - /*
> > - * destroy all ports - don't need lock
> > - * as there are no more references to the mst branch
> > - * device at this point.
> > - */
> > + mutex_lock(&mgr->lock);
> > list_for_each_entry_safe(port, tmp, &mstb->ports, next) {
> > list_del(&port->next);
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > }
> > + mutex_unlock(&mgr->lock);
>
> Would be nice to split this out (to highlight the bugfix more), but
> because of the kref_init() hack not really feasible I think :-/
> >
> > /* drop any tx slots msg */
> > mutex_lock(&mstb->mgr->qlock);
> > @@ -908,14 +982,82 @@ static void drm_dp_destroy_mst_branch_device(struct
> > kref *kref)
> > if (wake_tx)
> > wake_up_all(&mstb->mgr->tx_waitq);
> >
> > - kref_put(kref, drm_dp_free_mst_branch_device);
> > + drm_dp_mst_put_mstb_malloc(mstb);
> > }
> >
> > -static void drm_dp_put_mst_branch_device(struct drm_dp_mst_branch *mstb)
> > +/**
> > + * drm_dp_mst_topology_get_mstb() - Increment the topology refcount of a
> > > + * branch device unless it's zero
> > + * @mstb: &struct drm_dp_mst_branch to increment the topology refcount of
> > + *
> > + * Attempts to grab a topology reference to @mstb, if it hasn't yet been
> > > + * removed from the topology (i.e. @mstb.topology_kref has reached 0).
> > + *
> > + * Any topology references acquired with this function must be released
> > when
> > + * they are no longer being used by calling
> > drm_dp_mst_topology_put_mstb().
>
> I'd explain the relationship with malloc_kref a bit here:
>
> - topology ref implies a malloc ref, hence you can call get_mstb_malloc
> with only holding a topology ref (might be better to explain this in the
> get_mstb_malloc kerneldoc, since it also applies to the unconditional
> kref_get below)
> - malloc_ref is enough to call this function, but then it can fail
>
> > + *
> > + * See also:
> > + * drm_dp_mst_topology_ref_mstb()
>
> I'd write out when you should use this one instead:
>
> "If you already have a topology reference you should use other_function()
> instead."
>
> > + * drm_dp_mst_topology_get_mstb()
>
> This is this function itself :-)
>
> > + *
> > + * Returns:
> > + * * 1: A topology reference was grabbed successfully
> > > + * * 0: @mstb is no longer in the topology, no reference was grabbed
> > + */
> > +static int __must_check
> > +drm_dp_mst_topology_get_mstb(struct drm_dp_mst_branch *mstb)
>
> Hm if you both want a kref_get and a kref_get_unless_zero then we need
> better naming. topology_get_mstb should be the unconditional kref_get, the
> conditional kref_get_unless_zero needs some indication that it could fail.
> We need some verb that indicates that instead of "get":
> - "validate" since we've used that one already
> - "lookup" that's used by all the drm_mode_object lookup functions, feels
> a bit misleading
> - "try_get"
>
> > {
> > - kref_put(&mstb->kref, drm_dp_destroy_mst_branch_device);
> > + int ret = kref_get_unless_zero(&mstb->topology_kref);
> > +
> > + if (ret)
> > + DRM_DEBUG("mstb %p (%d)\n", mstb,
> > + kref_read(&mstb->topology_kref));
> > +
> > + return ret;
> > +}
> > +
> > +/**
> > + * drm_dp_mst_topology_ref_mstb() - Increment the topology refcount of a
> > + * branch device
> > + * @mstb: The &struct drm_dp_mst_branch to increment the topology
> > refcount of
> > + *
> > > + * Increments @mstb.topology_kref without checking whether or not
> > it's
> > + * already reached 0. This is only valid to use in scenarios where you
> > are
> > + * already guaranteed to have at least one active topology reference to
> > @mstb.
> > + * Otherwise, drm_dp_mst_topology_get_mstb() should be used.
>
> s/should/must/ (or my English understanding is off, afaiui "should" isn't
> a strict requirement per rfc2119)
>
> > + *
> > + * Any topology references acquired with this function must be released
> > when
> > + * they are no longer being used by calling
> > drm_dp_mst_topology_put_mstb().
> > + *
> > + * See also:
> > + * drm_dp_mst_topology_get_mstb()
> > + * drm_dp_mst_topology_put_mstb()
> > + */
> > +static void
> > +drm_dp_mst_topology_ref_mstb(struct drm_dp_mst_branch *mstb)
> > +{
>
> Should we have a WARN_ON(refcount == 0) here?
>
> > + kref_get(&mstb->topology_kref);
> > + DRM_DEBUG("mstb %p (%d)\n", mstb, kref_read(&mstb->topology_kref));
> > }
> >
> > +/**
> > + * drm_dp_mst_topology_put_mstb() - release a topology reference to a
> > branch
> > + * device
> > + * @mstb: The &struct drm_dp_mst_branch to release the topology reference
> > from
> > + *
> > + * Releases a topology reference from @mstb by decrementing
> > + * @mstb.topology_kref.
> > + *
> > + * See also:
> > + * drm_dp_mst_topology_get_mstb()
> > + * drm_dp_mst_topology_ref_mstb()
> > + */
> > +static void
> > +drm_dp_mst_topology_put_mstb(struct drm_dp_mst_branch *mstb)
> > +{
> > + DRM_DEBUG("mstb %p (%d)\n", mstb, kref_read(&mstb->topology_kref)-1);
> > + kref_put(&mstb->topology_kref, drm_dp_destroy_mst_branch_device);
> > +}
> >
> > static void drm_dp_port_teardown_pdt(struct drm_dp_mst_port *port, int
> > old_pdt)
> > {
> > @@ -930,14 +1072,15 @@ static void drm_dp_port_teardown_pdt(struct
> > drm_dp_mst_port *port, int old_pdt)
> > case DP_PEER_DEVICE_MST_BRANCHING:
> > mstb = port->mstb;
> > port->mstb = NULL;
> > - drm_dp_put_mst_branch_device(mstb);
> > + drm_dp_mst_topology_put_mstb(mstb);
> > break;
> > }
> > }
> >
> > static void drm_dp_destroy_port(struct kref *kref)
> > {
> > - struct drm_dp_mst_port *port = container_of(kref, struct
> > drm_dp_mst_port, kref);
> > + struct drm_dp_mst_port *port =
> > + container_of(kref, struct drm_dp_mst_port, topology_kref);
> > struct drm_dp_mst_topology_mgr *mgr = port->mgr;
> >
> > if (!port->input) {
> > @@ -956,7 +1099,6 @@ static void drm_dp_destroy_port(struct kref *kref)
> > * from an EDID retrieval */
> >
> > mutex_lock(&mgr->destroy_connector_lock);
> > - kref_get(&port->parent->kref);
> > list_add(&port->next, &mgr->destroy_connector_list);
> > mutex_unlock(&mgr->destroy_connector_lock);
> > schedule_work(&mgr->destroy_connector_work);
> > @@ -967,25 +1109,93 @@ static void drm_dp_destroy_port(struct kref *kref)
> > drm_dp_port_teardown_pdt(port, port->pdt);
> > port->pdt = DP_PEER_DEVICE_NONE;
> > }
> > - kfree(port);
> > + drm_dp_mst_put_port_malloc(port);
> > }
> >
> > -static void drm_dp_put_port(struct drm_dp_mst_port *port)
> > +/**
> > + * drm_dp_mst_topology_get_port() - Increment the topology refcount of a
> > > + * port unless it's zero
> > + * @port: &struct drm_dp_mst_port to increment the topology refcount of
> > + *
> > + * Attempts to grab a topology reference to @port, if it hasn't yet been
> > > + * removed from the topology (i.e. @port.topology_kref has reached 0).
> > + *
> > + * Any topology references acquired with this function must be released
> > when
> > + * they are no longer being used by calling
> > drm_dp_mst_topology_put_port().
> > + *
> > + * See also:
> > + * drm_dp_mst_topology_ref_port()
> > + * drm_dp_mst_topology_put_port()
> > + *
> > + * Returns:
> > + * * 1: A topology reference was grabbed successfully
> > + * * 0: @port is no longer in the topology, no reference was grabbed
> > + */
> > +static int __must_check
> > +drm_dp_mst_topology_get_port(struct drm_dp_mst_port *port)
> > {
> > - kref_put(&port->kref, drm_dp_destroy_port);
> > + int ret = kref_get_unless_zero(&port->topology_kref);
> > +
> > + if (ret)
> > + DRM_DEBUG("port %p (%d)\n", port,
> > + kref_read(&port->topology_kref));
> > +
> > + return ret;
> > }
> >
> > -static struct drm_dp_mst_branch
> > *drm_dp_mst_get_validated_mstb_ref_locked(struct drm_dp_mst_branch *mstb,
> > struct drm_dp_mst_branch *to_find)
> > +/**
> > + * drm_dp_mst_topology_ref_port() - Increment the topology refcount of a
> > port
> > + * @port: The &struct drm_dp_mst_port to increment the topology refcount
> > of
> > + *
> > + * Increments @port.topology_refcount without checking whether or not
> > it's
> > + * already reached 0. This is only valid to use in scenarios where you
> > are
> > + * already guaranteed to have at least one active topology reference to
> > @port.
> > + * Otherwise, drm_dp_mst_topology_get_port() should be used.
> > + *
> > + * Any topology references acquired with this function must be released
> > when
> > + * they are no longer being used by calling
> > drm_dp_mst_topology_put_port().
> > + *
> > + * See also:
> > + * drm_dp_mst_topology_get_port()
> > + * drm_dp_mst_topology_put_port()
> > + */
> > +static void drm_dp_mst_topology_ref_port(struct drm_dp_mst_port *port)
> > +{
> > + kref_get(&port->topology_kref);
> > + DRM_DEBUG("port %p (%d)\n", port, kref_read(&port->topology_kref));
> > +}
> > +
> > +/**
> > + * drm_dp_mst_topology_put_port() - release a topology reference to a
> > port
> > + * @port: The &struct drm_dp_mst_port to release the topology reference
> > from
> > + *
> > + * Releases a topology reference from @port by decrementing
> > + * @port.topology_kref.
> > + *
> > + * See also:
> > + * drm_dp_mst_topology_get_port()
> > + * drm_dp_mst_topology_ref_port()
> > + */
> > +static void drm_dp_mst_topology_put_port(struct drm_dp_mst_port *port)
> > +{
> > + DRM_DEBUG("port %p (%d)\n", port, kref_read(&port->topology_kref)-1);
> > + kref_put(&port->topology_kref, drm_dp_destroy_port);
> > +}
> > +
> > +static struct drm_dp_mst_branch *
> > +drm_dp_mst_topology_get_mstb_validated_locked(struct drm_dp_mst_branch
> > *mstb,
> > + struct drm_dp_mst_branch
> > *to_find)
> > {
> > struct drm_dp_mst_port *port;
> > struct drm_dp_mst_branch *rmstb;
> > - if (to_find == mstb) {
> > - kref_get(&mstb->kref);
> > +
> > + if (to_find == mstb)
> > return mstb;
> > - }
> > +
> > list_for_each_entry(port, &mstb->ports, next) {
> > if (port->mstb) {
> > - rmstb = drm_dp_mst_get_validated_mstb_ref_locked(port-
> > >mstb, to_find);
> > + rmstb = drm_dp_mst_topology_get_mstb_validated_locked(
>
> I think a prep patch which just renames the current get_validated/put
> functions to the new names would be really good. Then this patch here with
> the new stuff.
>
>
> > + port->mstb, to_find);
> > if (rmstb)
> > return rmstb;
> > }
> > @@ -993,27 +1203,37 @@ static struct drm_dp_mst_branch
> > *drm_dp_mst_get_validated_mstb_ref_locked(struct
> > return NULL;
> > }
> >
> > -static struct drm_dp_mst_branch *drm_dp_get_validated_mstb_ref(struct
> > drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_branch *mstb)
> > +static struct drm_dp_mst_branch *
> > +drm_dp_mst_topology_get_mstb_validated(struct drm_dp_mst_topology_mgr
> > *mgr,
> > + struct drm_dp_mst_branch *mstb)
> > {
> > struct drm_dp_mst_branch *rmstb = NULL;
> > +
> > mutex_lock(&mgr->lock);
> > - if (mgr->mst_primary)
> > - rmstb = drm_dp_mst_get_validated_mstb_ref_locked(mgr-
> > >mst_primary, mstb);
> > + if (mgr->mst_primary) {
> > + rmstb = drm_dp_mst_topology_get_mstb_validated_locked(
> > + mgr->mst_primary, mstb);
> > +
> > + if (rmstb && !drm_dp_mst_topology_get_mstb(rmstb))
> > + rmstb = NULL;
> > + }
> > mutex_unlock(&mgr->lock);
> > return rmstb;
> > }
> >
> > -static struct drm_dp_mst_port *drm_dp_mst_get_port_ref_locked(struct
> > drm_dp_mst_branch *mstb, struct drm_dp_mst_port *to_find)
> > +static struct drm_dp_mst_port *
> > +drm_dp_mst_topology_get_port_validated_locked(struct drm_dp_mst_branch
> > *mstb,
> > + struct drm_dp_mst_port *to_find)
> > {
> > struct drm_dp_mst_port *port, *mport;
> >
> > list_for_each_entry(port, &mstb->ports, next) {
> > - if (port == to_find) {
> > - kref_get(&port->kref);
> > + if (port == to_find)
> > return port;
> > - }
> > +
> > if (port->mstb) {
> > - mport = drm_dp_mst_get_port_ref_locked(port->mstb,
> > to_find);
> > + mport = drm_dp_mst_topology_get_port_validated_locked(
> > + port->mstb, to_find);
> > if (mport)
> > return mport;
> > }
> > @@ -1021,12 +1241,20 @@ static struct drm_dp_mst_port
> > *drm_dp_mst_get_port_ref_locked(struct drm_dp_mst_
> > return NULL;
> > }
> >
> > -static struct drm_dp_mst_port *drm_dp_get_validated_port_ref(struct
> > drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port)
> > +static struct drm_dp_mst_port *
> > +drm_dp_mst_topology_get_port_validated(struct drm_dp_mst_topology_mgr
> > *mgr,
> > + struct drm_dp_mst_port *port)
> > {
> > struct drm_dp_mst_port *rport = NULL;
> > +
> > mutex_lock(&mgr->lock);
> > - if (mgr->mst_primary)
> > - rport = drm_dp_mst_get_port_ref_locked(mgr->mst_primary,
> > port);
> > + if (mgr->mst_primary) {
> > + rport = drm_dp_mst_topology_get_port_validated_locked(
> > + mgr->mst_primary, port);
> > +
> > + if (rport && !drm_dp_mst_topology_get_port(rport))
> > + rport = NULL;
> > + }
> > mutex_unlock(&mgr->lock);
> > return rport;
> > }
> > @@ -1034,11 +1262,12 @@ static struct drm_dp_mst_port
> > *drm_dp_get_validated_port_ref(struct drm_dp_mst_t
> > static struct drm_dp_mst_port *drm_dp_get_port(struct drm_dp_mst_branch
> > *mstb, u8 port_num)
> > {
> > struct drm_dp_mst_port *port;
> > + int ret;
> >
> > list_for_each_entry(port, &mstb->ports, next) {
> > if (port->port_num == port_num) {
> > - kref_get(&port->kref);
> > - return port;
> > + ret = drm_dp_mst_topology_get_port(port);
> > + return ret ? port : NULL;
> > }
> > }
> >
> > @@ -1087,6 +1316,11 @@ static bool drm_dp_port_setup_pdt(struct
> > drm_dp_mst_port *port)
> > if (port->mstb) {
> > port->mstb->mgr = port->mgr;
> > port->mstb->port_parent = port;
> > + /*
> > + * Make sure this port's memory allocation stays
> > + * around until it's child MSTB releases it
> > + */
> > + drm_dp_mst_get_port_malloc(port);
> >
> > send_link = true;
> > }
> > @@ -1147,17 +1381,26 @@ static void drm_dp_add_port(struct
> > drm_dp_mst_branch *mstb,
> > bool created = false;
> > int old_pdt = 0;
> > int old_ddps = 0;
> > +
> > port = drm_dp_get_port(mstb, port_msg->port_number);
> > if (!port) {
> > port = kzalloc(sizeof(*port), GFP_KERNEL);
> > if (!port)
> > return;
> > - kref_init(&port->kref);
> > + kref_init(&port->topology_kref);
> > + kref_init(&port->malloc_kref);
> > port->parent = mstb;
> > port->port_num = port_msg->port_number;
> > port->mgr = mstb->mgr;
> > port->aux.name = "DPMST";
> > port->aux.dev = dev->dev;
> > +
> > + /*
> > + * Make sure the memory allocation for our parent branch stays
> > + * around until our own memory allocation is released
> > + */
> > + drm_dp_mst_get_mstb_malloc(mstb);
> > +
> > created = true;
> > } else {
> > old_pdt = port->pdt;
> > @@ -1177,7 +1420,7 @@ static void drm_dp_add_port(struct drm_dp_mst_branch
> > *mstb,
> > for this list */
> > if (created) {
> > mutex_lock(&mstb->mgr->lock);
> > - kref_get(&port->kref);
> > + drm_dp_mst_topology_ref_port(port);
> > list_add(&port->next, &mstb->ports);
> > mutex_unlock(&mstb->mgr->lock);
> > }
> > @@ -1202,17 +1445,21 @@ static void drm_dp_add_port(struct
> > drm_dp_mst_branch *mstb,
> > if (created && !port->input) {
> > char proppath[255];
> >
> > - build_mst_prop_path(mstb, port->port_num, proppath,
> > sizeof(proppath));
> > - port->connector = (*mstb->mgr->cbs->add_connector)(mstb->mgr,
> > port, proppath);
> > + build_mst_prop_path(mstb, port->port_num, proppath,
> > + sizeof(proppath));
> > + port->connector = (*mstb->mgr->cbs->add_connector)(mstb->mgr,
> > + port,
> > + proppath);
> > if (!port->connector) {
> > /* remove it from the port list */
> > mutex_lock(&mstb->mgr->lock);
> > list_del(&port->next);
> > mutex_unlock(&mstb->mgr->lock);
> > /* drop port list reference */
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > goto out;
> > }
> > +
> > if ((port->pdt == DP_PEER_DEVICE_DP_LEGACY_CONV ||
> > port->pdt == DP_PEER_DEVICE_SST_SINK) &&
> > port->port_num >= DP_MST_LOGICAL_PORT_0) {
> > @@ -1224,7 +1471,7 @@ static void drm_dp_add_port(struct drm_dp_mst_branch
> > *mstb,
> >
> > out:
> > /* put reference to this port */
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > }
> >
> > static void drm_dp_update_port(struct drm_dp_mst_branch *mstb,
> > @@ -1259,7 +1506,7 @@ static void drm_dp_update_port(struct
> > drm_dp_mst_branch *mstb,
> > dowork = true;
> > }
> >
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > if (dowork)
> > queue_work(system_long_wq, &mstb->mgr->work);
> >
> > @@ -1270,7 +1517,7 @@ static struct drm_dp_mst_branch
> > *drm_dp_get_mst_branch_device(struct drm_dp_mst_
> > {
> > struct drm_dp_mst_branch *mstb;
> > struct drm_dp_mst_port *port;
> > - int i;
> > + int i, ret;
> > /* find the port by iterating down */
> >
> > mutex_lock(&mgr->lock);
> > @@ -1295,7 +1542,9 @@ static struct drm_dp_mst_branch
> > *drm_dp_get_mst_branch_device(struct drm_dp_mst_
> > }
> > }
> > }
> > - kref_get(&mstb->kref);
> > + ret = drm_dp_mst_topology_get_mstb(mstb);
> > + if (!ret)
> > + mstb = NULL;
> > out:
> > mutex_unlock(&mgr->lock);
> > return mstb;
> > @@ -1325,19 +1574,22 @@ static struct drm_dp_mst_branch
> > *get_mst_branch_device_by_guid_helper(
> > return NULL;
> > }
> >
> > -static struct drm_dp_mst_branch *drm_dp_get_mst_branch_device_by_guid(
> > - struct drm_dp_mst_topology_mgr *mgr,
> > - uint8_t *guid)
> > +static struct drm_dp_mst_branch *
> > +drm_dp_get_mst_branch_device_by_guid(struct drm_dp_mst_topology_mgr *mgr,
> > + uint8_t *guid)
> > {
> > struct drm_dp_mst_branch *mstb;
> > + int ret;
> >
> > /* find the port by iterating down */
> > mutex_lock(&mgr->lock);
> >
> > mstb = get_mst_branch_device_by_guid_helper(mgr->mst_primary, guid);
> > -
> > - if (mstb)
> > - kref_get(&mstb->kref);
> > + if (mstb) {
> > + ret = drm_dp_mst_topology_get_mstb(mstb);
> > + if (!ret)
> > + mstb = NULL;
> > + }
> >
> > mutex_unlock(&mgr->lock);
> > return mstb;
> > @@ -1362,10 +1614,10 @@ static void
> > drm_dp_check_and_send_link_address(struct drm_dp_mst_topology_mgr *m
> > drm_dp_send_enum_path_resources(mgr, mstb, port);
> >
> > if (port->mstb) {
> > - mstb_child = drm_dp_get_validated_mstb_ref(mgr, port-
> > >mstb);
> > + mstb_child =
> > drm_dp_mst_topology_get_mstb_validated(mgr, port->mstb);
> > if (mstb_child) {
> > drm_dp_check_and_send_link_address(mgr,
> > mstb_child);
> > - drm_dp_put_mst_branch_device(mstb_child);
> > + drm_dp_mst_topology_put_mstb(mstb_child);
> > }
> > }
> > }
> > @@ -1375,16 +1627,19 @@ static void drm_dp_mst_link_probe_work(struct
> > work_struct *work)
> > {
> > struct drm_dp_mst_topology_mgr *mgr = container_of(work, struct
> > drm_dp_mst_topology_mgr, work);
> > struct drm_dp_mst_branch *mstb;
> > + int ret;
> >
> > mutex_lock(&mgr->lock);
> > mstb = mgr->mst_primary;
> > if (mstb) {
> > - kref_get(&mstb->kref);
> > + ret = drm_dp_mst_topology_get_mstb(mstb);
> > + if (!ret)
> > + mstb = NULL;
> > }
> > mutex_unlock(&mgr->lock);
> > if (mstb) {
> > drm_dp_check_and_send_link_address(mgr, mstb);
> > - drm_dp_put_mst_branch_device(mstb);
> > + drm_dp_mst_topology_put_mstb(mstb);
> > }
> > }
> >
> > @@ -1695,22 +1950,32 @@ static struct drm_dp_mst_port
> > *drm_dp_get_last_connected_port_to_mstb(struct drm
> > return drm_dp_get_last_connected_port_to_mstb(mstb->port_parent-
> > >parent);
> > }
> >
> > -static struct drm_dp_mst_branch
> > *drm_dp_get_last_connected_port_and_mstb(struct drm_dp_mst_topology_mgr
> > *mgr,
> > - struct drm_dp_mst_branch *mstb,
> > - int *port_num)
> > +static struct drm_dp_mst_branch *
> > +drm_dp_get_last_connected_port_and_mstb(struct drm_dp_mst_topology_mgr
> > *mgr,
> > + struct drm_dp_mst_branch *mstb,
> > + int *port_num)
> > {
> > struct drm_dp_mst_branch *rmstb = NULL;
> > struct drm_dp_mst_port *found_port;
> > +
> > mutex_lock(&mgr->lock);
> > - if (mgr->mst_primary) {
> > + if (!mgr->mst_primary)
> > + goto out;
> > +
> > + do {
> > found_port = drm_dp_get_last_connected_port_to_mstb(mstb);
> > + if (!found_port)
> > + break;
> >
> > - if (found_port) {
> > + if (drm_dp_mst_topology_get_mstb(found_port->parent)) {
> > rmstb = found_port->parent;
> > - kref_get(&rmstb->kref);
> > *port_num = found_port->port_num;
> > + } else {
> > + /* Search again, starting from this parent */
> > + mstb = found_port->parent;
> > }
> > - }
> > + } while (!rmstb);
>
> Hm, is this a bugfix of validating the entire chain? Afaiui the new
> topology_get still validates the entire chain, so I'm a bit confused what
> this does here.

JFYI: I'm assuming you meant the old get_validated() functions. I mentioned in
the cover letter for this series that I wasn't sure if we still needed them,
but on closer inspection I think we still do, since they perform the actual
validation of the whole topology chain. drm_dp_mst_topology_get_(port|mstb)()
just increment the topology refcount safely, via kref_get_unless_zero().

The change you're seeing here is because we didn't use kref_get_unless_zero()
before: we'd just go up the topology path above mstb, take a kref on the first
thing we found that we thought was still connected to the topology (I honestly
don't know how/if this ever worked), and return it. Now that we use
kref_get_unless_zero(), we have to deal with the fact that the get could fail,
which would happen if we just retrieved a parent mstb or port that is also
disconnected from the topology. So, the only way to do that is to find what we
think is the last connected mstb, check if it actually is, then restart the
search from that mstb if the get failed and it's not connected to the topology.
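
To make the retry concrete, here's a minimal userspace sketch of the "get, and
if that fails search again from the parent" loop. Everything in it is a
hypothetical stand-in: the toy_* structs and functions are not the real
drm_dp_mst_* definitions, a plain int replaces the kref, and the mgr->lock
handling from the patch is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct toy_port;

struct toy_mstb {
	int topology_refs;            /* stand-in for topology_kref */
	struct toy_port *port_parent; /* NULL for the primary branch */
};

struct toy_port {
	int port_num;
	struct toy_mstb *parent;
};

/* Mimics kref_get_unless_zero(): only succeeds while the count is nonzero. */
static int toy_get_unless_zero(struct toy_mstb *mstb)
{
	if (mstb->topology_refs == 0)
		return 0;
	mstb->topology_refs++;
	return 1;
}

/*
 * Walk up from @mstb until a parent branch accepts a reference. On success
 * the referenced branch is returned and @port_num is set to the port we
 * came up through. Returns NULL if we run out of parents.
 */
static struct toy_mstb *
toy_last_connected(struct toy_mstb *mstb, int *port_num)
{
	while (mstb->port_parent) {
		struct toy_port *found = mstb->port_parent;

		if (toy_get_unless_zero(found->parent)) {
			*port_num = found->port_num;
			return found->parent;
		}

		/* The get failed: search again, starting from this parent */
		mstb = found->parent;
	}

	return NULL;
}
```

The caller owns the reference on the returned branch and has to drop it again,
the same way the real code pairs the get with drm_dp_mst_topology_put_mstb().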
That being said, I've been wondering about figuring out spots like this where
we probably also need to follow that up with "also make sure all of the
parents of this 'connected' topology device are also valid", since it's quite
possible we could run into a scenario like this:

Step 1:

MSTB 1
  |- Port 1
  |- Port 2
  |- Port 3
       |- MSTB 2 ← (just unplugged, top refcount == 0)
            |- Port 4 ← (also unplugged, but top refcount not updated yet)
            |- Port 5 ← (same thing ^)
            |- Port 6 ← (same thing ^)
                 |- MSTB 3
                      |- Port 7
                      |- Port 8
                      |- Port 9
^
drm_dp_get_last_connected_port_to_mstb()
travels up to Port 6, assumes Port 6 is valid because its top refcount
!= 0

Now that I type all of that out though, I think we could also fix that fairly
easily by instead just adding a topology_state mutex, and adding a variable to
denote whether or not a port is actually still part of a topology.
Maybe that also means we should come up with a different name for
topology_refcount, resource_refcount maybe?
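
As a rough illustration of the "variable to denote whether or not a port is
still part of a topology" idea, here's a small userspace sketch. It is only an
assumption about how such a flag might look: toy_node, in_topology and both
helpers are made-up names, ports and branches are folded into one node type,
and the topology_state mutex is only mentioned in comments rather than
implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type covering both ports and branch devices. */
struct toy_node {
	bool in_topology;              /* the proposed "still attached" flag */
	struct toy_node *first_child;
	struct toy_node *next_sibling;
};

/*
 * Mark @node and everything below it as detached. The caller is assumed
 * to hold the (hypothetical) topology_state mutex, so nobody can observe
 * a half-updated subtree.
 */
static void toy_mark_unplugged(struct toy_node *node)
{
	struct toy_node *child;

	node->in_topology = false;
	for (child = node->first_child; child; child = child->next_sibling)
		toy_mark_unplugged(child);
}

/* Also assumed to be called with the topology_state mutex held. */
static bool toy_is_connected(const struct toy_node *node)
{
	return node->in_topology;
}
```

With something like this, the Port 6 case in the diagram would no longer
depend on its refcount having been updated yet: unplugging MSTB 2 clears the
flag for the whole subtree in one step.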
>
> > +out:
> > mutex_unlock(&mgr->lock);
> > return rmstb;
> > }
> > @@ -1726,17 +1991,19 @@ static int drm_dp_payload_send_msg(struct
> > drm_dp_mst_topology_mgr *mgr,
> > u8 sinks[DRM_DP_MAX_SDP_STREAMS];
> > int i;
> >
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (!port)
> > return -EINVAL;
> >
> > port_num = port->port_num;
> > - mstb = drm_dp_get_validated_mstb_ref(mgr, port->parent);
> > + mstb = drm_dp_mst_topology_get_mstb_validated(mgr, port->parent);
> > if (!mstb) {
> > - mstb = drm_dp_get_last_connected_port_and_mstb(mgr, port-
> > >parent, &port_num);
> > + mstb = drm_dp_get_last_connected_port_and_mstb(mgr,
> > + port->parent,
> > + &port_num);
> >
> > if (!mstb) {
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return -EINVAL;
> > }
> > }
> > @@ -1766,8 +2033,8 @@ static int drm_dp_payload_send_msg(struct
> > drm_dp_mst_topology_mgr *mgr,
> > }
> > kfree(txmsg);
> > fail_put:
> > - drm_dp_put_mst_branch_device(mstb);
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_mstb(mstb);
> > + drm_dp_mst_topology_put_port(port);
> > return ret;
> > }
> >
> > @@ -1777,13 +2044,13 @@ int drm_dp_send_power_updown_phy(struct
> > drm_dp_mst_topology_mgr *mgr,
> > struct drm_dp_sideband_msg_tx *txmsg;
> > int len, ret;
> >
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (!port)
> > return -EINVAL;
> >
> > txmsg = kzalloc(sizeof(*txmsg), GFP_KERNEL);
> > if (!txmsg) {
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return -ENOMEM;
> > }
> >
> > @@ -1799,7 +2066,7 @@ int drm_dp_send_power_updown_phy(struct
> > drm_dp_mst_topology_mgr *mgr,
> > ret = 0;
> > }
> > kfree(txmsg);
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> >
> > return ret;
> > }
> > @@ -1888,7 +2155,8 @@ int drm_dp_update_payload_part1(struct
> > drm_dp_mst_topology_mgr *mgr)
> > if (vcpi) {
> > port = container_of(vcpi, struct drm_dp_mst_port,
> > vcpi);
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr,
> > + port);
> > if (!port) {
> > mutex_unlock(&mgr->payload_lock);
> > return -EINVAL;
> > @@ -1925,7 +2193,7 @@ int drm_dp_update_payload_part1(struct
> > drm_dp_mst_topology_mgr *mgr)
> > cur_slots += req_payload.num_slots;
> >
> > if (port)
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > }
> >
> > for (i = 0; i < mgr->max_payloads; i++) {
> > @@ -2024,7 +2292,7 @@ static int drm_dp_send_dpcd_write(struct
> > drm_dp_mst_topology_mgr *mgr,
> > struct drm_dp_sideband_msg_tx *txmsg;
> > struct drm_dp_mst_branch *mstb;
> >
> > - mstb = drm_dp_get_validated_mstb_ref(mgr, port->parent);
> > + mstb = drm_dp_mst_topology_get_mstb_validated(mgr, port->parent);
> > if (!mstb)
> > return -EINVAL;
> >
> > @@ -2048,7 +2316,7 @@ static int drm_dp_send_dpcd_write(struct
> > drm_dp_mst_topology_mgr *mgr,
> > }
> > kfree(txmsg);
> > fail_put:
> > - drm_dp_put_mst_branch_device(mstb);
> > + drm_dp_mst_topology_put_mstb(mstb);
> > return ret;
> > }
> >
> > @@ -2158,7 +2426,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> >
> > /* give this the main reference */
> > mgr->mst_primary = mstb;
> > - kref_get(&mgr->mst_primary->kref);
> > + drm_dp_mst_topology_ref_mstb(mgr->mst_primary);
> >
> > ret = drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL,
> > DP_MST_EN |
> > DP_UP_REQ_EN | DP_UPSTREAM_IS_SRC);
> > @@ -2192,7 +2460,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct
> > drm_dp_mst_topology_mgr *mgr, bool ms
> > out_unlock:
> > mutex_unlock(&mgr->lock);
> > if (mstb)
> > - drm_dp_put_mst_branch_device(mstb);
> > + drm_dp_mst_topology_put_mstb(mstb);
> > return ret;
> >
> > }
> > @@ -2357,7 +2625,7 @@ static int drm_dp_mst_handle_down_rep(struct
> > drm_dp_mst_topology_mgr *mgr)
> > mgr->down_rep_recv.initial_hdr.lct,
> > mgr->down_rep_recv.initial_hdr.rad[0],
> > mgr->down_rep_recv.msg[0]);
> > - drm_dp_put_mst_branch_device(mstb);
> > + drm_dp_mst_topology_put_mstb(mstb);
> > memset(&mgr->down_rep_recv, 0, sizeof(struct
> > drm_dp_sideband_msg_rx));
> > return 0;
> > }
> > @@ -2368,7 +2636,7 @@ static int drm_dp_mst_handle_down_rep(struct
> > drm_dp_mst_topology_mgr *mgr)
> > }
> >
> > memset(&mgr->down_rep_recv, 0, sizeof(struct
> > drm_dp_sideband_msg_rx));
> > - drm_dp_put_mst_branch_device(mstb);
> > + drm_dp_mst_topology_put_mstb(mstb);
> >
> > mutex_lock(&mgr->qlock);
> > txmsg->state = DRM_DP_SIDEBAND_TX_RX;
> > @@ -2441,7 +2709,7 @@ static int drm_dp_mst_handle_up_req(struct
> > drm_dp_mst_topology_mgr *mgr)
> > }
> >
> > if (mstb)
> > - drm_dp_put_mst_branch_device(mstb);
> > + drm_dp_mst_topology_put_mstb(mstb);
> >
> > memset(&mgr->up_req_recv, 0, sizeof(struct
> > drm_dp_sideband_msg_rx));
> > }
> > @@ -2501,7 +2769,7 @@ enum drm_connector_status
> > drm_dp_mst_detect_port(struct drm_connector *connector
> > enum drm_connector_status status = connector_status_disconnected;
> >
> > /* we need to search for the port in the mgr in case its gone */
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (!port)
> > return connector_status_disconnected;
> >
> > @@ -2526,7 +2794,7 @@ enum drm_connector_status
> > drm_dp_mst_detect_port(struct drm_connector *connector
> > break;
> > }
> > out:
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return status;
> > }
> > EXPORT_SYMBOL(drm_dp_mst_detect_port);
> > @@ -2543,11 +2811,11 @@ bool drm_dp_mst_port_has_audio(struct
> > drm_dp_mst_topology_mgr *mgr,
> > {
> > bool ret = false;
> >
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (!port)
> > return ret;
> > ret = port->has_audio;
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return ret;
> > }
> > EXPORT_SYMBOL(drm_dp_mst_port_has_audio);
> > @@ -2567,7 +2835,7 @@ struct edid *drm_dp_mst_get_edid(struct
> > drm_connector *connector, struct drm_dp_
> > struct edid *edid = NULL;
> >
> > /* we need to search for the port in the mgr in case its gone */
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (!port)
> > return NULL;
> >
> > @@ -2578,7 +2846,7 @@ struct edid *drm_dp_mst_get_edid(struct
> > drm_connector *connector, struct drm_dp_
> > drm_connector_set_tile_property(connector);
> > }
> > port->has_audio = drm_detect_monitor_audio(edid);
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return edid;
> > }
> > EXPORT_SYMBOL(drm_dp_mst_get_edid);
> > @@ -2649,7 +2917,7 @@ int drm_dp_atomic_find_vcpi_slots(struct
> > drm_atomic_state *state,
> > if (IS_ERR(topology_state))
> > return PTR_ERR(topology_state);
> >
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (port == NULL)
> > return -EINVAL;
> > req_slots = DIV_ROUND_UP(pbn, mgr->pbn_div);
> > @@ -2657,14 +2925,14 @@ int drm_dp_atomic_find_vcpi_slots(struct
> > drm_atomic_state *state,
> > req_slots, topology_state->avail_slots);
> >
> > if (req_slots > topology_state->avail_slots) {
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return -ENOSPC;
> > }
> >
> > topology_state->avail_slots -= req_slots;
> > DRM_DEBUG_KMS("vcpi slots avail=%d", topology_state->avail_slots);
> >
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return req_slots;
> > }
> > EXPORT_SYMBOL(drm_dp_atomic_find_vcpi_slots);
> > @@ -2715,7 +2983,7 @@ bool drm_dp_mst_allocate_vcpi(struct
> > drm_dp_mst_topology_mgr *mgr,
> > {
> > int ret;
> >
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (!port)
> > return false;
> >
> > @@ -2725,7 +2993,7 @@ bool drm_dp_mst_allocate_vcpi(struct
> > drm_dp_mst_topology_mgr *mgr,
> > if (port->vcpi.vcpi > 0) {
> > DRM_DEBUG_KMS("payload: vcpi %d already allocated for pbn %d -
> > requested pbn %d\n", port->vcpi.vcpi, port->vcpi.pbn, pbn);
> > if (pbn == port->vcpi.pbn) {
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return true;
> > }
> > }
> > @@ -2733,13 +3001,13 @@ bool drm_dp_mst_allocate_vcpi(struct
> > drm_dp_mst_topology_mgr *mgr,
> > ret = drm_dp_init_vcpi(mgr, &port->vcpi, pbn, slots);
> > if (ret) {
> > DRM_DEBUG_KMS("failed to init vcpi slots=%d max=63 ret=%d\n",
> > - DIV_ROUND_UP(pbn, mgr->pbn_div), ret);
> > + DIV_ROUND_UP(pbn, mgr->pbn_div), ret);
> > goto out;
> > }
> > DRM_DEBUG_KMS("initing vcpi for pbn=%d slots=%d\n",
> > - pbn, port->vcpi.num_slots);
> > + pbn, port->vcpi.num_slots);
> >
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return true;
> > out:
> > return false;
> > @@ -2749,12 +3017,12 @@ EXPORT_SYMBOL(drm_dp_mst_allocate_vcpi);
> > int drm_dp_mst_get_vcpi_slots(struct drm_dp_mst_topology_mgr *mgr, struct
> > drm_dp_mst_port *port)
> > {
> > int slots = 0;
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (!port)
> > return slots;
> >
> > slots = port->vcpi.num_slots;
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > return slots;
> > }
> > EXPORT_SYMBOL(drm_dp_mst_get_vcpi_slots);
> > @@ -2768,11 +3036,11 @@ EXPORT_SYMBOL(drm_dp_mst_get_vcpi_slots);
> > */
> > void drm_dp_mst_reset_vcpi_slots(struct drm_dp_mst_topology_mgr *mgr,
> > struct drm_dp_mst_port *port)
> > {
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (!port)
> > return;
> > port->vcpi.num_slots = 0;
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > }
> > EXPORT_SYMBOL(drm_dp_mst_reset_vcpi_slots);
> >
> > @@ -2781,9 +3049,10 @@ EXPORT_SYMBOL(drm_dp_mst_reset_vcpi_slots);
> > * @mgr: manager for this port
> > * @port: unverified port to deallocate vcpi for
> > */
> > -void drm_dp_mst_deallocate_vcpi(struct drm_dp_mst_topology_mgr *mgr,
> > struct drm_dp_mst_port *port)
> > +void drm_dp_mst_deallocate_vcpi(struct drm_dp_mst_topology_mgr *mgr,
> > + struct drm_dp_mst_port *port)
> > {
> > - port = drm_dp_get_validated_port_ref(mgr, port);
> > + port = drm_dp_mst_topology_get_port_validated(mgr, port);
> > if (!port)
> > return;
> >
> > @@ -2792,7 +3061,7 @@ void drm_dp_mst_deallocate_vcpi(struct
> > drm_dp_mst_topology_mgr *mgr, struct drm_
> > port->vcpi.pbn = 0;
> > port->vcpi.aligned_pbn = 0;
> > port->vcpi.vcpi = 0;
> > - drm_dp_put_port(port);
> > + drm_dp_mst_topology_put_port(port);
> > }
> > EXPORT_SYMBOL(drm_dp_mst_deallocate_vcpi);
> >
> > @@ -3078,8 +3347,10 @@ static void drm_dp_tx_work(struct work_struct
> > *work)
> >
> > static void drm_dp_free_mst_port(struct kref *kref)
> > {
> > - struct drm_dp_mst_port *port = container_of(kref, struct
> > drm_dp_mst_port, kref);
> > - kref_put(&port->parent->kref, drm_dp_free_mst_branch_device);
> > + struct drm_dp_mst_port *port =
> > + container_of(kref, struct drm_dp_mst_port, malloc_kref);
> > +
> > + drm_dp_mst_put_mstb_malloc(port->parent);
> > kfree(port);
> > }
> >
> > @@ -3103,7 +3374,6 @@ static void drm_dp_destroy_connector_work(struct
> > work_struct *work)
> > list_del(&port->next);
> > mutex_unlock(&mgr->destroy_connector_lock);
> >
> > - kref_init(&port->kref);
> > INIT_LIST_HEAD(&port->next);
> >
> > mgr->cbs->destroy_connector(mgr, port->connector);
> > @@ -3117,7 +3387,7 @@ static void drm_dp_destroy_connector_work(struct
> > work_struct *work)
> > drm_dp_mst_put_payload_id(mgr, port->vcpi.vcpi);
> > }
> >
> > - kref_put(&port->kref, drm_dp_free_mst_port);
> > + drm_dp_mst_put_port_malloc(port);
> > send_hotplug = true;
> > }
> > if (send_hotplug)
> > @@ -3292,7 +3562,7 @@ static int drm_dp_mst_i2c_xfer(struct i2c_adapter
> > *adapter, struct i2c_msg *msgs
> > struct drm_dp_sideband_msg_tx *txmsg = NULL;
> > int ret;
> >
> > - mstb = drm_dp_get_validated_mstb_ref(mgr, port->parent);
> > + mstb = drm_dp_mst_topology_get_mstb_validated(mgr, port->parent);
> > if (!mstb)
> > return -EREMOTEIO;
> >
> > @@ -3342,7 +3612,7 @@ static int drm_dp_mst_i2c_xfer(struct i2c_adapter
> > *adapter, struct i2c_msg *msgs
> > }
> > out:
> > kfree(txmsg);
> > - drm_dp_put_mst_branch_device(mstb);
> > + drm_dp_mst_topology_put_mstb(mstb);
> > return ret;
> > }
> >
> > diff --git a/include/drm/drm_dp_mst_helper.h
> > b/include/drm/drm_dp_mst_helper.h
> > index 371cc2816477..50643a39765d 100644
> > --- a/include/drm/drm_dp_mst_helper.h
> > +++ b/include/drm/drm_dp_mst_helper.h
> > @@ -44,7 +44,10 @@ struct drm_dp_vcpi {
> >
> > /**
> > * struct drm_dp_mst_port - MST port
> > - * @kref: reference count for this port.
> > + * @topology_kref: refcount for this port's lifetime in the topology,
> > only the
> > + * DP MST helpers should need to touch this
> > + * @malloc_kref: refcount for the memory allocation containing this
> > structure.
> > + * See drm_dp_mst_get_port_malloc() and drm_dp_mst_put_port_malloc().
> > * @port_num: port number
> > * @input: if this port is an input port.
> > * @mcs: message capability status - DP 1.2 spec.
> > @@ -67,7 +70,8 @@ struct drm_dp_vcpi {
> > * in the MST topology.
> > */
> > struct drm_dp_mst_port {
> > - struct kref kref;
> > + struct kref topology_kref;
> > + struct kref malloc_kref;
>
> I'd do inline member kerneldoc here (you can mix&match, so no need to
> rewrite them all) and spend a few words referencing the family of get/put
> functions. Same for mstb below.
>
> >
> > u8 port_num;
> > bool input;
> > @@ -102,7 +106,10 @@ struct drm_dp_mst_port {
> >
> > /**
> > * struct drm_dp_mst_branch - MST branch device.
> > - * @kref: reference count for this port.
> > + * @topology_kref: refcount for this branch device's lifetime in the
> > topology,
> > + * only the DP MST helpers should need to touch this
> > + * @malloc_kref: refcount for the memory allocation containing this
> > structure.
> > + * See drm_dp_mst_get_mstb_malloc() and drm_dp_mst_put_mstb_malloc().
> > * @rad: Relative Address to talk to this branch device.
> > * @lct: Link count total to talk to this branch device.
> > * @num_ports: number of ports on the branch.
> > @@ -121,7 +128,8 @@ struct drm_dp_mst_port {
> > * to downstream port of parent branches.
> > */
> > struct drm_dp_mst_branch {
> > - struct kref kref;
> > + struct kref topology_kref;
> > + struct kref malloc_kref;
> > u8 rad[8];
> > u8 lct;
> > int num_ports;
> > @@ -626,4 +634,7 @@ int drm_dp_atomic_release_vcpi_slots(struct
> > drm_atomic_state *state,
> > int drm_dp_send_power_updown_phy(struct drm_dp_mst_topology_mgr *mgr,
> > struct drm_dp_mst_port *port, bool power_up);
> >
> > +void drm_dp_mst_get_port_malloc(struct drm_dp_mst_port *port);
> > +void drm_dp_mst_put_port_malloc(struct drm_dp_mst_port *port);
> > +
> > #endif
> > --
> > 2.19.2
>
> I really like. Mostly concentrated on looking at the docs. Also still
> need to apply it and build the docs, so I can appreciate the DOT graphs.
> -Daniel
--
Cheers,
Lyude Paul