Re: [PATCHv13 2/3] usb: USB Type-C connector class
From: Greg KH
Date: Tue Nov 29 2016 - 11:27:45 EST
On Thu, Nov 24, 2016 at 02:21:43PM +0200, Heikki Krogerus wrote:
> The purpose of USB Type-C connector class is to provide
> unified interface for the user space to get the status and
> basic information about USB Type-C connectors on a system,
> control over data role swapping, and when the port supports
> USB Power Delivery, also control over power role swapping
> and Alternate Modes.
Ok, something is really "odd" about how you are abusing the driver model
here, just by how the release functions are written. So, let me try to
figure it out...
I'm cutting and pasting all over the place, hopefully it makes some
sense:
> --- /dev/null
> +++ b/Documentation/usb/typec.txt
> @@ -0,0 +1,110 @@
> +USB Type-C connector class
> +==========================
> +
> +Introduction
> +------------
> +The typec class is meant for describing the USB Type-C ports in a system to the
> +user space in unified fashion. The class is designed to provide nothing else
> +except the user space interface implementation in hope that it can be utilized
> +on as many platforms as possible.
> +
> +The platforms are expected to register every USB Type-C port they have with the
> +class. In a normal case the registration will be done by a USB Type-C or PD PHY
> +driver, but it may be a driver for firmware interface such as UCSI, driver for
> +USB PD controller or even driver for Thunderbolt3 controller. This document
> +considers the component registering the USB Type-C ports with the class as "port
> +driver".
> +
> +On top of showing the capabilities, the class also offer the user space control
> +over the roles and alternate modes they support when the port driver is capable
> +of supporting those features.
> +
> +The class provides an API for the port drivers described in this document. The
> +attributes are described in Documentation/ABI/testing/sysfs-class-typec.
> +
> +
> +Interface
> +---------
> +Every port will be presented as its own device under /sys/class/typec/. The
> +first port will be named "usbc0", the second "usbc1" and so on.
> +
> +When connected, the partner will be presented also as its own device under
> +/sys/class/typec/. The parent of the partner device will always be the port. The
> +partner attached to port "usbc0" will be named "usbc0-partner". Full path to the
> +device would be /sys/class/typec/usb0/usb0-partner/.
> +
> +The cable and the two plugs on it may also be optionally presented as their own
> +devices under /sys/class/typec/. The cable attached to the port "usbc0" port
> +will be named usbc0-cable and the plug on the SOP Prime end (see USB Power
> +Delivery Specification ch. 2.4) will be named "usbc-plug0" and on the SOP Double
> +Prime end "usbc0-plug1". The parent of a cable will always be the port, and the
> +parent of the cable plugs will always be the cable.
> +
> +If the port, partner or cable plug support Alternate Modes, every Alternate Mode
> +SVID will have their own device describing them. The Alternate Modes will not be
> +attached to the typec class. For the port's "usbc0" partner, the Alternate Modes
> +would have devices presented under /sys/class/typec/usbc0-partner/. Every mode
> +that is supported will have its own group under the Alternate Mode device named
> +"mode<id>". For example /sys/class/typec/usbc0/usbc0.svid:xxxx/mode0/. The
> +requests for entering/exiting the modes happens with the "active" attribute in
> +that group.
> +
> +
> +API
> +---
> +
> +* Registering the ports
> +
> +The port drivers will describe every Type-C port they control with struct
> +typec_capability data structure, and register them with the following API:
> +
> +struct typec_port *typec_register_port(struct device *dev,
> + const struct typec_capability *cap);
'dev' should be 'parent' to make things a bit more obvious. Ok, that
looks good, and the code there seems semi-sane:
> +struct typec_port *typec_register_port(struct device *dev,
> + const struct typec_capability *cap)
> +{
> + struct typec_port *port;
> + int ret;
> + int id;
> +
> + port = kzalloc(sizeof(*port), GFP_KERNEL);
> + if (!port)
> + return ERR_PTR(-ENOMEM);
Minor nit, who really cares? Just return NULL and get on with it :)
> +
> + id = ida_simple_get(&typec_index_ida, 0, 0, GFP_KERNEL);
> + if (id < 0) {
> + kfree(port);
> + return ERR_PTR(id);
> + }
> +
> + port->prefer_role = cap->prefer_role;
> +
> + port->id = id;
> + port->cap = cap;
> + port->dev.type = &typec_port_dev_type;
> + port->dev.class = &typec_class;
> + port->dev.parent = dev;
> + dev_set_name(&port->dev, "usbc%d", id);
> +
> + typec_init_roles(port);
> +
> + ret = device_register(&port->dev);
> + if (ret)
> + goto reg_err;
Hm, you just announced this device to the world, which is fine, but do
you really want to do that? Are all attributes properly set up and
ready to be read? I think this is true, and your release function for
this type of device is great, as it actually frees the memory involved,
so good, nice work.
So let's move on, to what caused me to go "hmmm..."
> +* Notifications
> +
> +When connection happens on a port, the port driver fills struct typec_connection
> +which is passed to the class. The class provides the following API for reporting
> +connection/disconnection:
typec_connection is "odd":
> +/*
> + * struct typec_connection - Details about USB Type-C port connection event
> + * @partner: The attached partner
> + * @cable: The attached cable
> + * @data_role: Initial USB data role (host or device)
> + * @pwr_role: Initial Power role (source or sink)
> + * @vconn_role: Initial VCONN role (source or sink)
> + * @pwr_opmode: The power mode of the connection
> + *
> + * All the relevant details about a connection event. Wrapper that is passed to
> + * typec_connect(). The context is copied when typec_connect() is called and the
> + * structure is not used for anything else.
> + */
> +struct typec_connection {
> + struct typec_partner *partner;
> + struct typec_cable *cable;
> +
> + enum typec_data_role data_role;
> + enum typec_role pwr_role;
> + enum typec_role vconn_role;
> + enum typec_pwr_opmode pwr_opmode;
> +};
> +
You have a partner, and a cable structure pointer. Yet, who created
those structures? I see who registers them, but who "owns" them?
Obviously, not this code, because of this release call for the
"partner":
> +static void typec_partner_release(struct device *dev)
> +{
> + struct typec_port *port = to_typec_port(dev->parent);
> +
> + port->partner = NULL;
> +}
What did you do here? You just ignored the struct device, didn't free
anything, and am just "hoping" someone else will do it.
Ick ick ick.
Then your typec_cable release function:
> +static void typec_plug_release(struct device *dev)
> +{
> + struct typec_plug *plug = to_typec_plug(dev);
> +
> + memset(plug, 0, sizeof(*plug));
> +}
Um, no. NEVER zero out a struct device, you just re-initialized it in a
back-door-way-guaranteed-to-cause-oopses-and-memory-loss. It's worse
than just ignoring it.
So, what is the lifecycle for these devices? Who "owns" them? Why
doesn't the core own them? It is responsible for registering and doing
stuff with them, yet it doesn't create or free them? That's a reciepie
for disaster.
I guess I should have been harsher when I saw your empty release
functions and just assumed you would fix this up properly. This code is
totally wrong and needs to be reworked to correctly manage the lifecycle
of the struct devices being used here. Don't paper over kernel warnings
by providing empty release functions, and then, when I complain about
them, provide code that is even worse!
You are getting lucky that this code seems to work, but it really
doesn't. Go read the section in CodingStyle about reference counting of
objects. The driver model does this for you, but you can't just ignore
reality and go "hopefully someone else will free the memory when we are
done", like you are doing here. Actually it's worse as the device here
is now gone, yet you are lying and saying "hey driver core, someone else
is going to free it up because it 'knows' it's now released, because we
lied about who owns this device, it really wasn't you".
Ok, I just tried to look at the code more, and got even more worried...
Look at this mess:
> +/*
> + * struct typec_plug - USB Type-C Cable Plug
> + * @dev: struct device instance
> + * @index: 1 for the plug connected to DFP and 2 for the plug connected to UFP
> + * @alt_modes: Alternate Modes the cable plug supports (null terminated)
> + *
> + * Represents USB Type-C Cable Plug.
> + */
> +struct typec_plug {
> + struct device dev;
> + int index;
> + struct typec_altmode *alt_modes;
> +};
Ok, a plug is a struct device, nice...
> +
> +/*
> + * struct typec_cable - USB Type-C Cable
> + * @dev: struct device instance
> + * @type: The plug type from USB PD Cable VDO
> + * @usb_pd: Electronically Marked Cable
> + * @active: Is the cable active or passive
> + * @sop_pp_controller: Tells whether both cable plugs are configurable or not
> + * @plug: The two plugs in the cable.
> + *
> + * Represents USB Type-C Cable attached to USB Type-C port. Two plugs are
> + * created if the cable has SOP Double Prime controller as defined in USB PD
> + * specification. Otherwise only one will be created if the cable is active. For
> + * passive cables no plugs are created.
> + */
> +struct typec_cable {
> + struct device dev;
> + enum typec_plug_type type;
> + u32 vdo;
> + unsigned int usb_pd:1;
> + unsigned int active:1;
> + unsigned int sop_pp_controller:1;
> +
> + struct typec_plug plug[2];
WTF???
Think about what this structure now represents. You have 3 different
reference counted objects, all embedded in the same structure. Who
"owns" the lifecycle of it? What happens if plug[1]'s reference count
is grabbed a bunch by someone reading a lot of files for it, and then
the "larger" typec_cable.dev reference count is decremented to 0 because
the core is done with it. oops, boom, ick, splat, and if you are lucky
the device reboots itself, if not, someone just got root and read your
bank account information...
{sigh}
I'm being harsh here because this is really really really badly designed
code. Go back and think about your data structures, what they are
trying to represent, and _WHO_ owns and controls them. The typec core
should be the one that allocates and manages the lifecycle of them, not
some random external entity where you just hope and pray that they got
it right (hint, they can not as they do not know what the core did with
the reference counts.)
Right now you are almost there, the typec core registers and tries to
manage the structures, but it doesn't allocate/free them, and that's the
big problem, because really, with a structure that has 3 different
reference counts, no one can. That's an impossibility.
This needs a lot more work, sorry.
I'm now going to require that you get other internal Intel developers to
sign off on this code before I review it again. You have resources at
your disposal that others do not with your internal mailing lists
containing senior kernel developers. Use it and don't waste the
community's time to do basic code review that they should be doing
instead.
How did we get to a v13 of this patch series without anyone else even
seeing this? That's worrysome as well...
greg k-h