Re: [PATCH] usb: gadget: u_serial: Add null pointer checks after RX/TX submission

From: Kuen-Han Tsai
Date: Mon Jul 15 2024 - 11:33:42 EST


Hi Jiri,

Sorry for the late reply, I've finally been able to revisit this issue.

On Thu, Mar 28, 2024 at 5:02 PM Jiri Slaby <jirislaby@xxxxxxxxxx> wrote:
>
> On 08. 03. 24, 12:47, Kuen-Han Tsai wrote:
> > Hi Greg & Jiri,
> >
> > On Sun, Jan 28, 2024 at 9:29 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> On Thu, Jan 18, 2024 at 10:27:54AM +0100, Jiri Slaby wrote:
> >>> On 16. 01. 24, 15:16, Kuen-Han Tsai wrote:
> >>>> Commit ffd603f21423 ("usb: gadget: u_serial: Add null pointer check in
> >>>> gs_start_io") adds null pointer checks to gs_start_io(), but it doesn't
> >>>> fully fix the potential null pointer dereference issue. While
> >>>> gserial_connect() calls gs_start_io() with port_lock held, gs_start_rx()
> >>>> and gs_start_tx() release the lock during endpoint request submission.
> >>>> This creates a window where gs_close() could set port->port_tty to NULL,
> >>>> leading to a dereference when the lock is reacquired.
> >>>>
> >>>> This patch adds a null pointer check for port->port_tty after RX/TX
> >>>> submission, and removes the initial null pointer check in gs_start_io()
> >>>> since the caller must hold port_lock and guarantee non-null values for
> >>>> port_usb and port_tty.
> >>>
> >>> Or you switch to tty_port refcounting and need not fiddling with this at all
> >>> ;).
> >>
> >> I agree, Kuen-Han, why not do that instead?
> >
> > The u_serial driver has already maintained the usage count of a TTY
> > structure for open and close. While the driver tracks the usage count
> > via open/close, it doesn't fully eliminate race conditions. Below are
> > two potential scenarios:
> >
> > Case 1 (Observed):
> > 1. gs_open() sets usage count to 1.
> > 2. gserial_connect(), gs_start_io(), and gs_start_rx() execute in
> > sequence (lock held).
> > 3. Lock released, usb_ep_queue() called.
> > 4. In parallel, gs_close() executes, sees count of 1, clears TTY, releases lock.
> > 5. Original thread resumes in gs_start_rx(), potentially leading to
> > kernel panic on an invalid TTY.
>
> If it used refcounting -- tty_port_tty_get(), how comes?

If gs_start_rx() uses refcounting, the race problem will still persist
because gs_close() currently decides to reset port->port.tty to NULL
only when port->port.count reaches one (i.e. last open), without
checking if the associated TTY instance is still in use. I am
uncertain if you are suggesting that I should modify gs_close() by
considering the refcount of a tty instance? I'm still unsure how to
fix this race problem by using refcounting. I'd greatly appreciate it
if you could provide more detailed guidance on how to resolve this
issue, as I'm not very experienced with TTY drivers.

Also, I noticed a typo in my earlier emails, including the commit
message. It should be port->port.tty instead of port->port_tty.

Regards,
Kuen-Han