Re: [net,v4] serial: caif: hold tty->link reference in ldisc_open and ser_release
From: Shuangpeng
Date: Wed Mar 04 2026 - 20:23:44 EST
> On Mar 3, 2026, at 20:26, Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> serial: caif: hold tty->link reference in ldisc_open and ser_release
>
> This commit addresses a KASAN slab-use-after-free in pty_write_room()
> triggered when caif_serial's TX path calls tty_write_room(). The fix
> holds an extra reference on tty->link throughout the lifetime of the
> caif_serial line discipline.
>
> Link: https://lore.kernel.org/all/20260228094741.1e248271@xxxxxxxxxx/
> Link: https://gist.github.com/shuangpengbai/c898debad6bdf170a84be7e6b3d8707f
>
> The Fixes tag points to commit 308e7e4d0a84 ("serial: caif: fix
> use-after-free in caif_serial ldisc_close()"), but the regression was
> actually introduced by commit e31d5a05948e ("caif: tty's are kref
> objects so take a reference"). That commit added tty_kref_get(tty) to
> hold a reference on the tty, but failed to also hold a reference on
> tty->link, which is accessed via tty_write_room() in handle_tx().
>
> Commit 308e7e4d0a84 merely exposed this existing regression by changing
> the timing.
>
> Should the Fixes tag be updated to:
>
> Fixes: e31d5a05948e ("caif: tty's are kref objects so take a reference")
>
Thanks for confirming the correct Fixes commit. I will update the Fixes
tag in the next revision.
>> diff --git a/drivers/net/caif/caif_serial.c b/drivers/net/caif/caif_serial.c
>> index b90890030751..1873d8287bb9 100644
>> --- a/drivers/net/caif/caif_serial.c
>> +++ b/drivers/net/caif/caif_serial.c
>> @@ -297,6 +297,7 @@ static void ser_release(struct work_struct *work)
>> dev_close(ser->dev);
>> unregister_netdevice(ser->dev);
>> debugfs_deinit(ser);
>> + tty_kref_put(tty->link);
> ^^^^^^^^^^^^^^^^^^^^
>
> Can this dereference NULL? The tty->link field is only set for PTY
> (pseudo-terminal) pairs. For regular serial ports and other non-PTY
> devices, tty->link is NULL.
>
> The CAIF line discipline can be attached to any TTY device via the
> TIOCSETD ioctl. When ser_release() is called to clean up a ser_device
> that was created for a non-PTY TTY device, tty->link would be NULL.
>
> Should this check for NULL before calling tty_kref_put()?
>
tty_kref_get() / tty_kref_put() are NULL-safe, so calling them with
tty->link == NULL is not a functional issue.
I can add explicit if (tty->link) guards to improve readability if you prefer.
Please let me know if you would like me to include those checks.
This also applies to the other concerns in the patch where tty->link may
be NULL.
>> tty_kref_put(tty);
>> }
>> rtnl_unlock();
>> @@ -331,6 +332,7 @@ static int ldisc_open(struct tty_struct *tty)
>>
>> ser = netdev_priv(dev);
>> ser->tty = tty_kref_get(tty);
>> + tty_kref_get(tty->link);
> ^^^^^^^^^^^^^^^^^^^^^^^^
>
> Can this dereference NULL? When a user attaches the CAIF line discipline
> to a non-PTY TTY device via TIOCSETD, ldisc_open() is called with a tty
> where tty->link is NULL.
>
> While ldisc_open() performs capability checks (CAP_SYS_ADMIN), there is
> no code restricting this to PTY devices only.
>
> Should this check for NULL before calling tty_kref_get()?
>
>> ser->dev = dev;
>> debugfs_init(ser, tty);
>> tty->receive_room = 4096;
>> @@ -339,6 +341,7 @@ static int ldisc_open(struct tty_struct *tty)
>> rtnl_lock();
>> result = register_netdevice(dev);
>> if (result) {
>> + tty_kref_put(tty->link);
> ^^^^^^^^^^^^^^^^^^^^^^^^
>
> Can this dereference NULL in the error path? If register_netdevice()
> fails after the CAIF line discipline was attached to a non-PTY TTY
> device, the error path would attempt to release a reference on
> tty->link when it is NULL.
>
> Should this check for NULL before calling tty_kref_put()?
>
>> tty_kref_put(tty);
>> rtnl_unlock();
>> free_netdev(dev);
>
> Simon Horman raised a concern in v2 about a NULL pointer dereference in
> update_tty_status(). The v4 patch does not use the spinlock approach
> from v2, and does not address this race condition.
>
> After ser_release() sets ser->tty = NULL, a concurrent handle_tx() that
> already passed the NULL check can reach update_tty_status(ser).
>
> Looking at update_tty_status():
>
> drivers/net/caif/caif_serial.c:update_tty_status() {
> inline_decode(ser->common.flowctrl,
> ser->tty->flow.stopped << 5 |
> ser->tty->flow.tco_stopped << 3 |
> ser->tty->ctrl.packet << 2);
> }
>
> It dereferences ser->tty directly (ser->tty->flow.stopped,
> ser->tty->flow.tco_stopped, ser->tty->ctrl.packet).
>
> Can this cause a NULL pointer dereference?
>
> Simon also raised a concern about potential deadlock with spinlock usage
> in v2. The v4 patch removes the spinlock approach entirely.
>
> handle_tx() is called from caif_xmit(), which is the ndo_start_xmit
> handler. This can run in softirq context (NET_TX softirq). ser_release()
> runs in process context (workqueue).
>
> How does the current synchronization between handle_tx() and
> ser_release() prevent races?
We keep an extra reference to tty->link for the lifetime of the CAIF
ser_device, until ser_release() unregisters the netdevice. We believe
this can cover all the usages in handle_tx(), but please feel free to let
us know if there are uncaught issues by us.
> --
> pw-bot: cr