Re: [PATCH v2] tty: tty_io: remove hung_up_tty_fops

From: Greg Kroah-Hartman
Date: Tue May 30 2023 - 08:52:21 EST


On Tue, May 30, 2023 at 08:57:42PM +0900, Tetsuo Handa wrote:
> On 2023/05/30 19:44, Greg Kroah-Hartman wrote:
> > On Sun, May 14, 2023 at 10:02:26AM +0900, Tetsuo Handa wrote:
> >> If we care about only NULL pointer dereference, implementing missing
> >> callbacks to hung_up_tty_fops is fine. But if we also care about KCSAN
> >> reports, we will need to wrap all filp->f_op usages which are reachable
> >> via tty_fops callbacks using data_race().
> >
> > I'm missing something here. Why would KCSAN report problems if we
> > implement the needed callbacks in hung_up_tty_fops? And what reports
> > would they be?
>
> Unlike atomic operations such as atomic_read()/atomic_set(), plain read/write
> operations are not treated as atomic by KCSAN. KCSAN reports when a value is
> changed concurrently with such a read/write.
>
> In this report, KCSAN detected that __tty_hangup() changed the value of
> filp->f_op from 0xffffffff84e91ed0 to 0xffffffff84e91dc0 at
>
> filp->f_op = &hung_up_tty_fops;
>
> line when __fput() was reading the value of filp->f_op at
>
> if (file->f_op->release)
>
> line.
>
> Even if we implement the needed callbacks in hung_up_tty_fops,
> KCSAN will continue reporting that the value of filp->f_op changes.

That sounds like a bug in KCSAN, let's not add loads of infrastructure
just because we have bad tools.
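
For readers following along, a minimal userspace sketch of the access
pattern under discussion. Everything here is a hypothetical stand-in
(fake_fops, fake_file, fake_hangup, fake_fput are not kernel names), and
relaxed C11 atomics are only a rough analogue of marked accesses such as
READ_ONCE()/WRITE_ONCE(); data_race() itself has no userspace equivalent.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file_operations. */
struct fake_fops {
	int (*release)(void);
};

static int live_release(void) { return 1; }
static int hung_release(void) { return 2; }

/* Both tables provide ->release, so a swap never exposes a NULL callback. */
static const struct fake_fops live_fops = { .release = live_release };
static const struct fake_fops hung_fops = { .release = hung_release };

/* Hypothetical stand-in for struct file. */
struct fake_file {
	const struct fake_fops *_Atomic f_op;
};

/* Writer side, analogous to "filp->f_op = &hung_up_tty_fops;". */
static void fake_hangup(struct fake_file *f)
{
	atomic_store_explicit(&f->f_op, &hung_fops, memory_order_relaxed);
}

/* Reader side, analogous to "if (file->f_op->release)". */
static int fake_fput(struct fake_file *f)
{
	/* Load the pointer once, then use only the local copy. */
	const struct fake_fops *op =
		atomic_load_explicit(&f->f_op, memory_order_relaxed);

	return op->release ? op->release() : 0;
}
```

The reader sees either the old or the new table, never a torn pointer,
and either table is safe to call through. Whether the kernel code should
mark these accesses at all is exactly what is being argued above.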

> > And why would data_race() help here?
>
> data_race() tells KCSAN not to report.
> data_race() is used when the race KCSAN checks is harmless.

Again, document it, and also perhaps, not use KCSAN? :)

> >> @@ -182,7 +182,7 @@ int tty_alloc_file(struct file *file)
> >> {
> >> struct tty_file_private *priv;
> >>
> >> - priv = kmalloc(sizeof(*priv), GFP_KERNEL);
> >> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> >
> > Why is this zeroing out everything now? Just because you added one
> > bool? Why not just set the bool properly instead?
>
> Because this function is not performance critical, I consider zeroing out
> everything, which avoids increasing code size, to be acceptable.

It happens on open() which yes, is not performance critical, but you are
now requiring it where before this was not required. Which isn't always
so obvious, right?
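
To make the two options concrete, a userspace sketch with malloc() and
calloc() standing in for kmalloc() and kzalloc(). The struct and both
helper names are hypothetical, and the real tty_alloc_file() does more
than this.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for struct tty_file_private. */
struct fake_priv {
	void *tty;
	void *file;
	bool hung;
};

/* Option 1: uninitialized allocation, the new field is set explicitly. */
static struct fake_priv *alloc_explicit(void)
{
	struct fake_priv *priv = malloc(sizeof(*priv));

	if (!priv)
		return NULL;
	priv->tty = NULL;
	priv->file = NULL;
	priv->hung = false;	/* the dependency is visible at the call site */
	return priv;
}

/* Option 2: zeroed allocation, ->hung == false is only implied. */
static struct fake_priv *alloc_zeroed(void)
{
	return calloc(1, sizeof(struct fake_priv));
}
```

Both end up with hung == false. The difference is that option 2 silently
depends on the allocator zeroing the memory, which a later reader has to
know in order not to break it.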

> >> @@ -911,6 +903,8 @@ static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
> >> struct tty_struct *tty = file_tty(file);
> >> struct tty_ldisc *ld;
> >>
> >> + if (tty_hung_up_p(file))
> >> + return hung_up_tty_read(iocb, to);
> >
> > What happens if you hang up _right_ after this check? There's no
> > locking here, right? Same everywhere else you have this pattern, you
> > made the race window smaller, but it's still there from what I can see.
>
> We cannot close the race window without introducing locking,
> but we don't need to close the race window.
>
> The race KCSAN found in this report is harmless, as long as callbacks
> reachable via filp->f_op do not disappear.

Which we can fix. So let's fix that and then not worry about these
false-positives with KCSAN as it's obviously wrong. That would make for
a much smaller and simpler and easier-to-maintain-over-time change.

Please do that instead.
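
For reference, the check-then-dispatch pattern from the quoted hunk,
reduced to a userspace sketch. All names are hypothetical, the hung-up
path returning 0 is an assumption of the sketch, and the comment marks
the window being asked about.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file state carrying the hangup flag. */
struct fake_priv {
	atomic_bool hung;
};

/* Stand-in for the hung-up read path: reports end of file. */
static long fake_hung_up_read(void)
{
	return 0;
}

/* Stand-in for the normal read path: pretends to read count bytes. */
static long fake_normal_read(size_t count)
{
	return (long)count;
}

/* Analogous to setting priv->hung from the hangup path. */
static void fake_hangup(struct fake_priv *priv)
{
	atomic_store_explicit(&priv->hung, true, memory_order_relaxed);
}

static long fake_tty_read(struct fake_priv *priv, size_t count)
{
	if (atomic_load_explicit(&priv->hung, memory_order_relaxed))
		return fake_hung_up_read();
	/*
	 * No lock is held: a hangup can set ->hung right here, and this
	 * call still takes the normal path. The window is smaller than
	 * before, not closed.
	 */
	return fake_normal_read(count);
}
```

The function pointer table never changes in this scheme, so no callback
can vanish, but the flag check itself remains racy against a concurrent
hangup.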

> This patch prevents callbacks reachable via filp->f_op from suddenly
> disappearing, by not changing the value of filp->f_op.
>
>
>
> >> @@ -255,6 +255,7 @@ struct tty_file_private {
> >> struct tty_struct *tty;
> >> struct file *file;
> >> struct list_head list;
> >> + bool hung;
> >
> > No hint as to what "hung" means here?
>
> Whether __tty_hangup() was called or not.

How will you know this in 5 years when you see this new field?
Documentation matters.
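
Something along these lines would do, as a sketch only. The forward
declarations and the list_head definition are userspace stand-ins so the
fragment compiles on its own, and the wording for @tty, @file and @list
is illustrative rather than taken from the tree.

```c
#include <stdbool.h>

/* Stand-ins so the fragment is self-contained outside the kernel. */
struct tty_struct;
struct file;
struct list_head {
	struct list_head *next, *prev;
};

/**
 * struct tty_file_private - per-open-file tty bookkeeping
 * @tty: the tty this file refers to
 * @file: back pointer to the owning file
 * @list: entry in the tty's list of open files
 * @hung: true once __tty_hangup() has been called for this file;
 *        callbacks check it instead of relying on a swapped f_op
 */
struct tty_file_private {
	struct tty_struct *tty;
	struct file *file;
	struct list_head list;
	bool hung;
};
```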

thanks,

greg k-h