Re: Splicing to/from a tty

From: Linus Torvalds
Date: Tue Jan 19 2021 - 15:29:01 EST


On Tue, Jan 19, 2021 at 3:54 AM Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> This looks sane, but I'm still missing what the goal of this is here.
> It's nice from a "don't make the ldisc do the userspace copy", point of
> view, but what is the next step in order to tie that into splice?

Ok, so here's a series of four patches that make ttys possible sources
and destinations for splice() again.

Well, the first patch is just the pipe one for sendfile() - and it's
the hacky one-liner, not the proper one that Al will hopefully add.

NOTE! I've signed off on these, because I think they are fine for
testing - but they are really meant for testing ONLY.

I'm running a kernel with these in place, so they kind of work. And
yes, I verified that sendfile() now works with a pipe or tty target. I
didn't actually check splicing _from_ a tty, nor did I check that
readv/writev now works properly, but it all LooksGood(tm) to me.

HOWEVER.

The reason these are for testing only is that

(a) my tests are pretty limited, and I'd like the actual people who
reported this to really test them out

(b) the new read iterator model is going to be quite horribly slow
for big pty transfers because the n_tty ldisc isn't doing the cookie
batching

(c) I really really want Al to take a look at that iov_iter_revert()
thing in do_tty_write() (in "[PATCH 2/4] tty: implement write_iter")

Note that I'm more than happy to do (b) as a patch on top of this, but
I'd like (a) and (c) to be clarified before I do that.
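
For anyone wondering what (c) is about, here is a rough userspace sketch,
not kernel code. The toy_* names are made up for illustration and only
mimic the shape of copy_from_iter() and iov_iter_revert(). The point is
that copying a chunk out of the iterator advances it by the full chunk
size, so when the sink takes fewer bytes than that, the iterator has to
be rewound by the difference before the next round:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct iov_iter: a flat buffer plus a cursor. */
struct toy_iter {
	const char *data;
	size_t pos;
	size_t len;
};

/* Mimics copy_from_iter(): copy up to n bytes, advance, return bytes copied. */
static size_t toy_copy_from_iter(void *dst, size_t n, struct toy_iter *it)
{
	size_t left = it->len - it->pos;

	if (n > left)
		n = left;
	memcpy(dst, it->data + it->pos, n);
	it->pos += n;
	return n;
}

/* Mimics iov_iter_revert(): rewind the cursor by n bytes. */
static void toy_revert(struct toy_iter *it, size_t n)
{
	assert(n <= it->pos);
	it->pos -= n;
}

/* Hypothetical sink that accepts at most 3 bytes per call (a short write). */
static size_t toy_sink(char *out, size_t *out_len, const char *buf, size_t n)
{
	if (n > 3)
		n = 3;
	memcpy(out + *out_len, buf, n);
	*out_len += n;
	return n;
}

/* Chunked write loop in the style of do_tty_write(). */
static size_t toy_write_all(struct toy_iter *from, char *out, size_t *out_len)
{
	char chunk[8];
	size_t written = 0;
	size_t count = from->len - from->pos;

	while (count) {
		size_t size = count < sizeof(chunk) ? count : sizeof(chunk);
		size_t ret;

		if (toy_copy_from_iter(chunk, size, from) != size)
			break;
		ret = toy_sink(out, out_len, chunk, size);
		if (!ret)
			break;
		/* Short write: give back the bytes the sink did not take. */
		if (ret != size)
			toy_revert(from, size - ret);
		written += ret;
		count -= ret;
	}
	return written;
}
```

Without the toy_revert() call the loop would silently skip the bytes
the sink did not accept. Whether the real iov_iter_revert() is safe to
use this way for every iterator type is exactly the open question.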

> I ask as I also have reports that sysfs binary files are now failing for
> this same reason, so I need to make the same change for them and it's
> not exactly obvious what to do:

Ok, so that would require those kernfs_fop_{read,write}() functions to
also be converted to read_iter/write_iter.

That doesn't look horrendous: it's not all that dissimilar from the
two patches to do that for tty's ("tty: implement {read,write}_iter").
The seq_file part already has an iter version for reading, and the main
change to kernfs_file_direct_read() and kernfs_fop_write() is to

(a) change the arguments from file/buf/count/ppos to kiocb/iov_iter

(b) change the copy_to/from_user() calls to copy_to/from_iter()

Note that (b) involves changing the semantics of the return value:
"copy_to/from_user()" returns the number of bytes that were *NOT*
copied, while "copy_to/from_iter()" returns the number of bytes that
WERE copied.

So the error case check goes from

if (copy_to/from_user()) **ERROR**

to

if (copy_to/from_iter(n) != n) **ERROR**

but that is fairly straightforward.
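
A small userspace sketch of that return-value flip, assuming made-up
fake_copy_to_user() and fake_copy_to_iter() helpers (they are not the
kernel functions, and the "limit" argument is only there to simulate a
fault partway through the copy):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: returns the number of bytes NOT copied. */
static size_t fake_copy_to_user(char *dst, const char *src, size_t n,
				size_t limit)
{
	size_t done = n < limit ? n : limit;

	memcpy(dst, src, done);
	return n - done;
}

/* Hypothetical stand-in: returns the number of bytes that WERE copied. */
static size_t fake_copy_to_iter(const char *src, size_t n, char *dst,
				size_t limit)
{
	size_t done = n < limit ? n : limit;

	memcpy(dst, src, done);
	return done;
}

/* Old-style check: any nonzero return means something was left behind. */
static int old_style(char *dst, const char *src, size_t n, size_t limit)
{
	if (fake_copy_to_user(dst, src, n, limit))
		return -EFAULT;
	return 0;
}

/* New-style check: anything other than the full count is an error. */
static int new_style(char *dst, const char *src, size_t n, size_t limit)
{
	if (fake_copy_to_iter(src, n, dst, limit) != n)
		return -EFAULT;
	return 0;
}
```

Both checks agree on success and failure; the only thing that changes
is which side of the count the helper reports.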

The two "tty: implement write/read_iter" patches (patches 2 and 4) can
be used as examples. That said, I want to again stress that they
haven't seen all that much testing, and I do want Al to spray his holy
penguin pee on that iov_iter_revert() thing in patch 2.

I'm honestly not that motivated on those sysfs files: the tty layer
was an interesting test-case that I wanted to look at just because the
conversion to kernel pointers was nontrivial for the read side.

But that sysfs binary file case really isn't interesting, and just
more of a "Christoph broke it, I think he should just fix it".
Christoph?

Anyway, anybody willing to test these tty/pipe patches on the loads
that failed? Oliver?
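
If it helps, something along these lines is the kind of minimal
userspace check I have in mind for the pipe case. It is only a sketch:
the function name and the temp-file path are made up, the message has
to fit in the 256-byte read buffer, and it says nothing about the tty
side.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write msg to a temporary file, sendfile() it into a pipe, and read
 * it back. Returns the number of bytes that came out of the pipe, or
 * a negative errno value.
 */
static long sendfile_to_pipe(const char *msg)
{
	char path[] = "/tmp/sendfile-test-XXXXXX";
	char buf[256];
	size_t len = strlen(msg);
	off_t off = 0;
	int pfd[2];
	ssize_t n;
	long ret = 0;
	int fd;

	if (len > sizeof(buf))
		return -E2BIG;

	fd = mkstemp(path);
	if (fd < 0)
		return -errno;
	unlink(path);

	if (write(fd, msg, len) != (ssize_t)len) {
		close(fd);
		return -EIO;
	}
	if (pipe(pfd) < 0) {
		ret = -errno;
		close(fd);
		return ret;
	}

	/* The pipe write end is the sendfile() destination. */
	n = sendfile(pfd[1], fd, &off, len);
	if (n < 0) {
		ret = -errno;
	} else if (n > 0) {
		n = read(pfd[0], buf, sizeof(buf));
		ret = n < 0 ? -errno : n;
	}

	close(pfd[0]);
	close(pfd[1]);
	close(fd);
	return ret;
}
```

On a kernel where the pipe destination works, sendfile_to_pipe("hello")
should return 5; on a kernel that has 36e2c7421f02 but not the pipe
fix, I'd expect a negative errno instead.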

Linus
From 95713b6e8b2247c55dd0a04174a55ea9a7fde7f6 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Tue, 19 Jan 2021 09:26:15 -0800
Subject: [PATCH 1/4] pipe: allow sendfile() destination with splice_write

Note that Al Viro is 100% right that this isn't needed for regular
splicing (that treats pipes specially, since pipes _are_ the splice
buffers).

So the correct thing to do is to teach do_splice_direct() the same "hey,
it's already a pipe", and fix sendfile() with a pipe destination that way.

But this is the one-liner "make it work" thing, rather than the "do it
properly" thing that Al will hopefully do.

Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops")
Reported-by: Johannes Berg <johannes@xxxxxxxxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---
fs/pipe.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/fs/pipe.c b/fs/pipe.c
index c5989cfd564d..39c96845a72f 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -1206,6 +1206,7 @@ const struct file_operations pipefifo_fops = {
.unlocked_ioctl = pipe_ioctl,
.release = pipe_release,
.fasync = pipe_fasync,
+ .splice_write = iter_file_splice_write,
};

/*
--
2.29.2.157.g1d47791a39

From 0dce8c5ef15f0aa7b4525721b86a20b7c4df8ca0 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Tue, 19 Jan 2021 11:41:16 -0800
Subject: [PATCH 2/4] tty: implement write_iter

This makes the tty layer use the .write_iter() function instead of the
traditional .write() functionality.

That allows writev(), but more importantly also makes it possible to
enable .splice_write() for ttys, reinstating the "splice to tty"
functionality that was lost in commit 36e2c7421f02 ("fs: don't allow
splice read/write without explicit ops").

Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops")
Reported-by: Oliver Giles <ohw.giles@xxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---
drivers/tty/tty_io.c | 48 ++++++++++++++++++++++++--------------------
1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 8034489337d7..502862626b2b 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -143,9 +143,8 @@ LIST_HEAD(tty_drivers); /* linked list of tty drivers */
DEFINE_MUTEX(tty_mutex);

static ssize_t tty_read(struct file *, char __user *, size_t, loff_t *);
-static ssize_t tty_write(struct file *, const char __user *, size_t, loff_t *);
-ssize_t redirected_tty_write(struct file *, const char __user *,
- size_t, loff_t *);
+static ssize_t tty_write(struct kiocb *, struct iov_iter *);
+ssize_t redirected_tty_write(struct kiocb *, struct iov_iter *);
static __poll_t tty_poll(struct file *, poll_table *);
static int tty_open(struct inode *, struct file *);
long tty_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
@@ -478,7 +477,8 @@ static void tty_show_fdinfo(struct seq_file *m, struct file *file)
static const struct file_operations tty_fops = {
.llseek = no_llseek,
.read = tty_read,
- .write = tty_write,
+ .write_iter = tty_write,
+ .splice_write = iter_file_splice_write,
.poll = tty_poll,
.unlocked_ioctl = tty_ioctl,
.compat_ioctl = tty_compat_ioctl,
@@ -491,7 +491,8 @@ static const struct file_operations tty_fops = {
static const struct file_operations console_fops = {
.llseek = no_llseek,
.read = tty_read,
- .write = redirected_tty_write,
+ .write_iter = redirected_tty_write,
+ .splice_write = iter_file_splice_write,
.poll = tty_poll,
.unlocked_ioctl = tty_ioctl,
.compat_ioctl = tty_compat_ioctl,
@@ -606,9 +607,9 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session)
/* This breaks for file handles being sent over AF_UNIX sockets ? */
list_for_each_entry(priv, &tty->tty_files, list) {
filp = priv->file;
- if (filp->f_op->write == redirected_tty_write)
+ if (filp->f_op->write_iter == redirected_tty_write)
cons_filp = filp;
- if (filp->f_op->write != tty_write)
+ if (filp->f_op->write_iter != tty_write)
continue;
closecount++;
__tty_fasync(-1, filp, 0); /* can't block */
@@ -901,9 +902,9 @@ static inline ssize_t do_tty_write(
ssize_t (*write)(struct tty_struct *, struct file *, const unsigned char *, size_t),
struct tty_struct *tty,
struct file *file,
- const char __user *buf,
- size_t count)
+ struct iov_iter *from)
{
+ size_t count = iov_iter_count(from);
ssize_t ret, written = 0;
unsigned int chunk;

@@ -955,14 +956,20 @@ static inline ssize_t do_tty_write(
size_t size = count;
if (size > chunk)
size = chunk;
+
ret = -EFAULT;
- if (copy_from_user(tty->write_buf, buf, size))
+ if (copy_from_iter(tty->write_buf, size, from) != size)
break;
+
ret = write(tty, file, tty->write_buf, size);
if (ret <= 0)
break;
+
+ /* FIXME! Have Al check this! */
+ if (ret != size)
+ iov_iter_revert(from, size-ret);
+
written += ret;
- buf += ret;
count -= ret;
if (!count)
break;
@@ -1022,9 +1029,9 @@ void tty_write_message(struct tty_struct *tty, char *msg)
* write method will not be invoked in parallel for each device.
*/

-static ssize_t tty_write(struct file *file, const char __user *buf,
- size_t count, loff_t *ppos)
+static ssize_t tty_write(struct kiocb *iocb, struct iov_iter *from)
{
+ struct file *file = iocb->ki_filp;
struct tty_struct *tty = file_tty(file);
struct tty_ldisc *ld;
ssize_t ret;
@@ -1037,18 +1044,15 @@ static ssize_t tty_write(struct file *file, const char __user *buf,
if (tty->ops->write_room == NULL)
tty_err(tty, "missing write_room method\n");
ld = tty_ldisc_ref_wait(tty);
- if (!ld)
- return hung_up_tty_write(file, buf, count, ppos);
- if (!ld->ops->write)
+ if (!ld || !ld->ops->write)
ret = -EIO;
else
- ret = do_tty_write(ld->ops->write, tty, file, buf, count);
+ ret = do_tty_write(ld->ops->write, tty, file, from);
tty_ldisc_deref(ld);
return ret;
}

-ssize_t redirected_tty_write(struct file *file, const char __user *buf,
- size_t count, loff_t *ppos)
+ssize_t redirected_tty_write(struct kiocb *iocb, struct iov_iter *iter)
{
struct file *p = NULL;

@@ -1059,11 +1063,11 @@ ssize_t redirected_tty_write(struct file *file, const char __user *buf,

if (p) {
ssize_t res;
- res = vfs_write(p, buf, count, &p->f_pos);
+ res = vfs_iocb_iter_write(p, iocb, iter);
fput(p);
return res;
}
- return tty_write(file, buf, count, ppos);
+ return tty_write(iocb, iter);
}

/*
@@ -2295,7 +2299,7 @@ static int tioccons(struct file *file)
{
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
- if (file->f_op->write == redirected_tty_write) {
+ if (file->f_op->write_iter == redirected_tty_write) {
struct file *f;
spin_lock(&redirect_lock);
f = redirect;
--
2.29.2.157.g1d47791a39

From 8b7bacf932d1090ea87fd9ad218715055d3eb66e Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Mon, 18 Jan 2021 13:31:30 -0800
Subject: [PATCH 3/4] tty: convert tty_ldisc_ops 'read()' function to take a
kernel pointer

The tty line discipline .read() function was passed the final user
pointer destination as an argument, which doesn't match the 'write()'
function, and makes it very inconvenient to do a splice method for
tty's.

This is a conversion to use a kernel buffer instead.

NOTE! It does this by passing the tty line discipline ->read() function
an additional "cookie" to fill in, and an offset into the cookie data.

The line discipline can fill in the cookie data with its own private
information, and then the reader will repeat the read until either the
cookie is cleared or it runs out of data.

The only real user of this is N_HDLC, which can use this to handle big
packets, even if the kernel buffer is smaller than the whole packet.
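
A userspace sketch of that cookie protocol, for illustration only. The
toy_* names and the 4-byte bounce buffer are made up, and error paths
such as -EOVERFLOW are left out. The reader keeps calling ->read() with
a growing offset until the ldisc clears the cookie:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packet held by a toy line discipline. */
struct toy_packet {
	const char *data;
	size_t count;
	int released;	/* set once the ldisc is done with the packet */
};

static struct toy_packet *toy_pending;	/* next packet waiting to be read */

/* Toy ->read(): kernel buffer, size, cookie to fill in, offset into it. */
static long toy_ldisc_read(unsigned char *kbuf, size_t nr,
			   void **cookie, unsigned long offset)
{
	struct toy_packet *pkt = *cookie;
	size_t n = 0;

	/* First call: pick up a packet and remember it in the cookie. */
	if (!pkt) {
		pkt = toy_pending;
		toy_pending = NULL;
		if (!pkt)
			return 0;
		*cookie = pkt;
	}

	if (offset < pkt->count && nr) {
		n = pkt->count - offset;
		if (n > nr)
			n = nr;
		memcpy(kbuf, pkt->data + offset, n);
		/* More data left: keep the packet in the cookie. */
		if (offset + n < pkt->count)
			return (long)n;
	}

	/* Packet used up (or no room left): clear the cookie and release. */
	*cookie = NULL;
	pkt->released = 1;
	return (long)n;
}

/* Toy reader: repeat ->read() through a small bounce buffer. */
static size_t toy_iterate_read(char *dst, size_t count)
{
	unsigned char kernel_buf[4];
	void *cookie = NULL;
	unsigned long offset = 0;

	do {
		size_t size = count > sizeof(kernel_buf) ?
			      sizeof(kernel_buf) : count;
		long got = toy_ldisc_read(kernel_buf, size, &cookie, offset);

		if (got <= 0)
			break;
		memcpy(dst + offset, kernel_buf, (size_t)got);
		offset += (unsigned long)got;
		count -= (size_t)got;
	} while (cookie);

	return offset;
}
```

With a 10-byte packet and a 4-byte bounce buffer this takes three
->read() calls (4, 4 and 2 bytes), and the packet is released on the
last one.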

Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---
drivers/bluetooth/hci_ldisc.c | 34 +++++++--------
drivers/input/serio/serport.c | 4 +-
drivers/net/ppp/ppp_async.c | 3 +-
drivers/net/ppp/ppp_synctty.c | 3 +-
drivers/tty/n_gsm.c | 3 +-
drivers/tty/n_hdlc.c | 60 +++++++++++++++++--------
drivers/tty/n_null.c | 3 +-
drivers/tty/n_r3964.c | 10 ++---
drivers/tty/n_tracerouter.c | 4 +-
drivers/tty/n_tracesink.c | 4 +-
drivers/tty/n_tty.c | 82 +++++++++++++++--------------------
drivers/tty/tty_io.c | 64 +++++++++++++++++++++++++--
include/linux/tty_ldisc.h | 3 +-
net/nfc/nci/uart.c | 3 +-
14 files changed, 178 insertions(+), 102 deletions(-)

diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c
index f83d67eafc9f..dd92aea15b8b 100644
--- a/drivers/bluetooth/hci_ldisc.c
+++ b/drivers/bluetooth/hci_ldisc.c
@@ -802,7 +802,8 @@ static int hci_uart_tty_ioctl(struct tty_struct *tty, struct file *file,
* We don't provide read/write/poll interface for user space.
*/
static ssize_t hci_uart_tty_read(struct tty_struct *tty, struct file *file,
- unsigned char __user *buf, size_t nr)
+ unsigned char *buf, size_t nr,
+ void **cookie, unsigned long offset)
{
return 0;
}
@@ -819,29 +820,28 @@ static __poll_t hci_uart_tty_poll(struct tty_struct *tty,
return 0;
}

+static struct tty_ldisc_ops hci_uart_ldisc = {
+ .owner = THIS_MODULE,
+ .magic = TTY_LDISC_MAGIC,
+ .name = "n_hci",
+ .open = hci_uart_tty_open,
+ .close = hci_uart_tty_close,
+ .read = hci_uart_tty_read,
+ .write = hci_uart_tty_write,
+ .ioctl = hci_uart_tty_ioctl,
+ .compat_ioctl = hci_uart_tty_ioctl,
+ .poll = hci_uart_tty_poll,
+ .receive_buf = hci_uart_tty_receive,
+ .write_wakeup = hci_uart_tty_wakeup,
+};
+
static int __init hci_uart_init(void)
{
- static struct tty_ldisc_ops hci_uart_ldisc;
int err;

BT_INFO("HCI UART driver ver %s", VERSION);

/* Register the tty discipline */
-
- memset(&hci_uart_ldisc, 0, sizeof(hci_uart_ldisc));
- hci_uart_ldisc.magic = TTY_LDISC_MAGIC;
- hci_uart_ldisc.name = "n_hci";
- hci_uart_ldisc.open = hci_uart_tty_open;
- hci_uart_ldisc.close = hci_uart_tty_close;
- hci_uart_ldisc.read = hci_uart_tty_read;
- hci_uart_ldisc.write = hci_uart_tty_write;
- hci_uart_ldisc.ioctl = hci_uart_tty_ioctl;
- hci_uart_ldisc.compat_ioctl = hci_uart_tty_ioctl;
- hci_uart_ldisc.poll = hci_uart_tty_poll;
- hci_uart_ldisc.receive_buf = hci_uart_tty_receive;
- hci_uart_ldisc.write_wakeup = hci_uart_tty_wakeup;
- hci_uart_ldisc.owner = THIS_MODULE;
-
err = tty_register_ldisc(N_HCI, &hci_uart_ldisc);
if (err) {
BT_ERR("HCI line discipline registration failed. (%d)", err);
diff --git a/drivers/input/serio/serport.c b/drivers/input/serio/serport.c
index 8ac970a423de..33e9d9bfd036 100644
--- a/drivers/input/serio/serport.c
+++ b/drivers/input/serio/serport.c
@@ -156,7 +156,9 @@ static void serport_ldisc_receive(struct tty_struct *tty, const unsigned char *c
* returning 0 characters.
*/

-static ssize_t serport_ldisc_read(struct tty_struct * tty, struct file * file, unsigned char __user * buf, size_t nr)
+static ssize_t serport_ldisc_read(struct tty_struct * tty, struct file * file,
+ unsigned char *kbuf, size_t nr,
+ void **cookie, unsigned long offset)
{
struct serport *serport = (struct serport*) tty->disc_data;
struct serio *serio;
diff --git a/drivers/net/ppp/ppp_async.c b/drivers/net/ppp/ppp_async.c
index 29a0917a81e6..f14a9d190de9 100644
--- a/drivers/net/ppp/ppp_async.c
+++ b/drivers/net/ppp/ppp_async.c
@@ -259,7 +259,8 @@ static int ppp_asynctty_hangup(struct tty_struct *tty)
*/
static ssize_t
ppp_asynctty_read(struct tty_struct *tty, struct file *file,
- unsigned char __user *buf, size_t count)
+ unsigned char *buf, size_t count,
+ void **cookie, unsigned long offset)
{
return -EAGAIN;
}
diff --git a/drivers/net/ppp/ppp_synctty.c b/drivers/net/ppp/ppp_synctty.c
index 0f338752c38b..f774b7e52da4 100644
--- a/drivers/net/ppp/ppp_synctty.c
+++ b/drivers/net/ppp/ppp_synctty.c
@@ -257,7 +257,8 @@ static int ppp_sync_hangup(struct tty_struct *tty)
*/
static ssize_t
ppp_sync_read(struct tty_struct *tty, struct file *file,
- unsigned char __user *buf, size_t count)
+ unsigned char *buf, size_t count,
+ void **cookie, unsigned long offset)
{
return -EAGAIN;
}
diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index c676fa89ee0b..51dafc06f541 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -2559,7 +2559,8 @@ static void gsmld_write_wakeup(struct tty_struct *tty)
*/

static ssize_t gsmld_read(struct tty_struct *tty, struct file *file,
- unsigned char __user *buf, size_t nr)
+ unsigned char *buf, size_t nr,
+ void **cookie, unsigned long offset)
{
return -EOPNOTSUPP;
}
diff --git a/drivers/tty/n_hdlc.c b/drivers/tty/n_hdlc.c
index 12557ee1edb6..1363e659dc1d 100644
--- a/drivers/tty/n_hdlc.c
+++ b/drivers/tty/n_hdlc.c
@@ -416,13 +416,19 @@ static void n_hdlc_tty_receive(struct tty_struct *tty, const __u8 *data,
* Returns the number of bytes returned or error code.
*/
static ssize_t n_hdlc_tty_read(struct tty_struct *tty, struct file *file,
- __u8 __user *buf, size_t nr)
+ __u8 *kbuf, size_t nr,
+ void **cookie, unsigned long offset)
{
struct n_hdlc *n_hdlc = tty->disc_data;
int ret = 0;
struct n_hdlc_buf *rbuf;
DECLARE_WAITQUEUE(wait, current);

+ /* Is this a repeated call for an rbuf we already found earlier? */
+ rbuf = *cookie;
+ if (rbuf)
+ goto have_rbuf;
+
add_wait_queue(&tty->read_wait, &wait);

for (;;) {
@@ -436,25 +442,8 @@ static ssize_t n_hdlc_tty_read(struct tty_struct *tty, struct file *file,
set_current_state(TASK_INTERRUPTIBLE);

rbuf = n_hdlc_buf_get(&n_hdlc->rx_buf_list);
- if (rbuf) {
- if (rbuf->count > nr) {
- /* too large for caller's buffer */
- ret = -EOVERFLOW;
- } else {
- __set_current_state(TASK_RUNNING);
- if (copy_to_user(buf, rbuf->buf, rbuf->count))
- ret = -EFAULT;
- else
- ret = rbuf->count;
- }
-
- if (n_hdlc->rx_free_buf_list.count >
- DEFAULT_RX_BUF_COUNT)
- kfree(rbuf);
- else
- n_hdlc_buf_put(&n_hdlc->rx_free_buf_list, rbuf);
+ if (rbuf)
break;
- }

/* no data */
if (tty_io_nonblock(tty, file)) {
@@ -473,6 +462,39 @@ static ssize_t n_hdlc_tty_read(struct tty_struct *tty, struct file *file,
remove_wait_queue(&tty->read_wait, &wait);
__set_current_state(TASK_RUNNING);

+ if (!rbuf)
+ return ret;
+ *cookie = rbuf;
+
+have_rbuf:
+ /* Have we used it up entirely? */
+ if (offset >= rbuf->count)
+ goto done_with_rbuf;
+
+ /* More data to go, but can't copy any more? EOVERFLOW */
+ ret = -EOVERFLOW;
+ if (!nr)
+ goto done_with_rbuf;
+
+ /* Copy as much data as possible */
+ ret = rbuf->count - offset;
+ if (ret > nr)
+ ret = nr;
+ memcpy(kbuf, rbuf->buf+offset, ret);
+ offset += ret;
+
+ /* If we still have data left, we leave the rbuf in the cookie */
+ if (offset < rbuf->count)
+ return ret;
+
+done_with_rbuf:
+ *cookie = NULL;
+
+ if (n_hdlc->rx_free_buf_list.count > DEFAULT_RX_BUF_COUNT)
+ kfree(rbuf);
+ else
+ n_hdlc_buf_put(&n_hdlc->rx_free_buf_list, rbuf);
+
return ret;

} /* end of n_hdlc_tty_read() */
diff --git a/drivers/tty/n_null.c b/drivers/tty/n_null.c
index 96feabae4740..ce03ae78f5c6 100644
--- a/drivers/tty/n_null.c
+++ b/drivers/tty/n_null.c
@@ -20,7 +20,8 @@ static void n_null_close(struct tty_struct *tty)
}

static ssize_t n_null_read(struct tty_struct *tty, struct file *file,
- unsigned char __user * buf, size_t nr)
+ unsigned char *buf, size_t nr,
+ void **cookie, unsigned long offset)
{
return -EOPNOTSUPP;
}
diff --git a/drivers/tty/n_r3964.c b/drivers/tty/n_r3964.c
index 934dd2fb2ec8..3161f0a535e3 100644
--- a/drivers/tty/n_r3964.c
+++ b/drivers/tty/n_r3964.c
@@ -129,7 +129,7 @@ static void remove_client_block(struct r3964_info *pInfo,
static int r3964_open(struct tty_struct *tty);
static void r3964_close(struct tty_struct *tty);
static ssize_t r3964_read(struct tty_struct *tty, struct file *file,
- unsigned char __user * buf, size_t nr);
+ unsigned char *buf, size_t nr, void **cookie, unsigned long offset);
static ssize_t r3964_write(struct tty_struct *tty, struct file *file,
const unsigned char *buf, size_t nr);
static int r3964_ioctl(struct tty_struct *tty, struct file *file,
@@ -1058,7 +1058,8 @@ static void r3964_close(struct tty_struct *tty)
}

static ssize_t r3964_read(struct tty_struct *tty, struct file *file,
- unsigned char __user * buf, size_t nr)
+ unsigned char *kbuf, size_t nr,
+ void **cookie, unsigned long offset)
{
struct r3964_info *pInfo = tty->disc_data;
struct r3964_client_info *pClient;
@@ -1109,10 +1110,7 @@ static ssize_t r3964_read(struct tty_struct *tty, struct file *file,
kfree(pMsg);
TRACE_M("r3964_read - msg kfree %p", pMsg);

- if (copy_to_user(buf, &theMsg, ret)) {
- ret = -EFAULT;
- goto unlock;
- }
+ memcpy(kbuf, &theMsg, ret);

TRACE_PS("read - return %d", ret);
goto unlock;
diff --git a/drivers/tty/n_tracerouter.c b/drivers/tty/n_tracerouter.c
index 4479af4d2fa5..3490ed51b1a3 100644
--- a/drivers/tty/n_tracerouter.c
+++ b/drivers/tty/n_tracerouter.c
@@ -118,7 +118,9 @@ static void n_tracerouter_close(struct tty_struct *tty)
* -EINVAL
*/
static ssize_t n_tracerouter_read(struct tty_struct *tty, struct file *file,
- unsigned char __user *buf, size_t nr) {
+ unsigned char *buf, size_t nr,
+ void **cookie, unsigned long offset)
+{
return -EINVAL;
}

diff --git a/drivers/tty/n_tracesink.c b/drivers/tty/n_tracesink.c
index d96ba82cc356..1d9931041fd8 100644
--- a/drivers/tty/n_tracesink.c
+++ b/drivers/tty/n_tracesink.c
@@ -115,7 +115,9 @@ static void n_tracesink_close(struct tty_struct *tty)
* -EINVAL
*/
static ssize_t n_tracesink_read(struct tty_struct *tty, struct file *file,
- unsigned char __user *buf, size_t nr) {
+ unsigned char *buf, size_t nr,
+ void **cookie, unsigned long offset)
+{
return -EINVAL;
}

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 319d68c8a5df..2f2f57a53968 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -164,29 +164,24 @@ static void zero_buffer(struct tty_struct *tty, u8 *buffer, int size)
memset(buffer, 0x00, size);
}

-static int tty_copy_to_user(struct tty_struct *tty, void __user *to,
- size_t tail, size_t n)
+static void tty_copy(struct tty_struct *tty, void *to, size_t tail, size_t n)
{
struct n_tty_data *ldata = tty->disc_data;
size_t size = N_TTY_BUF_SIZE - tail;
void *from = read_buf_addr(ldata, tail);
- int uncopied;

if (n > size) {
tty_audit_add_data(tty, from, size);
- uncopied = copy_to_user(to, from, size);
- zero_buffer(tty, from, size - uncopied);
- if (uncopied)
- return uncopied;
+ memcpy(to, from, size);
+ zero_buffer(tty, from, size);
to += size;
n -= size;
from = ldata->read_buf;
}

tty_audit_add_data(tty, from, n);
- uncopied = copy_to_user(to, from, n);
- zero_buffer(tty, from, n - uncopied);
- return uncopied;
+ memcpy(to, from, n);
+ zero_buffer(tty, from, n);
}

/**
@@ -1944,15 +1939,16 @@ static inline int input_available_p(struct tty_struct *tty, int poll)
/**
* copy_from_read_buf - copy read data directly
* @tty: terminal device
- * @b: user data
+ * @kbp: data
* @nr: size of data
*
* Helper function to speed up n_tty_read. It is only called when
- * ICANON is off; it copies characters straight from the tty queue to
- * user space directly. It can be profitably called twice; once to
- * drain the space from the tail pointer to the (physical) end of the
- * buffer, and once to drain the space from the (physical) beginning of
- * the buffer to head pointer.
+ * ICANON is off; it copies characters straight from the tty queue.
+ *
+ * It can be profitably called twice; once to drain the space from
+ * the tail pointer to the (physical) end of the buffer, and once
+ * to drain the space from the (physical) beginning of the buffer
+ * to head pointer.
*
* Called under the ldata->atomic_read_lock sem
*
@@ -1962,7 +1958,7 @@ static inline int input_available_p(struct tty_struct *tty, int poll)
*/

static int copy_from_read_buf(struct tty_struct *tty,
- unsigned char __user **b,
+ unsigned char **kbp,
size_t *nr)

{
@@ -1978,8 +1974,7 @@ static int copy_from_read_buf(struct tty_struct *tty,
n = min(*nr, n);
if (n) {
unsigned char *from = read_buf_addr(ldata, tail);
- retval = copy_to_user(*b, from, n);
- n -= retval;
+ memcpy(*kbp, from, n);
is_eof = n == 1 && *from == EOF_CHAR(tty);
tty_audit_add_data(tty, from, n);
zero_buffer(tty, from, n);
@@ -1988,7 +1983,7 @@ static int copy_from_read_buf(struct tty_struct *tty,
if (L_EXTPROC(tty) && ldata->icanon && is_eof &&
(head == ldata->read_tail))
n = 0;
- *b += n;
+ *kbp += n;
*nr -= n;
}
return retval;
@@ -1997,12 +1992,12 @@ static int copy_from_read_buf(struct tty_struct *tty,
/**
* canon_copy_from_read_buf - copy read data in canonical mode
* @tty: terminal device
- * @b: user data
+ * @kbp: data
* @nr: size of data
*
* Helper function for n_tty_read. It is only called when ICANON is on;
* it copies one line of input up to and including the line-delimiting
- * character into the user-space buffer.
+ * character into the result buffer.
*
* NB: When termios is changed from non-canonical to canonical mode and
* the read buffer contains data, n_tty_set_termios() simulates an EOF
@@ -2018,14 +2013,14 @@ static int copy_from_read_buf(struct tty_struct *tty,
*/

static int canon_copy_from_read_buf(struct tty_struct *tty,
- unsigned char __user **b,
+ unsigned char **kbp,
size_t *nr)
{
struct n_tty_data *ldata = tty->disc_data;
size_t n, size, more, c;
size_t eol;
size_t tail;
- int ret, found = 0;
+ int found = 0;

/* N.B. avoid overrun if nr == 0 */
if (!*nr)
@@ -2061,10 +2056,8 @@ static int canon_copy_from_read_buf(struct tty_struct *tty,
n_tty_trace("%s: eol:%zu found:%d n:%zu c:%zu tail:%zu more:%zu\n",
__func__, eol, found, n, c, tail, more);

- ret = tty_copy_to_user(tty, *b, tail, n);
- if (ret)
- return -EFAULT;
- *b += n;
+ tty_copy(tty, *kbp, tail, n);
+ *kbp += n;
*nr -= n;

if (found)
@@ -2132,10 +2125,11 @@ static int job_control(struct tty_struct *tty, struct file *file)
*/

static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
- unsigned char __user *buf, size_t nr)
+ unsigned char *kbuf, size_t nr,
+ void **cookie, unsigned long offset)
{
struct n_tty_data *ldata = tty->disc_data;
- unsigned char __user *b = buf;
+ unsigned char *kb = kbuf;
DEFINE_WAIT_FUNC(wait, woken_wake_function);
int c;
int minimum, time;
@@ -2181,17 +2175,13 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
/* First test for status change. */
if (packet && tty->link->ctrl_status) {
unsigned char cs;
- if (b != buf)
+ if (kb != kbuf)
break;
spin_lock_irq(&tty->link->ctrl_lock);
cs = tty->link->ctrl_status;
tty->link->ctrl_status = 0;
spin_unlock_irq(&tty->link->ctrl_lock);
- if (put_user(cs, b)) {
- retval = -EFAULT;
- break;
- }
- b++;
+ *kb++ = cs;
nr--;
break;
}
@@ -2234,24 +2224,20 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
}

if (ldata->icanon && !L_EXTPROC(tty)) {
- retval = canon_copy_from_read_buf(tty, &b, &nr);
+ retval = canon_copy_from_read_buf(tty, &kb, &nr);
if (retval)
break;
} else {
int uncopied;

/* Deal with packet mode. */
- if (packet && b == buf) {
- if (put_user(TIOCPKT_DATA, b)) {
- retval = -EFAULT;
- break;
- }
- b++;
+ if (packet && kb == kbuf) {
+ *kb++ = TIOCPKT_DATA;
nr--;
}

- uncopied = copy_from_read_buf(tty, &b, &nr);
- uncopied += copy_from_read_buf(tty, &b, &nr);
+ uncopied = copy_from_read_buf(tty, &kb, &nr);
+ uncopied += copy_from_read_buf(tty, &kb, &nr);
if (uncopied) {
retval = -EFAULT;
break;
@@ -2260,7 +2246,7 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,

n_tty_check_unthrottle(tty);

- if (b - buf >= minimum)
+ if (kb - kbuf >= minimum)
break;
if (time)
timeout = time;
@@ -2272,8 +2258,8 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
remove_wait_queue(&tty->read_wait, &wait);
mutex_unlock(&ldata->atomic_read_lock);

- if (b - buf)
- retval = b - buf;
+ if (kb - kbuf)
+ retval = kb - kbuf;

return retval;
}
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 502862626b2b..d33e120046a6 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -832,6 +832,65 @@ static void tty_update_time(struct timespec64 *time)
time->tv_sec = sec;
}

+/*
+ * Iterate on the ldisc ->read() function until we've gotten all
+ * the data the ldisc has for us.
+ *
+ * The "cookie" is something that the ldisc read function can fill
+ * in to let us know that there is more data to be had.
+ *
+ * We promise to continue to call the ldisc until it stops returning
+ * data or clears the cookie. The cookie may be something that the
+ * ldisc maintains state for and needs to free.
+ */
+static int iterate_tty_read(struct tty_ldisc *ld, struct tty_struct *tty, struct file *file,
+ char __user *buf, size_t count)
+{
+ int retval = 0;
+ void *cookie = NULL;
+ unsigned long offset = 0;
+ char kernel_buf[64];
+
+ do {
+ int size, uncopied;
+
+ size = count > sizeof(kernel_buf) ? sizeof(kernel_buf) : count;
+ size = ld->ops->read(tty, file, kernel_buf, size, &cookie, offset);
+ if (!size)
+ break;
+
+ /*
+ * A ldisc read error return will override any previously copied
+ * data (eg -EOVERFLOW from HDLC)
+ */
+ if (size < 0) {
+ memzero_explicit(kernel_buf, sizeof(kernel_buf));
+ return size;
+ }
+
+ uncopied = copy_to_user(buf+offset, kernel_buf, size);
+ size -= uncopied;
+ offset += size;
+ count -= size;
+
+ /*
+ * If the user copy failed, we still need to do another ->read()
+ * call if we had a cookie to let the ldisc clear up.
+ *
+ * But make sure size is zeroed.
+ */
+ if (unlikely(uncopied)) {
+ count = 0;
+ retval = -EFAULT;
+ }
+ } while (cookie);
+
+ /* We always clear the tty buffer in case it contained passwords */
+ memzero_explicit(kernel_buf, sizeof(kernel_buf));
+ return offset ? offset : retval;
+}
+
+
/**
* tty_read - read method for tty device files
* @file: pointer to tty file
@@ -865,10 +924,9 @@ static ssize_t tty_read(struct file *file, char __user *buf, size_t count,
ld = tty_ldisc_ref_wait(tty);
if (!ld)
return hung_up_tty_read(file, buf, count, ppos);
+ i = -EIO;
if (ld->ops->read)
- i = ld->ops->read(tty, file, buf, count);
- else
- i = -EIO;
+ i = iterate_tty_read(ld, tty, file, buf, count);
tty_ldisc_deref(ld);

if (i > 0)
diff --git a/include/linux/tty_ldisc.h b/include/linux/tty_ldisc.h
index b1e6043e9917..572a07976116 100644
--- a/include/linux/tty_ldisc.h
+++ b/include/linux/tty_ldisc.h
@@ -185,7 +185,8 @@ struct tty_ldisc_ops {
void (*close)(struct tty_struct *);
void (*flush_buffer)(struct tty_struct *tty);
ssize_t (*read)(struct tty_struct *tty, struct file *file,
- unsigned char __user *buf, size_t nr);
+ unsigned char *buf, size_t nr,
+ void **cookie, unsigned long offset);
ssize_t (*write)(struct tty_struct *tty, struct file *file,
const unsigned char *buf, size_t nr);
int (*ioctl)(struct tty_struct *tty, struct file *file,
diff --git a/net/nfc/nci/uart.c b/net/nfc/nci/uart.c
index 11b554ce07ff..1204c438e87d 100644
--- a/net/nfc/nci/uart.c
+++ b/net/nfc/nci/uart.c
@@ -292,7 +292,8 @@ static int nci_uart_tty_ioctl(struct tty_struct *tty, struct file *file,

/* We don't provide read/write/poll interface for user space. */
static ssize_t nci_uart_tty_read(struct tty_struct *tty, struct file *file,
- unsigned char __user *buf, size_t nr)
+ unsigned char *buf, size_t nr,
+ void **cookie, unsigned long offset)
{
return 0;
}
--
2.29.2.157.g1d47791a39

From 08cb81c888e88b152f49ad2c90146b8f0c9ce6b3 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Tue, 19 Jan 2021 10:49:19 -0800
Subject: [PATCH 4/4] tty: implement read_iter

Now that the ldisc read() function takes kernel pointers, it's fairly
straightforward to make the tty file operations use .read_iter() instead
of .read().

That automatically gives us readv() and friends, and also makes it
possible to do .splice_read() on tty's again.

Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops")
Reported-by: Oliver Giles <ohw.giles@xxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---
drivers/tty/tty_io.c | 36 ++++++++++++++++++------------------
1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index d33e120046a6..b8c0b40f3298 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -142,7 +142,7 @@ LIST_HEAD(tty_drivers); /* linked list of tty drivers */
/* Mutex to protect creating and releasing a tty */
DEFINE_MUTEX(tty_mutex);

-static ssize_t tty_read(struct file *, char __user *, size_t, loff_t *);
+static ssize_t tty_read(struct kiocb *, struct iov_iter *);
static ssize_t tty_write(struct kiocb *, struct iov_iter *);
ssize_t redirected_tty_write(struct kiocb *, struct iov_iter *);
static __poll_t tty_poll(struct file *, poll_table *);
@@ -476,8 +476,9 @@ static void tty_show_fdinfo(struct seq_file *m, struct file *file)

static const struct file_operations tty_fops = {
.llseek = no_llseek,
- .read = tty_read,
+ .read_iter = tty_read,
.write_iter = tty_write,
+ .splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.poll = tty_poll,
.unlocked_ioctl = tty_ioctl,
@@ -490,8 +491,9 @@ static const struct file_operations tty_fops = {

static const struct file_operations console_fops = {
.llseek = no_llseek,
- .read = tty_read,
+ .read_iter = tty_read,
.write_iter = redirected_tty_write,
+ .splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.poll = tty_poll,
.unlocked_ioctl = tty_ioctl,
@@ -843,16 +845,17 @@ static void tty_update_time(struct timespec64 *time)
* data or clears the cookie. The cookie may be something that the
* ldisc maintains state for and needs to free.
*/
-static int iterate_tty_read(struct tty_ldisc *ld, struct tty_struct *tty, struct file *file,
- char __user *buf, size_t count)
+static int iterate_tty_read(struct tty_ldisc *ld, struct tty_struct *tty,
+ struct file *file, struct iov_iter *to)
{
int retval = 0;
void *cookie = NULL;
unsigned long offset = 0;
char kernel_buf[64];
+ size_t count = iov_iter_count(to);

do {
- int size, uncopied;
+ int size, copied;

size = count > sizeof(kernel_buf) ? sizeof(kernel_buf) : count;
size = ld->ops->read(tty, file, kernel_buf, size, &cookie, offset);
@@ -868,10 +871,9 @@ static int iterate_tty_read(struct tty_ldisc *ld, struct tty_struct *tty, struct
return size;
}

- uncopied = copy_to_user(buf+offset, kernel_buf, size);
- size -= uncopied;
- offset += size;
- count -= size;
+ copied = copy_to_iter(kernel_buf, size, to);
+ offset += copied;
+ count -= copied;

/*
* If the user copy failed, we still need to do another ->read()
@@ -879,7 +881,7 @@ static int iterate_tty_read(struct tty_ldisc *ld, struct tty_struct *tty, struct
*
* But make sure size is zeroed.
*/
- if (unlikely(uncopied)) {
+ if (unlikely(copied != size)) {
count = 0;
retval = -EFAULT;
}
@@ -906,10 +908,10 @@ static int iterate_tty_read(struct tty_ldisc *ld, struct tty_struct *tty, struct
* read calls may be outstanding in parallel.
*/

-static ssize_t tty_read(struct file *file, char __user *buf, size_t count,
- loff_t *ppos)
+static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
{
int i;
+ struct file *file = iocb->ki_filp;
struct inode *inode = file_inode(file);
struct tty_struct *tty = file_tty(file);
struct tty_ldisc *ld;
@@ -922,11 +924,9 @@ static ssize_t tty_read(struct file *file, char __user *buf, size_t count,
/* We want to wait for the line discipline to sort out in this
situation */
ld = tty_ldisc_ref_wait(tty);
- if (!ld)
- return hung_up_tty_read(file, buf, count, ppos);
i = -EIO;
- if (ld->ops->read)
- i = iterate_tty_read(ld, tty, file, buf, count);
+ if (ld && ld->ops->read)
+ i = iterate_tty_read(ld, tty, file, to);
tty_ldisc_deref(ld);

if (i > 0)
@@ -2929,7 +2929,7 @@ static long tty_compat_ioctl(struct file *file, unsigned int cmd,

static int this_tty(const void *t, struct file *file, unsigned fd)
{
- if (likely(file->f_op->read != tty_read))
+ if (likely(file->f_op->read_iter != tty_read))
return 0;
return file_tty(file) != t ? 0 : fd + 1;
}
--
2.29.2.157.g1d47791a39