Re: fs: out of bounds on stack in iov_iter_advance

From: Al Viro
Date: Thu Nov 05 2015 - 20:34:13 EST


On Wed, Sep 30, 2015 at 05:30:17PM -0400, Sasha Levin wrote:

> > So I've traced this all the way back to dax_io(). I can trigger this with:
> >
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 93bf2f9..2cdb8a5 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -178,6 +178,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
> > if (need_wmb)
> > wmb_pmem();
> >
> > + WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
> > return (pos == start) ? retval : pos - start;
> > }
> >
> > So it seems that iter gets moved twice here: once in dax_io(), and once again
> > back at generic_file_read_iter().
> >
> > I don't see how it ever worked. Am I missing something?

This:
struct iov_iter data = *iter;
retval = mapping->a_ops->direct_IO(iocb, &data, pos);
}

if (retval > 0) {
*ppos = pos + retval;
iov_iter_advance(iter, retval);

The iterator advanced in ->direct_IO() is a _copy_, not the original.
The contents of *iter as seen by generic_file_read_iter() is not
modifiable by ->direct_IO(), simply because its address is nowhere to
be found. And checking iov_iter_count(iter) at the end of dax_io() is
pointless - from the POV of generic_file_read_iter() it's data.count,
and while it used to be equal to iter->count, it's already been modified.
By the time we call iov_iter_advance() in generic_file_read_iter() that
value will be already discarded, along with rest of struct iov_iter data.

Wait a minute - you are triggering _what_???
> > + WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
With '&&'? iov_iter_count() is size_t, while pos and start are loff_t,
so you are seeing equal values in pos and start (as integers) *and*
(loff_t)0 > (size_t)something. loff_t is a signed type, size_t - unsigned.
6.3.1.8[1] says that
* if rank of size_t is greater or equal to rank of loff_t, the
latter gets converted to size_t. And conversion of zero should be zero,
i.e. (size_t) 0 > (size_t)something, which is impossible (we compare them
as non-negative integers).
* if loff_t can represent all values of size_t, size_t value gets
converted to loff_t. Result of conversion should have the same (in particular,
non-negative) value. Again, comparison can't be true.
* otherwise both values are converted to unsigned counterpart of
loff_t. Again, zero converts to 0 and in any unsigned type 0 > x is
impossible.

I don't see any way for that condition to evaluate true.

Assuming that it's a misquoted ||... I don't see any way for pos to
get greater than start + original iov_iter_count(). However, I *do*
see a way for bad things to happen in a different way. Look:
// first pass through the loop, pos == start (and so's max)
retval = dax_get_addr(bh, &addr, blkbits);
// got a positive value
if (retval < 0)
break;
// nope, keep going
if (buffer_unwritten(bh) || buffer_new(bh)) {
dax_new_buf(addr, retval, first, pos,
end);
need_wmb = true;
}
addr += first;
size = retval - first;
// OK...
}
max = min(pos + size, end);
// OK...
}

if (iov_iter_rw(iter) == WRITE) {
len = copy_from_iter_pmem(addr, max - pos, iter);
need_wmb = true;
} else if (!hole)
len = copy_to_iter((void __force *)addr, max - pos,
iter);
else
len = iov_iter_zero(max - pos, iter);
// too bad - we'd hit an unmapped memory area. len is 0...
// and retval is fucking positive.
if (!len)
break;

return (pos == start) ? retval : pos - start;
// will return a bloody big positive value

Could you try to reproduce it with this:

dax_io(): don't let non-error value escape via retval instead of EFAULT

Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
---
diff --git a/fs/dax.c b/fs/dax.c
index a86d3cc..7b653e9 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -169,8 +169,10 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
else
len = iov_iter_zero(max - pos, iter);

- if (!len)
+ if (!len) {
+ retval = -EFAULT;
break;
+ }

pos += len;
addr += len;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/