Re: [PATCH] vfs: fix statfs64() returning impossible EOVERFLOW for 64-bit f_files

From: Al Viro
Date: Thu Oct 05 2017 - 19:06:41 EST

On Thu, Oct 05, 2017 at 03:31:05PM -0700, Linus Torvalds wrote:
> On Thu, Oct 5, 2017 at 1:57 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > AFAICS, the real bug here is in hugetlbfs; that's where obscene values in
> > ->f_bsize come from. IMO all that code in put_compat_statfs64() should be
> > replaced with
> > 	if (kbuf->bsize != (u32)kbuf->bsize)
> > 		return -EOVERFLOW;
> > That, or hugetlbfs could be taught to fake saner ->f_bsize (recalculating
> > ->f_bavail/->f_bfree/->f_blocks to go with that).
> Make it so. Except you shouldn't do
> 	if (kbuf->bsize != (u32)kbuf->bsize)
> you should do something like
> 	#define FITS_IN(x,y) ({ typeof(x) __x = (x); typeof(y) __y = __x; \
> 				__x == __y; })
> and then do
> 	if (!FITS_IN(kbuf->bsize, ubuf->bsize)) ...
> because there is nothing that specifies that the ubuf size of all
> fields has to be 32 bits.
> But yes, either you need to then special-case that crazy all-ones
> value, or just fix hugetlbfs to not use crazy crap.

All-ones is not a problem at all - those two fields are consistently
64bit in struct statfs64 on all 32bit architectures. That had pretty
much been the rationale for statfs64(2) in the first place - statfs(2)
couldn't be used on large filesystems; 4G files and you get an overflow
on 32bit. So the entire "let's check if f_files/f_ffree/f_bavail/f_bfree/
f_blocks fit into 32 bits" had been utter nonsense from the very
beginning and the only reason it hadn't been spotted earlier was that
this logic was under if (sizeof(u64) == 4) until last November.
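
To illustrate the point, here is a minimal user-space sketch of a
FITS_IN-style check. It is not kernel code: the mock_* structures and
the helper names are hypothetical, and the macro relies on the GCC/Clang
statement-expression and __typeof__ extensions. It only shows that a
count of 4G or an all-ones value is rejected by a 32-bit field and
always accepted by a 64-bit one.

```c
#include <assert.h>
#include <stdint.h>

/* Store x into a temporary of y's type and see whether the value survives. */
#define FITS_IN(x, y) \
	({ __typeof__(x) __x = (x); __typeof__(y) __y = __x; __x == __y; })

/* Hypothetical stand-ins for a 32-bit and a 64-bit user-visible field. */
struct mock_statfs   { uint32_t f_files; };
struct mock_statfs64 { uint64_t f_files; };

/* Would this file count survive a copy into the 32-bit layout? */
static int files_fit_statfs(uint64_t f_files)
{
	struct mock_statfs u = { 0 };

	return FITS_IN(f_files, u.f_files);
}

/* Same question for the 64-bit layout; the answer is always yes. */
static int files_fit_statfs64(uint64_t f_files)
{
	struct mock_statfs64 u = { 0 };

	return FITS_IN(f_files, u.f_files);
}

/* Small self-check exercising both helpers. */
static void fits_in_selfcheck(void)
{
	assert(files_fit_statfs(UINT32_MAX));
	assert(!files_fit_statfs((uint64_t)1 << 32));	/* 4G files */
	assert(files_fit_statfs64(UINT64_MAX));		/* all-ones */
}
```

With a 64-bit destination the check can never fail, which is why it is
pointless for the five u64 fields.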

Just to make sure we are on the same page: out of kstatfs fields
we have 5 u64 ones (see above; all of them are u64 in struct statfs64
on all architectures), an opaque 64bit f_fsid and 5 fields that
are long: f_type (magic numbers, all 32bit), f_namelen (max filename
length), f_frsize (0 on most filesystems, always fits into 32 bits),
f_flags (guaranteed to be 32bit) and f_bsize.

f_bsize is a mess - normal practice for Unices is to have f_blocks in
units of f_frsize, leaving f_bsize as preferred IO granularity. Linux
didn't have f_frsize until 2003 or so, and f_bsize got used for units
of f_blocks.
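
For comparison, a minimal sketch of the conventional interpretation from
user space, using POSIX statvfs(3) rather than the syscalls discussed
here. The helper name fs_total_bytes() is hypothetical. It treats
f_blocks as a count of f_frsize-sized units and leaves f_bsize alone as
an IO-size hint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Total size of the filesystem containing 'path', in bytes.
 * Returns 0 on success, -1 if statvfs() fails (errno is left set).
 */
static int fs_total_bytes(const char *path, uint64_t *total)
{
	struct statvfs sv;

	assert(path != NULL && total != NULL);

	if (statvfs(path, &sv) != 0)
		return -1;

	/* f_blocks is counted in units of f_frsize, not f_bsize. */
	*total = (uint64_t)sv.f_blocks * (uint64_t)sv.f_frsize;
	return 0;
}
```

A caller would use it as fs_total_bytes("/", &bytes); the product is
assumed not to exceed 64 bits.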

hugetlbfs uses it to report the huge page size; the real problem
that last year's commit tried to deal with was that on boxen with huge pages
4Gb or bigger we get 0 observed in that field by 32bit processes
calling statfs64(2). I'm not sure whether we treat that use of
f_bsize by hugetlbfs as an accidental ABI (in that case we need to
check that it fits into u32 and fail with EOVERFLOW otherwise;
again, all compat_statfs64 have f_bsize 32bit) or just cap it with
something sane (2Gb?) and adjust f_blocks/f_bavail/f_bfree accordingly.
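
The two options can be sketched in user space as follows. This is an
illustration under stated assumptions, not a proposed patch: the mock_*
structures are trimmed-down hypothetical stand-ins, put_checked() and
cap_bsize() are invented names, and the cap is assumed to divide the
original block size evenly.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins; not the kernel's structures. */
struct mock_kstatfs {
	int64_t  f_bsize;	/* "long" on a 64-bit kernel */
	uint64_t f_blocks, f_bfree, f_bavail;
};

struct mock_compat_statfs64 {
	uint32_t f_bsize;	/* 32 bits in the compat layout */
	uint64_t f_blocks, f_bfree, f_bavail;
};

/* What a 32-bit caller sees if the value is simply truncated. */
static uint32_t truncated_bsize(int64_t bsize)
{
	return (uint32_t)bsize;
}

/* Option 1: treat f_bsize as ABI and refuse values that do not fit. */
static int put_checked(const struct mock_kstatfs *k,
		       struct mock_compat_statfs64 *u)
{
	if (k->f_bsize != (int64_t)(uint32_t)k->f_bsize)
		return -EOVERFLOW;
	u->f_bsize  = (uint32_t)k->f_bsize;
	u->f_blocks = k->f_blocks;
	u->f_bfree  = k->f_bfree;
	u->f_bavail = k->f_bavail;
	return 0;
}

/* Option 2: cap f_bsize and rescale the block counts to match. */
static void cap_bsize(struct mock_kstatfs *k, int64_t cap)
{
	uint64_t scale;

	assert(cap > 0 && k->f_bsize > 0);
	if (k->f_bsize <= cap)
		return;
	assert(k->f_bsize % cap == 0);	/* assumption: cap divides evenly */

	scale = (uint64_t)(k->f_bsize / cap);
	k->f_bsize = cap;
	/* Overflow of the u64 counts is ignored in this sketch. */
	k->f_blocks *= scale;
	k->f_bfree  *= scale;
	k->f_bavail *= scale;
}
```

With a 4Gb block size, truncated_bsize() yields 0 and put_checked()
returns -EOVERFLOW; after cap_bsize() with a 2Gb cap the counts double
and put_checked() succeeds.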

Fields that are u64 in kstatfs don't need any checks - they are
64bit in compat_statfs64 as well. The other four 32bit fields... sure,
we could check them, but for those the reasonable reaction is not