Re: [git pull] vfs.git part 3

From: Al Viro
Date: Wed Jul 05 2017 - 18:38:40 EST

On Wed, Jul 05, 2017 at 02:51:43PM -0700, Linus Torvalds wrote:
> On Wed, Jul 5, 2017 at 12:14 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Christoph's fs/read_write.c series - consolidation and cleanups.
> Side note - when looking through this, it struck me how confusing that
> "int flags" argument was.
> We have a ton of "flags" in the filesystem layer, and how all the
> read/write helpers take them too, and it's really hard to see what
> kind of flags they are.
> Could we perhaps make those RWF_xyz flags have a nice bitwise type,
> and use that type in the argument list, so that not only could there
> be some sparse typechecking, but the functions that pass flags on to
> each other would automatically have a certain amount of actual
> self-documenting prototypes?
> So when you look at one of those vfs_iter_write() or whatever
> functions, you just see *what* flags the flags argument is.
> Because "int flags" really is the worst. It's the wrong type anyway
> (at least make it unsigned if it's a collection of bits), but it's
> also very ambiguous indeed when there are so many other flags that are
> often used/tested in the same functions (there's the "iter" flagsm,
> there's file->f_mode, there's just a lot of different flags going on,
> and the "int flags" is the least well documented of them all,
> particularly since 99.9% of all users just pass in zero).

Sure, makes sense - especially since it's not too widely spread yet.
A side note right back at you - POLL... stuff. I'd redone the old
"hunt the buggy ->poll() instances down" series (took about 12 hours
total), got it to the point where all remaining sparse warnings about
that type are for genuine bugs. It goes like that:

define __poll_t, annotate constants
Type is controlled by ifdef - it's unsigned int unless CHECK_POLL is
defined and a bitwise type otherwise.
->poll() methods should return __poll_t
anntotate the places where ->poll() return values go
annotate poll-related wait keys
annotate poll_table_struct ->_key
That ends all infrastructure work. Methods declarations are annotated,
instances are *not*. Due to that ifdef CHECK_POLL, normal builds, including
normal sparse builds, are unaffected; with CF=-DCHECK_POLL you get __poll_t
cris: annotate ->poll() instances
ia64: annotate ->poll() instances
mips: annotate ->poll() instances
ppc: annotate ->poll() instances
um: annotate ->poll() instances
x86: annotate ->poll() instances
block: annotate ->poll() instances
crypto: annotate ->poll() instances
acpi: annotate ->poll() instances
sound: annotate ->poll() instances
tomoyo: annotate ->poll() instances
net: annotate ->poll() instances
ipc, kernel, mm: annotate ->poll() instances
fs: annotate ->poll() instances
media: annotate ->poll() instances
the rest of drivers/*: annotate ->poll() instances
These can be folded and split as desired - almost up to per-instance. It's
pretty much "turn unsigned int foo_poll(...) into __poll_t foo_poll(...),
turn unsigned int mask; in it into __poll_t mask;" kind of stuff. Can go
on per-subsystem basis just fine - again, normal builds are completely unaffected.

scif: annotate scif_pollepd
vhost: annotate vhost_poll
dmabuf: annotate dma_buf->active

Several drivers playing games of their own with POLL... bitmaps.

annotate fs/select.c and fs/eventpoll.c
That, of course, can move up right after the infrastructure.

<fixes for assorted bugs caught by all that>
Again, can be reordered in front of the entire queue. Some are brainos
(POLL_IN instead of POLLIN - compare the kernel definitions of those),
some are "what do you mean, no returning -E... from ->poll()?". However,
there's the shitty part - poll/epoll ABI mess. POLLWR... and POLLRDHUP
are architecture-dependent; EPOLL counterparts are not and both are parts
of ABI. Consider e.g. sparc:
#define POLLWRNORM POLLOUT [4, that is]
#define POLLWRBAND 256
#define POLLMSG 512
#define POLLREMOVE 1024
#define POLLRDHUP 2048
and compare with
#define EPOLLWRNORM 0x00000100
#define EPOLLWRBAND 0x00000200
#define EPOLLMSG 0x00000400
#define EPOLLRDHUP 0x00002000

EPOLLRDHUP is never matched. Neither is EPOLLMSG (nothing raises
POLLREMOVE, but then nothing raises POLLMSG either). EPOLLWRBAND
is not matched either (that would be POLLMSG). And EPOLLWRNORM
is matched when we raise POLLWRBAND.

sparc is the worst case in that respect; mips is somewhat better -
there we have
#define POLLWRBAND 0x0100
and everything else is default. IOW, EPOLLWRBAND is never matched
and EPOLLWRNORM is matched when we raise POLLWRBAND. Several other
architectures are like mips (m68k and even more exotic stuff).

I'm not sure what to do about that. Davem is probably in the best
position to tell...

It might be worth merging the infrastructure bits right before -rc1,
maybe this cycle, maybe the next one. It's not that hard to redo
every time, but...