Re: [PATCH] vfs: hard-ban creating files with control characters in the name

From: Adam Borowski
Date: Mon Oct 02 2017 - 23:22:09 EST

On Tue, Oct 03, 2017 at 03:07:24AM +0100, Al Viro wrote:
> On Tue, Oct 03, 2017 at 02:50:42AM +0200, Adam Borowski wrote:
> > Anything with bytes 1-31,127 will get -EACCES.
> >
> > Especially \n is bad: instead of natural file-per-line, you need a
> > user-unfriendly feature of -print0 added to every producer and consumer;
> > a good part of users either don't know or don't feel the need to bother
> > with escaping this snowflake, thus introducing security holes.
> >
> > The rest of control characters, while not as harmful, don't have a
> > legitimate use nor have any real chance of coming from a tarball (malice
> > and fooling around excluded). No character set ever supported as a system
> > locale by glibc, and, TTBMK, by other popular Unices, includes them, thus
> > it can be assumed no foreign files have such names other than artificially.
> >
> > This goes in stark contrast with other characters proposed to be banned:
> > non-UTF8 is common, and even on my desktop's disk I found examples of all
> > of: [ ], < >, initial -, initial and final space, ?, *, .., ', ", |, &.
> > Somehow no \ anywhere. I think I have an idea why no / .
> >
> > Another debatable point is whether to -EACCES or to silently rename to an
> > escaped form such as %0A. I believe the former is better because:
> > * programs can be confused if a directory has files they didn't just write
> > * many filesystems already disallow certain characters (like invalid
> > Unicode), thus returning an error is consistent
> >
> > An example of a write-up of this issue can be found at:
> >
> That essay is full of shit, and you've even mentioned parts of that just
> above...

I used it as a list of problems, not solutions.

> NAK; you'd _still_ need proper quoting (or a shell with something resembling an
> actual syntax, rather than the "more or less what srb had ended up implementing"),
> so it doesn't really buy you anything.

Well, what about just \n then? Unlike all the others, which are relatively
straightforward to handle, \n requires -print0, which not all programs
implement and which way too many people consider too burdensome to use.
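
To make the \n problem concrete, here is a minimal sketch in Python (not taken
from the patch; the helper names and the sample file names are made up for
illustration). It mimics a producer that emits one name per line and a
consumer that splits on newlines, next to the NUL-delimited variant that
-print0 corresponds to:

```python
def split_lines(data: bytes) -> list[bytes]:
    """Parse a newline-delimited file list, the way a naive consumer would."""
    return [name for name in data.split(b"\n") if name]


def split_nul(data: bytes) -> list[bytes]:
    """Parse a NUL-delimited file list (the -print0 style of framing)."""
    return [name for name in data.split(b"\0") if name]


# Hypothetical names; the second one contains an embedded newline.
names = [b"report.txt", b"evil\n/etc/passwd"]

line_stream = b"".join(n + b"\n" for n in names)
nul_stream = b"".join(n + b"\0" for n in names)

# The line-based consumer sees three "files", one of them /etc/passwd.
print(split_lines(line_stream))
# The NUL-based consumer recovers the original two names.
print(split_nul(nul_stream))
```

The line-based framing cannot tell a separator from a byte of the name, which
is the whole issue; NUL works only because it can never appear in a name.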

> Badly written script will still be exploitable.

Yeah, but we'd kill a major exploit avenue.

> And since older kernels and other Unices are not going away, you would've
> created an inconsistently vulnerable set of scripts, on top of the false
> sense of security.

That shouldn't stop us from improving new kernels -- scripts that have
-print0 won't lose it, those that don't will have a vulnerability fixed.
Same as with any other kind of hardening. As for other Unices: Theo de
Raadt is not someone to object to a trivial security patch, FreeBSD would
follow, OSX is too hostile to developers for me to care. Thus, the only
concern is new userland on old kernels. But distributions don't support
such combinations for long, unlike the other way around. As for people
writing their own scripts: they already tend to be vulnerable.

I, for example, when writing an ad-hoc pipeline, tend to first make it
display the files that'd be processed; switching that to -print0 back and
forth would be really tedious, so I usually remain vulnerable to \n (unless
the script is meant for external use -- but it's too easy to forget). And how
do you propose to process a list of files with grep or sed if there are
newlines involved?

Basic quotes make it trivial to handle everything but two snowflakes: \n and
initial -; the latter you need to remember, but ./* or -- aren't hard.
This leaves \n.
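
A rough illustration of the two snowflakes, again in Python and again only a
sketch: shlex.quote() is the standard library's shell quoting helper, while
guard_leading_dash() is a hypothetical helper standing in for the ./* trick.

```python
import shlex


def guard_leading_dash(name: str) -> str:
    """Prefix './' so a relative name starting with '-' is not read as an option."""
    return "./" + name if name.startswith("-") else name


# Ordinary awkward characters are handled by plain quoting.
print(shlex.quote("a b"))          # quoted, so the space is harmless
print(shlex.quote("what?*.txt"))   # quoted, so no globbing happens

# Quoting does nothing for an initial '-': the name contains no character
# the shell cares about, so it still looks like an option to the callee.
print(shlex.quote("-rf"))
print(guard_leading_dash("-rf"))

# A newline survives quoting as a shell argument, but quoting does not help
# once the name is written into a line-oriented stream for grep or sed.
quoted = shlex.quote("evil\nname")
print(len(quoted.splitlines()))
```

So the initial - has a cheap, well-known workaround, and the newline does not
as soon as the list leaves the argument vector.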

Thus, would you consider banning just newlines?
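
For comparison, the two rules under discussion can be sketched as userspace
predicates. This is Python for illustration only, not the kernel code from
the patch, and both function names are hypothetical; the byte ranges are the
ones stated in the original proposal (1-31 and 127).

```python
def has_control_bytes(name: bytes) -> bool:
    """Original proposal: reject any byte in 1-31 or 127."""
    return any(1 <= b <= 31 or b == 127 for b in name)


def has_newline(name: bytes) -> bool:
    """Reduced proposal: reject only \\n."""
    return b"\n" in name


samples = [
    b"normal.txt",
    b"tab\there",                 # caught by the full ban only
    b"two\nlines",                # caught by both
    "za\u017c\u00f3\u0142\u0107".encode("utf-8"),  # non-ASCII, all bytes >= 128
]

for s in samples:
    print(s, has_control_bytes(s), has_newline(s))
```

Neither rule touches bytes above 127, so non-UTF8 and non-ASCII names are
unaffected either way.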

We domesticated dogs 36000 years ago; together we chased
animals, hung out and licked or scratched our private parts.
Cats domesticated us 9500 years ago, and immediately we got
agriculture, towns then cities. -- whitroth on /.