Re: [PATCH v3] err.h: allow IS_ERR_VALUE to handle properly more types

From: Al Viro
Date: Thu Feb 11 2016 - 16:14:36 EST


On Thu, Feb 11, 2016 at 08:00:54AM +0100, Andrzej Hajda wrote:

> This way you will limit it only to unsigned long type, which seems too
> strict to me.
> I think the macro should accept all long enough unsigned types, otherwise we
> could end up with bunch of macros IS_ERR_VALUE_U32, IS_ERR_VALUE_ULL...

... or use it a whole lot less. Which is a Good Thing(tm), because it
corrects the damage caused by seriously flawed concept of "it's a bad
value, mmkay?" kind of predicate that could be used in all situations,
without having to look at the calling conventions of the functions
used to produce the value. It doesn't work and it has already spread
around too much.

For starters, it conflates "0 on success, -E... on error" with "non-zero
on success, 0 on failure" with "-E... on error, 0 or positive on success"
with "positive on success, non-positive on error" with "address of some
object or -E..., 0 to be treated as -ENOMEM", etc.

*All* of those are present in the kernel. Promoting blind use of magic
macro will keep causing bugs, and extra degree of polymorphism will only
encourage such blind use.

Look at the actual users. IS_ERR() aside (and don't get me started on
the abortion that is IS_ERR_OR_NULL()), there is one more or less common
legitimate use. Treatement of vm_mmap()/do_mmap_pgoff()/etc. return values.
The rest is very mixed bag. To pick a random one (in net/9p/client.c):
ename = NULL;
err = p9pdu_readf(req->rc, c->proto_version, "s?d",
&ename, &ecode);
if (err)
goto out_err;

if (p9_is_proto_dotu(c))
err = -ecode;

if (!err || !IS_ERR_VALUE(err)) {
err = p9_errstr2errno(ename, strlen(ename));

p9_debug(P9_DEBUG_9P, "<<< RERROR (%d) %s\n",
-ecode, ename);
}
kfree(ename);

What's going on here? We have an RERROR or RLERROR reply coming from server.
We want to convert that to kernel-recognizable error value. Normal 9P
uses strings for errors; kernel uses numbers. String is represented as
16bit little-endian length + that many characters; that's all that plain
9P puts into RERROR payload. 9P.U (unix extensions) appends suggested 32bit
little-endian numeric value after that. 9P.L simply puts the 32bit l-e
numeric value, with no strings involved.

The quoted code deals with 9P and 9P.U. Response is parsed (FWIW, '?' in
format is "ignore the rest for plain 9P"), any failure is returned as
an error in its own right, then for 9P.U we look at the numeric value
and if it's from 1 to MAX_ERRNO we just accept that. Otherwise (plain
9P or a strange numeric value in 9P.U) we look at the string part and
convert it to E... For 9P.L we simply accept the error value given.

Incidentally, that use of numeric values assumes that all relevant error
values will have the same encoding on all architectures - 9P is a network
protocol, after all. <checks the list of error values used> Uh-oh...
{"Resource temporarily unavailable", EAGAIN},
{"exclusive use file already open", EAGAIN},
{"file is in use", EAGAIN},
... and while the normal value of EAGAIN is 11, on alpha it's 35. Oops...
It also contains a lot more values than arch-consistent subset in errno-base.h
- there are only 34 in the latter (and some architectures redefine some of
those, like alpha does to EAGAIN) and there's more than twice as much in the
former... Some of those are fairly unlikely to be generated by server, but
e.g. ENAMETOOLONG (63 on alpha and sparc, 78 on mips, 248 on parisc, 36 on
everything else) is not impossible for a fileserver at all. Looks like
trouble...

Further investigation belongs on 9P list, but in this case a blind use of
IS_ERR_VALUE() has turned out to be an indication of a bug. QED. Any such
places need careful review, and no amount of magic will avoid the need to
understand the surrounding code. I'd be seriously tempted to add "uses
IS_ERR_VALUE outside of few known-good places" to checkpatch.pl, if not
for the fact that inevitable flood of mindless "fixes" is precisely what
we do *NOT* need in this case...