Re: verify_area(...) possible problem.

Linus Torvalds (torvalds@transmeta.com)
13 Apr 1999 00:30:17 GMT


In article <Pine.LNX.3.95.990412150406.136A-100000@chaos.analogic.com>,
Richard B. Johnson <root@chaos.analogic.com> wrote:
>
>If I deliberately pass the kernel a bad pointer for an ioctl(), and
>the kernel code does a verify_area(), the return value may be zero,
>but dereferencing that pointer can cause a kernel Oops.

Indeed. You CANNOT dereference a user-space pointer. It just doesn't
work.

It may sometimes _appear_ to work, but trust me, it doesn't. There are
architectures where it never works to just dereference the pointer, and
even when it appears to work it _will_ fail for bad pointers.

The short and sweet of it is: you absolutely positively _have_ to use
get_user()/put_user()/copy_from_user()/copy_to_user() for user level
accesses. There are no if's and but's about it.

The only reason to use "verify_area()" (or rather, the current name,
which is "access_ok()") is that if you're going to do multiple user
accesses, in which case you can instead of doing

put_user(a, p);
put_user(b, p+1);
put_user(c, p+2);

(which is wrong, because it doesn't check the return value, but you get
the idea) you can do

error = access_ok(VERIFY_READ, p, 3*sizeof(*p));
if (!error) {
__put_user(a, p);
__put_user(b, p+1);
__put_user(c, p+2);
}

where the __xxx versions are faster but "unsafe" unless you have
verified the area by hand first.

Basically, you should never use "verify_area()" in any new code unless
you really know what you're doing, and look at the assembly code to know
why it works and what it does.

Pretty much all uses of "verify_area()" are for historical reasons, back
from the time when we spent some serious CPU time to check by hand every
single reference. Not only was it slow, but it was fundamentally broken
in a multithreading application where the memory maps might change in
between the verify_area() and the actual access.

These days, "access_ok()" just does some architecture-specific
range-checking (which on certain architectures can actually be a no-op),
and the _real_ permission checks are then done by the access itself.

>Snippet of code from a driver:
>
>static int device_ioctl(struct inode *inp, struct file *fp, unsigned int cmd, unsigned long arg)
>{
> long flags;
> int result;
> spin_lock_irqsave(&device_lock, flags);
> result = 0;
>
> switch(cmd)
> {
> case READ_INTR_COUNT:
> if(arg)
> {
> printk("Calling verify_area(VERIFY_WRITE, %p, %d)\n", (void *)arg, sizeof(int));
> result = verify_area(VERIFY_WRITE, (void *)arg, sizeof(int));
> printk("Result was %d\n", result);
> if(result) break;
> *((int *) arg) = info->intrs;
> }
> else
> result = -EINVAL;
> break;

The above has two _quite_ serious bugs:
- it does a user level access while holding a spinlock. Bad, bad, bad,
and can very easily result in deadlocks (imagine the user level
access taking a page fault and needing the spinlock to satisfy it)
- it does the user level access as if the user pointer was a real
kernel pointer. As explained above, it never works that way.

The sad thing is that just dereferencing the pointer _almost_ works on
the x86. It works just well enough that we've had multiple of these
kinds of bugs. Oh, well..

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/