Re: sudo x86info -a => kernel BUG at mm/usercopy.c:78!

From: Linus Torvalds
Date: Tue Apr 04 2017 - 18:55:25 EST


On Tue, Apr 4, 2017 at 3:37 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> For one of my systems, I see something like this:
>
> 00000000-00000fff : reserved
> 00001000-0008efff : System RAM
> 0008f000-0008ffff : reserved
> 00090000-0009f7ff : System RAM
> 0009f800-0009ffff : reserved

That's fairly normal.

> I note that there are two "System RAM" areas below 0x100000.

Yes. Traditionally the area from about 4k to 640kB is RAM. With a
random smattering of BIOS areas.

> * On x86, access has to be given to the first megabyte of ram because that area
> * contains BIOS code and data regions used by X and dosemu and similar apps.

Right. Traditionally, dosemu did one big mmap of the 1MB area to just
get all the BIOS data in one go.

> This means that it allows reads into even System RAM below 0x100000,
> but I think that's a mistake.

What you think is a "mistake" is how /dev/mem has always worked.

/dev/mem gave access to all the memory of the system. That's LITERALLY
the whole point of it. There was no "BIOS area" or anything else. It
was access to physical memory.

We've added limits to it, but those limits came later, and they came
with the caveat that lots of programs used /dev/mem in various ways.

Nobody was crazy enough to read /dev/mem one byte at a time trying to
follow BIOS tables. No, the traditional way was to just map (or read)
large chunks of it, and then follow the tables in the result. The
easiest way was to just do the whole low 1MB.
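
As a rough illustration of that pattern, here is a minimal user-space sketch that scans an already-captured snapshot of the low 1MB for a table anchor string. The function name find_anchor() and the constants are made up for this example. The 0xF0000 start and 16-byte stride follow the conventional SMBIOS "_SM_" search, and they are assumptions of the sketch, not something any particular tool is known to do exactly this way.

```c
#include <stddef.h>
#include <string.h>

#define LOW_MEM_SIZE   0x100000UL  /* the traditional low 1MB */
#define BIOS_SCAN_BASE 0xF0000UL   /* assumed start of the BIOS table search */
#define ANCHOR_STRIDE  16          /* assumed anchor alignment */

/*
 * Hypothetical helper: given a snapshot of low memory (for example the
 * result of one big read or mmap of the first 1MB), return the offset
 * of the first aligned occurrence of 'anchor' at or above
 * BIOS_SCAN_BASE, or -1 if none is found.
 */
static long find_anchor(const unsigned char *low, size_t len,
                        const char *anchor)
{
    size_t alen = strlen(anchor);
    size_t off;

    for (off = BIOS_SCAN_BASE; off + alen <= len; off += ANCHOR_STRIDE) {
        if (memcmp(low + off, anchor, alen) == 0)
            return (long)off;
    }
    return -1;
}
```

A caller would fill the snapshot by reading or mapping the first LOW_MEM_SIZE bytes of /dev/mem in one go, then walk the tables entirely inside that buffer.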

There's no "mistake" here. The only thing that is mistaken is you
thinking that we can redefine reality and change history.

I already explained what the likely fix is: make devmem_is_allowed()
return a ternary value, so that those things that *do* read the BIOS
area can just continue to do so, but they see zeroes for the parts
that the kernel has taken over.
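
To make the ternary idea concrete, here is a minimal user-space model of it. This is not the kernel's code: the enum, devmem_access_for(), copy_page() and the exact policy (non-RAM is always readable, RAM below 1MB reads as zeroes, RAM above 1MB is refused) are assumptions made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL
#define LOW_1MB_PAGES    (0x100000UL / SKETCH_PAGE_SIZE)

/* Hypothetical three-way answer instead of a yes/no. */
enum devmem_access {
    DEVMEM_DENY  = 0,  /* fail the access */
    DEVMEM_ALLOW = 1,  /* hand back the real contents */
    DEVMEM_ZERO  = 2   /* succeed, but hand back zeroes */
};

/* Assumed policy: the caller says whether the page is System RAM. */
static enum devmem_access devmem_access_for(unsigned long pfn,
                                            bool is_system_ram)
{
    if (!is_system_ram)
        return DEVMEM_ALLOW;   /* BIOS/reserved areas stay readable */
    if (pfn < LOW_1MB_PAGES)
        return DEVMEM_ZERO;    /* low RAM the kernel has taken over */
    return DEVMEM_DENY;
}

/* Apply the decision to one chunk of a read; returns bytes or -1. */
static long copy_page(enum devmem_access access, unsigned char *dst,
                      const unsigned char *src, size_t n)
{
    switch (access) {
    case DEVMEM_ALLOW:
        memcpy(dst, src, n);
        return (long)n;
    case DEVMEM_ZERO:
        memset(dst, 0, n);
        return (long)n;
    default:
        return -1;
    }
}
```

With the map quoted above, page 0x8f (reserved) would come back as real data and page 0x1 (System RAM) as zeroes, so a program doing one big read of the low 1MB keeps working without seeing kernel-owned memory.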

Linus