Re: sudo x86info -a => kernel BUG at mm/usercopy.c:78!

From: Kees Cook
Date: Tue Apr 04 2017 - 19:00:00 EST


On Tue, Apr 4, 2017 at 3:55 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Apr 4, 2017 at 3:37 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>>
>> For one of my systems, I see something like this:
>>
>> 00000000-00000fff : reserved
>> 00001000-0008efff : System RAM
>> 0008f000-0008ffff : reserved
>> 00090000-0009f7ff : System RAM
>> 0009f800-0009ffff : reserved
>
> That's fairly normal.
>
>> I note that there are two "System RAM" areas below 0x100000.
>
> Yes. Traditionally the area from about 4k to 640kB is RAM. With a
> random smattering of BIOS areas.
>
>> * On x86, access has to be given to the first megabyte of ram because that area
>> * contains BIOS code and data regions used by X and dosemu and similar apps.
>
> Right. Traditionally, dosemu did one big mmap of the 1MB area to just
> get all the BIOS data in one go.
>
>> This means that it allows reads into even System RAM below 0x100000,
>> but I think that's a mistake.
>
> What you think is a "mistake" is how /dev/mem has always worked.
>
> /dev/mem gave access to all the memory of the system. That's LITERALLY
> the whole point of it. There was no "BIOS area" or anything else. It
> was access to physical memory.
>
> We've added limits to it, but those limits came later, and they came
> with the caveat that lots of programs used /dev/mem in various ways.
>
> Nobody was crazy enough to read /dev/mem one byte at a time trying to
> follow BIOS tables. No, the traditional way was to just map (or read)
> large chunks of it, and then follow the tables in the result. The
> easiest way was to just do the whole low 1MB.
>
> There's no "mistake" here. The only thing that is mistaken is you
> thinking that we can redefine reality and change history.

I'm not trying to rewrite history. :) I'm trying to understand the
requirements for how the 1MB area was used, which you've explained the
history of now. (Thank you!)

> I already explained what the likely fix is: make devmem_is_allowed()
> return a ternary value, so that those things that *do* read the BIOS
> area can just continue to do so, but they see zeroes for the parts
> that the kernel has taken over.

Sounds good to me. I'll go work on that.
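
To make the ternary idea concrete, here is a minimal userspace sketch, not
the actual kernel change. The enum names, the toy resource table, and the
*_sketch() helpers are all hypothetical; the low-memory entries come from
the iomem listing quoted above, and the single entry above 1MB is made up
so the "deny" case can be exercised. It assumes 4k pages and classifies a
page by its first byte only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical three-way result; these names are illustrative only. */
enum devmem_access {
	DEVMEM_DENY  = 0,	/* reject the access outright */
	DEVMEM_ALLOW = 1,	/* hand back the real contents */
	DEVMEM_ZERO  = 2,	/* let the read succeed, but return zeroes */
};

struct region {
	uint64_t start, end;	/* inclusive physical address range */
	int is_ram;
};

/* Toy map: the first five entries mirror the listing in this thread. */
static const struct region toy_map[] = {
	{ 0x00000000, 0x00000fff, 0 },
	{ 0x00001000, 0x0008efff, 1 },
	{ 0x0008f000, 0x0008ffff, 0 },
	{ 0x00090000, 0x0009f7ff, 1 },
	{ 0x0009f800, 0x0009ffff, 0 },
	{ 0x00100000, 0x3fffffff, 1 },	/* assumed entry, not from the thread */
};

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_LOW_1MB_PFN	(0x100000 >> SKETCH_PAGE_SHIFT)

/* Stand-in for a "is this page System RAM?" check (first byte only). */
int page_is_ram_sketch(unsigned long pfn)
{
	uint64_t addr = (uint64_t)pfn << SKETCH_PAGE_SHIFT;
	size_t i;

	for (i = 0; i < sizeof(toy_map) / sizeof(toy_map[0]); i++)
		if (toy_map[i].is_ram &&
		    addr >= toy_map[i].start && addr <= toy_map[i].end)
			return 1;
	return 0;
}

/*
 * Below 1MB: never fail, so "read the whole low 1MB" users keep working,
 * but RAM pages read back as zeroes. Above 1MB: RAM is denied.
 */
enum devmem_access devmem_is_allowed_sketch(unsigned long pfn)
{
	if (pfn < SKETCH_LOW_1MB_PFN)
		return page_is_ram_sketch(pfn) ? DEVMEM_ZERO : DEVMEM_ALLOW;
	return page_is_ram_sketch(pfn) ? DEVMEM_DENY : DEVMEM_ALLOW;
}

/* How a read path might consume the three-way result. */
long read_page_sketch(unsigned long pfn, const unsigned char *src,
		      unsigned char *dst, size_t len)
{
	switch (devmem_is_allowed_sketch(pfn)) {
	case DEVMEM_ALLOW:
		memcpy(dst, src, len);
		return (long)len;
	case DEVMEM_ZERO:
		memset(dst, 0, len);
		return (long)len;
	default:
		return -1;	/* stand-in for an error such as -EPERM */
	}
}
```

With the toy map, page 0x8f (reserved) is copied as-is, page 0x01 (System
RAM) reads back as zeroes, and page 0x100 (RAM above 1MB) is refused.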

-Kees

--
Kees Cook
Pixel Security