Re: [PATCH 1/2] sysctl: fix incorrect write position handling

From: Kees Cook
Date: Wed Mar 19 2014 - 19:35:48 EST


On Wed, Mar 19, 2014 at 3:25 PM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 18 Mar 2014 10:17:59 -0700 Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
>> When writing to a sysctl string, each write, regardless of VFS position,
>> began writing the string from the start. This meant the contents of
>> the last write to the sysctl controlled the string contents instead of
>> the first:
>>
>> open("/proc/sys/kernel/modprobe", O_WRONLY) = 1
>> write(1, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"..., 4096) = 4096
>> write(1, "/bin/true", 9) = 9
>> close(1) = 0
>>
>> $ cat /proc/sys/kernel/modprobe
>> /bin/true
>
> Yes, procfs writes have always been weird.
>
> Why are we fixing this?

This misbehavior was featured in an exploit against Chrome OS. While
it's not in itself a vulnerability, it's a weirdness that isn't on the
mind of most auditors. "This filter looks correct, it wouldn't resolve
to a real path." doesn't apply here, since the size of the write and
the contents of the final write are what matter when writing to
procfs.

> Perhaps there's an existing application which holds an fd open
> and periodically writes to it expecting some result. This patch
> would break such an app?
>
>
> And what about something like this?
>
> while true
> do
> echo 1
> sleep 60
> done > /proc/sys/vm/drop_caches

Yes, my current change would break this. I could remove the changes to
the numeric values?

> To be consistent with the proposed alteration to string writes, this
> would have to write 1 then change that to 11 then to 111. Or
> something, makes my head spin.

Well, no, the procfs writes stop processing at newlines.

>> Expected behaviour would be to have the sysctl be "AAAA..." capped at
>> maxlen (in this case KMOD_PATH_LEN: 256), instead of truncating to the
>> contents of the second write. Similarly, multiple short writes would not
>> append to the sysctl.
>
> So if we do
>
> (
> echo foo
> echo bar
> ) > /proc/sys/kernel/modprobe
>
> we get foo and later we get foobar?

No. Currently you get "bar" due to line endings. After, the change, it
would only be "foo" since the file position would be beyond "foo"
(buffer holds 3 bytes, write wrote 4 (with newline), so the next write
would be ignored).

>> This fixes the unexpected behavior for strings, and disallows non-zero
>> file position when writing numeric sysctls (similar to what is already
>> done when reading from non-zero file positions).
>
> So my script which writes drop_caches will break?

Yes, it would. When considering this, it seemed to me that people
would write such a script as:

while true
do
echo 1 > /proc/sys/vm/drop_caches
sleep 60
done

The behavior is pretty weird, and it seems like something we should
fix. My specific case is to fix the situation of writing garbage
followed by real information and allowing the write length size to
dictate what is being thrown away.

I wonder if there is a solution here involving the existing
acknowledgement of \0 or \n in string writes? The case I want to stop
lacked \0 or \n until the end ("AAAA..../bin/true\n") but the
write-length chunking meant the garbage part got thrown away instead
of correctly appending.

Should \0 and \n really mean "start again" here, and that and "open"
should be the only thing that rewinds the buffer to index 0? That way
"(echo foo; echo bar) > /proc/..." works as "bar", but "(echo -n foo;
echo bar) > /proc/..." results in "foobar". I'm not sure that's any
less an unexpected behavior, though.

-Kees

--
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/