Re: [PATCH v2] clear file privilege bits when mmap writing

From: Kees Cook
Date: Wed Dec 09 2015 - 17:52:57 EST


On Wed, Dec 9, 2015 at 12:26 AM, Jan Kara <jack@xxxxxxx> wrote:
> On Mon 07-12-15 16:40:14, Kees Cook wrote:
>> On Mon, Dec 7, 2015 at 2:42 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> > On Thu, Dec 3, 2015 at 5:45 PM, yalin wang <yalin.wang2010@xxxxxxxxx> wrote:
>> >>
>> >>> On Dec 2, 2015, at 16:03, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> >>>
>> >>> Normally, when a user can modify a file that has setuid or setgid bits,
>> >>> those bits are cleared when they are not the file owner or a member
>> >>> of the group. This is enforced when using write and truncate but not
>> >>> when writing to a shared mmap on the file. This could allow the file
>> >>> writer to gain privileges by changing a binary without losing the
>> >>> setuid/setgid/caps bits.
>> >>>
>> >>> Changing the bits requires holding inode->i_mutex, so it cannot be done
>> >>> during the page fault (due to mmap_sem being held during the fault).
>> >>> Instead, clear the bits if PROT_WRITE is being used at mmap time.
>> >>>
>> >>> Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
>> >>> Cc: stable@xxxxxxxxxxxxxxx
>> >>
>> >> Does this mean the mprotect() syscall also needs this check?
>> >> mprotect() can change a mapping to PROT_WRITE, so a read-only
>> >> mapping can become writable again, which is also a security hole.
>> >
>> > Yes, good point. This needs to be added. I will send a new patch. Thanks!
>>
>> This continues to look worse and worse.
>>
>> So... to check this at mprotect time, I have to know it's MAP_SHARED,
>> but that's in vma->vm_flags, which I can only see after taking
>> mmap_sem.
>>
>> The best I can think of now is to strip the bits at munmap time, since
>> you can't execute an mmapped file until it closes.
>>
>> Jan, thoughts on this?
>
> Umm, so we actually refuse to execute a file while someone has it open for
> writing (deny_write_access() in do_open_execat()). So dropping the suid /
> sgid bits when closing file for writing could be plausible. Grabbing
> i_mutex from __fput() context is safe (it gets called from task_work
> context when returning to userspace).
>
> That way we could actually remove the checks done for each write. To avoid
> unexpected removal of suid/sgid bits when someone just opens & closes the
> file, we could mark the file as needing suid/sgid treatment by a flag in
> inode->i_flags when file gets written to or mmapped and then check for this
> in __fput().

Yeah, this is ultimately where I ended up for the v4 (and fixed up in
v5). I added the flag to file, though, not inode. Sending v5 now...
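
The flag-then-strip idea can be sketched as a small userspace model. This
is not the v4/v5 patch: the struct and function names below are
hypothetical, and the real rules are simplified (the CAP_FSETID exemption
and the group-execute condition on setgid are ignored). It only shows the
ordering: writes and shared mmaps set a per-file flag, and the bits are
cleared once, at the final close.

```c
#include <stdbool.h>

/* Same octal values as S_ISUID and S_ISGID in <sys/stat.h>. */
#define MODEL_ISUID 04000u
#define MODEL_ISGID 02000u

/* Toy stand-in for the kernel's struct inode: only the mode bits. */
struct model_inode {
    unsigned int mode;
};

/* Toy stand-in for struct file, with a hypothetical per-file flag. */
struct model_file {
    struct model_inode *inode;
    bool writable;          /* opened with write access */
    bool needs_priv_strip;  /* hypothetical: set on write or shared mmap */
};

static void model_open(struct model_file *f, struct model_inode *ino,
                       bool writable)
{
    f->inode = ino;
    f->writable = writable;
    f->needs_priv_strip = false;
}

/* write() path: only record that the file may have been modified. */
static void model_write(struct model_file *f)
{
    if (f->writable)
        f->needs_priv_strip = true;
}

/*
 * Shared mmap path: mark the file even without PROT_WRITE, because a
 * later mprotect() could make the mapping writable.
 */
static void model_mmap_shared(struct model_file *f)
{
    if (f->writable)
        f->needs_priv_strip = true;
}

/* Final close (the __fput() step): strip the bits if the flag is set. */
static void model_fput(struct model_file *f)
{
    if (f->needs_priv_strip)
        f->inode->mode &= ~(MODEL_ISUID | MODEL_ISGID);
    f->needs_priv_strip = false;
}
```

In this model, an open followed directly by a close leaves the mode
untouched, which is the "just opens & closes" case Jan raised. Marking at
mmap time rather than at page-fault time also sidesteps the mprotect()
question above.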

-Kees

>
> I've added Al Viro to CC just in case he is aware of some issues with
> this...
>
> Honza
> --
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR



--
Kees Cook
Chrome OS & Brillo Security