Re: [PATCH RFC] uaccess: user_access_begin_after_access_ok()
From: Linus Torvalds
Date: Wed Jun 03 2020 - 13:00:12 EST
[ Just a re-send without html and a few fixes for mobile editing,
since that email got eaten by the mailing list Gods ]
On Tue, Jun 2, 2020, 23:02 Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>
> Right and we do that, but that still sets the segment according to the
> current thread's flags, right?
But that shouldn't matter.
Sure, the limit might be for a 64-bit task, but it's not like anybody
really cares for the access. The "good address but I should limit user
space mappings to 32-bit" only matters for creating new mappings, not
for normal accesses to user space.
In fact, your very quotes show this effect:
> #define USER_DS MAKE_MM_SEG(TASK_SIZE_MAX)
Look, to above is what set_fs(USER_DS) will use: always the max address.
Because access_ok() doesn't care. It just checks that it's any user
address at all, and the "is this mapped" is then encoded in the
existing page tables and vma lists.
So no, the current threads flags shouldn't matter for
usrr_access_begin() and unsafe_get/put_user() at all.
Where do you see them mattering?
In contrast, some things then do take the "I'm a 32-bit app" into
account, and they may use TASK_SIZE for limit checking, but on the
whole they should be very very rare. Things like "mmap()" etc, but
that's irrelevant for this discussion. But that's why you then have:
> #define TASK_SIZE (test_thread_flag(TIF_ADDR32) ? \
> IA32_PAGE_OFFSET : TASK_SIZE_MAX)
Which actually makes that choice.
(Admittedly that's a horrible horrible hack, and we should long since
have stopped doing that hiding inside the #define, but nobody has had
the energy to make it explicit in the mmap paths, I think)
> so if this is run from a kernel thread on a 64 bit kernel, we get
> TASK_SIZE_MAX even if we got the pointer from a 32 bit userspace
> address.
But that's what *normal* access_ok() does too. TASK_SIZE_MAX is fine.
All it needs to check is that it isn't a kernel address.
> Maybe kthread_use_mm should also get the fs, not just mm.
> Then we can just use access_ok directly before the access.
I'm entirely missing why you think we should care about the fs side.
Again, an access shouldn't care, either at access_ok() time, at
user_access_begin() time, or at actual user access itself. We've got
everything set up in the page tables and the vma information.
Can you point to what I'm missing in this discussion?
Linus