Re: [PATCH v3 8/9] x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths

From: Eric Dumazet
Date: Wed Jan 17 2018 - 15:01:43 EST


On Wed, Jan 17, 2018 at 11:26 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Jan 17, 2018 at 6:17 AM, Alan Cox <alan@xxxxxxxxxxxxxxx> wrote:
>>
>> Can we kill off the remaining users of set_fs() ?
>
> I would love to, but it's not going to happen short-term. If ever.
>
> Some could be removed today: the code in arch/x86/net/bpf_jit_comp.c
> seems to be literally the ramblings of a diseased mind. There's no
> reason for the set_fs(), there's no reason for the
> flush_icache_range() (it's a no-op on x86 anyway), and the smp_wmb()
> looks bogus too.
>
> I have no idea how that braindamage happened, but I assume it got
> copied from some broken source.

At the time commit 0a14842f5a3c0e88a1e59fac5c3025db39721f74 went in,
this was the first JIT implementation for BPF, so maybe I wanted to keep
other arches from forgetting to flush the icache: you can bet that my
implementation served as a reference for the other JITs.

At that time, there were definitely various calls to flush_icache_range()
in arch/x86 and in kernel/module.c.

(I believe I must have copied the code from kernel/module.c, but I am
not sure of that.)

>
> But there are about ~100 set_fs() calls in generic code, and some of
> those really are pretty fundamental. Doing things like "kernel_read()"
> without set_fs() is basically impossible.
>
> We've had set_fs() since the beginning. The naming is obviously very
> historical. We have it for a very good reason, and I don't really see
> that reason going away.
>
> So realistically, we want to _minimize_ set_fs(), and we might want to
> make sure that it's done only in limited settings (it might, for
> example, be a good idea and a realistic goal to make sure that drivers
> and modules can't do it, and use proper helper functions like that
> "kernel_read()").
>
> But getting rid of the concept entirely? Doesn't seem likely.
>
> Linus