RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

From: Schaufler, Casey
Date: Tue Sep 04 2018 - 21:01:31 EST


> -----Original Message-----
> From: Andrea Arcangeli [mailto:aarcange@xxxxxxxxxx]
> Sent: Tuesday, September 04, 2018 4:37 PM
> To: Schaufler, Casey <casey.schaufler@xxxxxxxxx>
> Cc: Jiri Kosina <jikos@xxxxxxxxxx>; Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>;
> Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>;
> Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Josh Poimboeuf
> <jpoimboe@xxxxxxxxxx>; Woodhouse, David <dwmw@xxxxxxxxxxxx>; Oleg
> Nesterov <oleg@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx
> Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can
> be applied on arbitrary tasks
>
> Hello,
>
> On Tue, Sep 04, 2018 at 06:10:47PM +0000, Schaufler, Casey wrote:
> > The real reason to use an LSM based approach is that overloading ptrace
> > checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a
> > processor interface. Even if ptrace_may_access() does exactly what you
>
> "Side channel is a processor interface" doesn't make me optimistic,
> but I assume you're not speaking for Intel.

Sorry, I've been working in security too long for my
optimistic streak to be very wide.

> > want it to for side-channel mitigation today it would be incredibly
> > inappropriate to tie the two together for eternity. You don't want to restrict
> > the ptrace checks to only those things that are also good for side-channel,
> > and vice versa.
>
> It seems like you want to make this more configurable,

Not especially, no. I have gotten considerable feedback that
it should be configurable, and while I may not see the value myself,
I do respect the people who've requested the configurability.

> we have all
> debugfs x86 specific tweaks to disable IBPB at runtime and we don't
> allow a runtime opt-out of IBPB alone.
>
> If you shutdown either IBRS or retpolines at runtime with debugfs,
> then IBPB goes away too.
>
> Giving a finegrined way to disable only IBPB we found was overkill
> because IBPB has never been measurable if done only when the prev task
> cannot ptrace the next task (which is a superset of the too weak
> upstream not dumpable check, but still not a performance issue).
>
> Allowing IBPB runtime opt-out doesn't make much sense if you don't
> allow to disable retpolines too still at runtime. And disabling
> retpolines from LSM doesn't sounds the right place, it's an x86
> temporary quirk only relevant for current buggy CPUs.
>
> There should be a function that decides when IBPB and flush_RSB should
> be run (flush_RSB has to be run if SMEP because with SMEP there's no
> need to run flush_RSB at every kernel entry anymore), and that
> function happens to check ptrace today, but it would check other stuff
> too if we had other interfaces besides ptrace that would allow the
> prev task to read the memory of the next task. So it's not so much
> about ptrace nor about IBPB, it's about "IBPB&flush_RSB" vs "prev task
> can read memory of next task". Then each arch can implement
> "IBPB&flush_RSB" method its own way but the check is for the common
> code and it should be in the scheduler and there's just 1 version of
> this check needed.

Right, I get that. There's more to ptrace access than "I can read the
other task". There's more to what the processor is up to than IBPB.

>
> I don't think there should be a ton of different versions of this
> function each providing a different answer, which is what LSM would
> provide.

If they provide different answers there should be different
functions. It's a question of viewpoint. If you don't care about
the SELinux viewpoint you shouldn't have to include it in your
checks, any more than you care about x86 issues when you're
running on a MIPS processor.

> At most you can add a x86 debugfs tweak to shut off the logic but
> again it would make more sense if retpolines could be shut off at
> runtime too, doing it just for IBPB sounds backwards because it has
> such an unmeasurable overhead.
>
> > Yes. That would be me.
>
> Even the attempt to make an innocuous call like
> ptrace_has_cap(tcred->user_ns, mode) will eventually
> deadlock there.

Which is yet another reason there needs to be separate logic.

> I used a PTRACE_MODE_ check that the ptrace check code uses to filter
> out specific calls that may eventually enter LSM and hard lockup in
> non reproducible workloads (some false positive IBPB is ok, especially
> if it avoids a deadlock).

Yes, even security people have to worry about locking.

> Everything can be fixed in any way, but calling LSM from scheduler
> code doesn't sound the most robust thing to do in general because what
> works outside the scheduler doesn't work from within those semi atomic
> regions (it tends to work initially until it eventually
> deadlocks). The original code of such check, had all sort of deadlocks
> because of that.

What *is* the most robust way?
Yes, there are locking issues. The code in the LSM infrastructure and in
the security modules needs to be aware of that and deal with it. The SELinux
code I proposed is more complex than it could be because the audit code
requires locking that doesn't work in the switching context.

> Thanks,
> Andrea