Re: [PATCH v4] proc: Allow pid_revalidate() during LOOKUP_RCU
From: Paul Moore
Date: Tue Jan 05 2021 - 19:02:13 EST
On Tue, Jan 5, 2021 at 6:27 PM Stephen Brennan
<stephen.s.brennan@xxxxxxxxxx> wrote:
> Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:
>
> > On Tue, Jan 05, 2021 at 04:50:05PM +0000, Al Viro wrote:
> >
> >> LSM_AUDIT_DATA_DENTRY is easy to handle - wrap
> >> audit_log_untrustedstring(ab, a->u.dentry->d_name.name);
> >> into grabbing/dropping a->u.dentry->d_lock and we are done.
> >
> > Incidentally, LSM_AUDIT_DATA_DENTRY in mainline is *not* safe wrt
> > rename() - for long-named dentries it is possible to get preempted
> > in the middle of
> > audit_log_untrustedstring(ab, a->u.dentry->d_name.name);
> > and have the bugger renamed, with old name ending up freed. The
> > same goes for LSM_AUDIT_DATA_INODE...
>
> In the case of proc_pid_permission(), this preemption doesn't seem
> possible. We have task_lock() (a spinlock) held by ptrace_may_access()
> during this call, so preemption should be disabled:
>
> proc_pid_permission()
> has_pid_permissions()
> ptrace_may_access()
> task_lock()
> __ptrace_may_access()
> | security_ptrace_access_check()
> | ptrace_access_check -> selinux_ptrace_access_check()
> | avc_has_perm()
> | avc_audit() // note that has_pid_permissions() didn't get a
> | // flags field to propagate, so flags will not
> | // contain MAY_NOT_BLOCK
> | slow_avc_audit()
> | common_lsm_audit()
> | dump_common_audit_data()
> task_unlock()
>
> I understand the issue of d_name.name being freed across a preemption is
> more general than proc_pid_permission() (as other callers may have
> preemption enabled). However, it seems like there's another issue here.
> avc_audit() seems to imply that slow_avc_audit() would sleep:
>
> static inline int avc_audit(struct selinux_state *state,
> u32 ssid, u32 tsid,
> u16 tclass, u32 requested,
> struct av_decision *avd,
> int result,
> struct common_audit_data *a,
> int flags)
> {
> u32 audited, denied;
> audited = avc_audit_required(requested, avd, result, 0, &denied);
> if (likely(!audited))
> return 0;
> /* fall back to ref-walk if we have to generate audit */
> if (flags & MAY_NOT_BLOCK)
> return -ECHILD;
> return slow_avc_audit(state, ssid, tsid, tclass,
> requested, audited, denied, result,
> a);
> }
>
> If there are other cases in here where we might sleep, it would be a
> problem to sleep with the task lock held, correct?
I would expect the problem here to be the currently allocated audit
buffer isn't large enough to hold the full audit record, in which case
it will attempt to expand the buffer by a call to pskb_expand_head() -
don't ask why audit buffers are skbs, it's awful - using a gfp flag
that was established when the buffer was first created. In this
particular case it is GFP_ATOMIC|__GFP_NOWARN, which I believe should
be safe in that it will not sleep on an allocation miss.
I need to go deal with dinner, so I can't trace the entire path at the
moment, but I believe the potential audit buffer allocation is the
main issue.
--
paul moore
www.paul-moore.com