Re: [PATCH 4.10 070/111] audit: fix auditd/kernel connection state tracking

From: Peter Zijlstra
Date: Tue Feb 20 2018 - 10:19:04 EST


On Tue, Feb 20, 2018 at 09:51:08AM -0500, Paul Moore wrote:
> On Tue, Feb 20, 2018 at 9:06 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> > It's not at all clear to me what that code does, I just stumbled upon
> > __mutex_owner() outside of the mutex code itself and went WTF.
>
> If you don't want people to use __mutex_owner() outside of the mutex
> code I might suggest adding a rather serious comment at the top of the
> function, because right now I don't see anything suggesting that
> function shouldn't be used. Yes, there is the double underscore
> prefix, but that can mean a few different things these days.

Find below.

> > The comment (aside from having the most horribly style) ...
>
> Yeah, your dog is ugly too. Notice how neither comment is constructive?

I'm sure you've seen this one:

https://lkml.org/lkml/2016/7/8/625

It's all about reading code; inconsistent and unbalanced styles are just
_really_ hard on the brain.

> > ... is wrong too, because it claims it will not block when we hold that lock, while,
> > afaict, it will in fact do just that.
>
> A mutex blocks when it is held, but the audit_log_start() function
> should not block for the task that currently holds the
> audit_cmd_mutex; that is what the comment is meant to convey. I
> believe the comment makes sense, but I did write it so I'll concede
> that I'm probably the not best judge. If anyone would like to offer a
> different wording I'm happy to consider it.

The comment uses 'sleep' which is typically used to mean anything that
schedules, but then it does the schedule_timeout() thing.

> > Maybe if you could explain how that code is supposed to work and why it
> > doesn't know if it holds a lock I could make a suggestion...
>
> I just spent a few minutes looking back over the bits available in
> include/linux/mutex.h and I'm not seeing anything beyond
> __mutex_owner() which would allow us to determine the mutex owning
> task. It's probably easiest for us to just track ownership ourselves.
> I'll put together a patch later today.

Note that up until recently the mutex implementation didn't even have a
consistent owner field. And the thing is, it's very easy to use wrong,
only today I've seen a patch do: "__mutex_owner() == task", where task
was allowed to be !current, which is just wrong.

Looking through kernel/audit.c I'm not even sure I see how you would end
up in audit_log_start() with audit_cmd_mutex held.

Can you give me a few code paths that trigger this? Simple git-grep is
failing me.


---
Subject: mutex: Add comment to __mutex_owner()
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Tue Feb 20 16:01:36 CET 2018

Attempt to deter usage, this is not a public interface. It is entirely
possibly to implement a conformant mutex without having this owner
field (in fact, we used to have that).

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -66,6 +66,11 @@ struct mutex {
#endif
};

+/*
+ * Internal helper function; C doesn't allow us to hide it :/
+ *
+ * DO NOT USE (outside of mutex code).
+ */
static inline struct task_struct *__mutex_owner(struct mutex *lock)
{
return (struct task_struct *)(atomic_long_read(&lock->owner) & ~0x07);