Re: [PATCH v2 2/2] audit: Mark CONFIG_AUDITSYSCALL BROKEN and update help text
From: Andy Lutomirski
Date: Wed May 28 2014 - 22:42:49 EST
On Wed, May 28, 2014 at 7:09 PM, Eric Paris <eparis@xxxxxxxxxx> wrote:
> NAK
>
> On Wed, 2014-05-28 at 18:44 -0700, Andy Lutomirski wrote:
>> Here are some issues with the code:
>> - It thinks that syscalls have four arguments.
>
> Not true at all. It records the registers that would hold the first 4
> entries on syscall entry, for use later if needed, as getting those
> later on some arches is not feasible (see ia64). It makes no assumption
> about how many syscalls a function has.
What about a5 and a6?
>
>> - It's a performance disaster.
>
> Only if you enable it. If you don't use audit it is a single branch.
> Hardly a disaster.
It forces all syscalls into the slow path and it can do crazy things
like building audit contexts just in case actual handling of the
syscall triggers an audit condition so that the exit path can log the
syscall. That's way worse than a single branch.
Try it: benchmark getpid on Fedora and then repeat the experiment with
syscall auditing fully disabled. The difference is striking.
>
>> - It assumes that syscall numbers are between 0 and 2048.
>
> There could well be a bug here. Not questioning that. Although that
> would be patch 1/2
Even with patch 1, it still doesn't handle large syscall numbers -- it
just assumes they're not audited.
>
>> - It's unclear whether it's supposed to be reliable.
>
> Unclear to whom?
To me.
If some inode access or selinux rule triggers an audit, is the auditsc
code guaranteed to write an exit record? And see below...
>
>> - It's broken on things like x32.
>> - It can't support ARM OABI.
>
> Some arches aren't supported? And that makes it BROKEN?
It acts like x32 is supported. OABI is fixable. Admittedly, OABI not
being supported is fine, now that it's correctly disabled.
>
>> - Its approach to freeing memory is terrifying.
>
> What?
>
> None of your reasons hold water. Bugs need to be fixed. Try reporting
> them... This is just stupid.
...for reference, I've *tried* to fix the performance issues. I've
run into all kinds of problems.
The actual context code is incredibly tangled. It's unclear whether
it would be permissible for various rule combinations to suppress exit
records triggered by selinux. Any effort to actually deal with this
stuff will have to deal with the fact that the audit code builds up
lots of state and frees it on syscall exit. That means that the exit
hook needs to be actually invoked, otherwise we leak. I was never
able to convince myself that the current code is correct wrt kernel
threads.
In summary, the code is a giant mess. The way it works is nearly
incomprehensible. It contains at least one severe bug. I'd love to
see it fixed, but for now, distributions seem to think that enabling
CONFIG_AUDITSYSCALL is a reasonable thing to do, and I'd argue that
it's actually a terrible choice for anyone who doesn't actually need
syscall audit rules. And I don't know who needs these things.
I've heard an argument that selinux benefits from this, since the
syscall exit log helps with diagnosing denials. I think that's
absurd. I've never found anything wrong with the denial record itself
that would be helped by seeing the syscall log. (And there's always
syscall_get_xyz, which would cover this case splendidly with about an
order of magnitude less code.)
As I said, I'm not going to push hard for this code to be marked
BROKEN. But I think it's rather broken, and I'm definitely not
volunteering to fix it.
--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/