Re: regression in 4.14-rc2 caused by apparmor: add base infastructure for socket mediation

From: John Johansen
Date: Tue Oct 03 2017 - 02:32:09 EST


On 10/02/2017 10:15 PM, James Bottomley wrote:
> On Mon, 2017-10-02 at 21:11 -0700, John Johansen wrote:
>> On 10/02/2017 09:02 PM, James Bottomley wrote:
>>>
>>> The specific problem is that dnsmasq refuses to start on openSUSE
>>> Leap 42.2. The specific cause is that an attempt to open a
>>> PF_LOCAL socket gets EACCES. This means that networking doesn't
>>> function on a system with a 4.14-rc2 kernel.
>>>
>>> Reverting commit 651e28c5537abb39076d3949fb7618536f1d242e
>>> (apparmor: add base infastructure for socket mediation) causes the
>>> system to function again.
>>>
>>
>> This is not a kernel regression,
>
> Regression means something that worked in a previous version of the
> kernel which is broken now. This problem falls within that definition.
>

Sure, it's a regression for suse based systems. It isn't, however, a
regression in the kernel code or interface. The kernel makes the
information available; it's a matter of how the user space and policy
are configured.
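
For reference, the symptom itself is easy to probe from user space. The
following is only a minimal sketch, assuming it is run under the same
confinement as the failing service; the helper name probe_local_socket
is made up for illustration and is not part of any apparmor tooling.

```python
import errno
import socket


def probe_local_socket():
    """Try to create a PF_LOCAL (AF_UNIX) stream socket and report the result."""
    try:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    except OSError as exc:
        # EACCES is how a denied socket creation shows up to the caller.
        if exc.errno == errno.EACCES:
            return "denied"
        # Any other failure is reported by its symbolic errno name.
        return "error: " + errno.errorcode.get(exc.errno, str(exc.errno))
    sock.close()
    return "ok"


if __name__ == "__main__":
    print(probe_local_socket())
```

A result of "denied" only says that socket creation returned EACCES; it
does not by itself show which policy rule is missing.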

It is entirely possible to use the 4.14 kernel on suse without having
to modify policy if the policy version/feature set is pinned. However
this is not a feature that suse seems to be using. Instead suse policy
is tracking and enforcing all kernel supported features when they
become available, regardless of whether the policy has been updated.

This makes sense for a policy developer's machine, not so much for a
general user. I will have to discuss this with Christian and Goldwyn.


>> it is because opensuse dnsmasq is starting with policy that
>> doesn't allow access to PF_LOCAL sockets
>
> Because there was no co-ordination between their version of the patch
> and yours. If you're sending in patches that you know might break
> systems because they need a co-ordinated rollout of something in
> userspace then it would be nice if you could co-ordinate it ...
>

This information was communicated more than once. That is not to say
there were no issues with the landing, or else you would not have seen
this. In fact I would say this particular sync was handled poorly and
we as an upstream certainly have to take some of the blame for it.

The userspace that supported the 4.14 kernel changes landed long
ago. It was specific policy updates that were missing.

Ideally your policy would have been pinned to a specific kernel
feature set, so that kernel changes would not have resulted in policy
issues.

> Doing it in the merge window and not in -rc2 would also be helpful
> because I have more expectation of a userspace mismatch from stuff in
> the merge window.
>

Certainly, and this would have landed during the merge window except
for an issue with the security tree. This particular series lived in
-next for several weeks before landing, and I would never have asked
for it to be pulled as late as it was except for the issue around the
security tree this last cycle.

>> Christian Boltz the opensuse apparmor maintainer has been working
>> on a policy update for opensuse see bug
>>
>> https://bugzilla.opensuse.org/show_bug.cgi?id=1061195
>
> Well, that looks really encouraging: The line about "To give you an
> impression what "lots of" means - I had to adjust 40 profiles on my
> laptop". The upshot being apart from a bandaid, openSUSE still has no
> co-ordinated fix for this.
>

Yes, it is a change that affects policy; the same can be said for any
other MAC system when new mediation is added. It can be fixed by either
configuring the feature set/version that policy is targeting or updating
policy.

As for policy changes, this particular change can mostly be fixed by
an adjustment to the abstractions. The bandaid referenced has to do with
Christian choosing to use only what is supported in 4.14 instead of
the upstream solution which contains rules for work targeted beyond
4.14, even though userspace supports those rules already and will
compile them to a policy that works in 4.14.

However Christian wants to update the suse policy using the 4.14
kernel because he does not feel that he can properly verify the
upstream policy changes on suse with 4.14. This is an understandable
stance for him to take, but it does mean there is some disconnect
between what is in the upstream apparmor project and what is in suse.

Regardless, this is a change that you shouldn't have noticed, so it's
obvious the coordination was off and needs to be improved.