Re: Boot regression caused by kauditd

From: Paul Moore
Date: Fri Apr 28 2017 - 11:30:43 EST


On Thu, Apr 27, 2017 at 8:47 PM, Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> In that case please send a proper inline patch to the audit mailing list
> and we'll review it.
>
> Thanks.

Now that I'm back in front of a proper screen/keyboard I've been
looking over your patch. You are right that the current RCU usage is
wrong, but there are quite a few things I would like to see changed in
your patch. I'm working on something right now; I'll post an RFC draft
to the audit list and CC you once I get this sorted out. Expect
something in a few hours.

Also, once you've had a look at this new patch, and assuming you are
okay with it, I'd like to add your sign-off to it. This may not be
your patch exactly, but a significant portion of it is borrowed from
the patch you sent yesterday.

> On April 27, 2017 7:41:45 PM Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>
>> On Thu, Apr 27, 2017 at 3:38 PM, Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
>>> On Thu, Apr 27, 2017 at 5:45 PM, Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>>>> On Thu, Apr 27, 2017 at 2:35 PM, Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>>>>> On Thu, Apr 27, 2017 at 1:31 PM, Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>>>>>> On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
>>>>>>> Thanks for the report, this is the only one like it that I've seen.
>>>>>>> I'm looking at the code in Linus' tree and I'm not seeing anything
>>>>>>> obvious ... looking at the trace above it appears that the problem is
>>>>>>> when get_net() goes to bump the refcount and the passed net pointer is
>>>>>>> NULL; unless I'm missing something, the only way this would happen in
>>>>>>> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
>>>>>>> auditd_conn.net pointer is NULL.
>>>>>>>
>>>>>>> That shouldn't happen.
>>>>>>>
>>>>>>
>>>>>> Looking at the code that reads/writes the global auditd_conn,
>>>>>> I don't see how it even works with RCU+spinlock. RCU operates on
>>>>>> pointers, and you have to make a copy, as its name implies.
>>>>>> But it looks like you simply use RCU+spinlock as a traditional
>>>>>> rwlock, which doesn't work.
>>>>>
>>>>> The attached patch seems to work for me; I booted my VM 4 times
>>>>> and so far there has been no crash or warning.
>>>>>
>>>>
>>>> Or even better, save a memory allocation on the reset path...
>>>
>>> I need to step away from my laptop for the evening so I can't give
>>> this a proper review until tomorrow (sending patches as attachments
>>> makes it difficult to review), but on quick glance I did notice a few
>>> small things I would like to see changed. However, since there is no
>>> normal commit description and sign-off, I'm guessing you sent these
>>> out as a suggestion and not a proper patch submission, yes/no? If
>>> that's the case, I'll work up a proper fix tomorrow and share it with
>>> you for comment/review, but if you were planning on sending a proper
>>> patch let me know and I'll wait until I see something in my inbox from
>>> you.
>>
>> I want you to give it a sanity check before I submit a formal one. ;)
>> If you don't reject it, I will send a formal one with a description
>> and SoB.
>>
>> Thanks.
>
>
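For readers following the RCU point quoted above (pid set while net is
still NULL), here is a minimal userspace sketch of the copy-then-publish
idea. It is an analogy only, not kernel code and not the patch discussed
in this thread: C11 atomics stand in for rcu_assign_pointer() and
rcu_dereference(), and struct conn, conn_set(), conn_reset() and
conn_read() are hypothetical names made up for illustration.

```c
/* Userspace analogy only; every name here is hypothetical. */
#include <stdatomic.h>
#include <stdlib.h>

/* Stand-in for the connection state: fields that must change together. */
struct conn {
    int pid;
    const void *net;
};

/* Single published pointer; NULL means "no daemon registered". */
static _Atomic(struct conn *) cur_conn;

/* Writer: build a complete copy first, then publish it with one store.
 * On success returns 0 and hands the previous object back via *old.
 * The caller may free *old only once no reader can still hold it
 * (in the kernel that means waiting for a grace period). */
static int conn_set(int pid, const void *net, struct conn **old)
{
    struct conn *new_conn = malloc(sizeof(*new_conn));

    if (!new_conn)
        return -1;
    new_conn->pid = pid;
    new_conn->net = net;
    *old = atomic_exchange_explicit(&cur_conn, new_conn,
                                    memory_order_acq_rel);
    return 0;
}

/* Writer: reset needs no allocation, it just publishes NULL and
 * returns the previous object for deferred freeing. */
static struct conn *conn_reset(void)
{
    return atomic_exchange_explicit(&cur_conn, NULL,
                                    memory_order_acq_rel);
}

/* Reader: a single load yields either NULL or a fully initialised
 * object, so pid and net are never observed half-updated.
 * Returns 1 and fills *pid and *net if a connection is published. */
static int conn_read(int *pid, const void **net)
{
    struct conn *c = atomic_load_explicit(&cur_conn,
                                          memory_order_acquire);

    if (!c)
        return 0;
    *pid = c->pid;
    *net = c->net;
    return 1;
}
```

The sketch deliberately leaves out safe reclamation under concurrent
readers, which is the part RCU grace periods provide in the kernel; here
the caller is simply handed the old object and must decide when freeing
it is safe.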



--
paul moore
www.paul-moore.com