Re: [inotify] fee1df54b6: BUG_kmalloc-#(Not_tainted):Freepointer_corrupt

From: Andrey Vagin
Date: Tue Dec 13 2016 - 17:18:23 EST


On Tue, Dec 13, 2016 at 11:34 AM, Nikolay Borisov
<n.borisov.lkml@xxxxxxxxx> wrote:
>
>
> On 13.12.2016 20:51, Eric W. Biederman wrote:
>> Nikolay Borisov <n.borisov.lkml@xxxxxxxxx> writes:
>>
>>> So this thing resurfaced again and I took a hard look into the code but
>>> couldn't find anything suspicious. So the allocating and freeing
>>> contexts leads me to believe it's the 'tbl' pointer that is being
>>> corrupted. The only thing which I do with it is to increase it by two.
>>>
>>> Perhaps some liveness issues.
>>
>> To me it feels like a double free somewhere. Like we call dec_ucount
>> and thus put_ucount multiple times in a way that goes to 0.
>>
>> Perhaps there is a peculiarity in the existing code which allows the
>> count to go to zero which we don't notice because we don't free anything
>> when the count goes to zero today.
>>
>> Perhaps there is some subtle semantic mismatch between your conversion
>> and the inotify code.
>>
>> I don't know if you made a subtle misreading of the code, or if
>> there is an existing bug that your changes took from harmless to
>> problematic, but the evidence is overwhelming that something
>> is going wrong and it is your patch that brings it out.
>>
>> If it helps the openvz folks apparently reproduced this with the criu
>> regression tests and the appropriate kernel debug options, and confirmed
>> the failure was your patch.
>
> Great, but I think I missed this conversation. Care to send the relevant
> threads? I'd like to get to the bottom of this and have it merged.
>
> @openvz guys - if you care to shout with more details I'd love to work
> on getting this fixed!

Hi Nikolay,

We run the CRIU tests on linux-next, and a few days ago they triggered
a kernel bug:
http://www.spinics.net/lists/linux-mm/msg118204.html

If you want to run these tests to reproduce the bug, follow
these steps:

$ apt-get install gcc make protobuf-c-compiler libprotobuf-c0-dev libaio-dev \
libprotobuf-dev protobuf-compiler python-ipaddr libcap-dev \
libnl-3-dev gdb bash python-protobuf
$ git clone https://github.com/xemul/criu.git
$ cd criu
$ make
$ python test/zdtm.py run -a -p 4

Here is the config file we use to build the kernel:
https://github.com/avagin/criu-jenkins-digitalocean/blob/master/jenkins-scripts/config

I recommend booting the kernel with slub_debug=FZ (F enables sanity
checks, Z enables red zoning).
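
Before starting a long test run it may be worth confirming that the
running kernel was really booted with a slub_debug option. A minimal
sketch, assuming a Linux system with /proc mounted; the helper name
has_slub_debug is made up for this example and is not part of CRIU or
the kernel tree:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel command line passed as $1
# contains a slub_debug option (bare, or with a value such as =FZ).
has_slub_debug() {
    # Pad with spaces so the option matches only as a whole word.
    case " $1 " in
        *" slub_debug "*|*" slub_debug="*) return 0 ;;
    esac
    return 1
}

# /proc/cmdline holds the command line the running kernel was booted with.
if [ -r /proc/cmdline ] && has_slub_debug "$(cat /proc/cmdline)"; then
    echo "slub_debug is set"
else
    echo "slub_debug is not set"
fi
```

This only checks that the option is present on the command line; it does
not verify which flags were given.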

Don't hesitate to ask me if you have any questions.

Thanks,
Andrei
>
>>
>> The current state of play is that I would love to merge this if we can
>> track down this issue. I dropped this from my tree before I sent my pull
>> request to Linus so there is no emergency to get this fixed.
>>
>> Eric
>>
>>