Re: linux-next: Tree for Apr 18 [ call-trace: drm | x86 | smp | rcurelated? ]

From: Sedat Dilek
Date: Sat Apr 20 2013 - 17:03:29 EST


On Sat, Apr 20, 2013 at 10:00 PM, Davidlohr Bueso
<davidlohr.bueso@xxxxxx> wrote:
> On Sat, 2013-04-20 at 02:19 +0200, Sedat Dilek wrote:
>> On Sat, Apr 20, 2013 at 2:06 AM, Sedat Dilek <sedat.dilek@xxxxxxxxx> wrote:
>> > On Sat, Apr 20, 2013 at 1:02 AM, Linus Torvalds
>> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> >> On Fri, Apr 19, 2013 at 3:55 PM, Sedat Dilek <sedat.dilek@xxxxxxxxx> wrote:
>> >>>
>> >>> Davidlohr pointed to this patch (tested the triplet):
>> >>>
>> >>> ipc, sem: do not call sem_lock when bogus sma:
>> >>> https://lkml.org/lkml/2013/3/31/12
>> >>>
>> >>> Is that what you mean?
>> >>
>> >> Yup.
>> >>
>> >
>> > Davidlohr Bueso (1):
>> > ipc, sem: do not call sem_lock when bogus sma
>> >
>> > Linus Torvalds (1):
>> > crazy rcu double free debug hack
>> >
>> > With ***both*** patches applied I am able to build a Linux-kernel with
>> > 4 parallel-make-jobs again.
>> > David's or your patch alone are not sufficient!
>> >
>>
>> [ Still both patches applied ]
>>
>> To correct myself... The 1st run was OK.
>>
>> The 2nd run shows a NULL-pointer-deref (excerpt):
>>
>> [ 178.490583] BUG: spinlock bad magic on CPU#1, sh/8066
>> [ 178.490595] lock: 0xffff88008b53ea18, .magic: 6b6b6b6b, .owner:
>> make/8068, .owner_cpu: 3
>> [ 178.490599] BUG: unable to handle kernel NULL pointer dereference
>> at (null)
>> [ 178.490608] IP: [<ffffffff812bacd0>] update_queue+0x70/0x210
>> [ 178.490610] PGD 0
>> [ 178.490612] Oops: 0000 [#1] SMP
>> ...
>
> The exit_sem() >> do_smart_update() >> update_queue() calls seem pretty
> well protected. Furthermore we're asserting that sma->sem_perm.lock is
> taken. This could just be a consequence of another issue. Earlier this
> week Andrew pointed out a potential race in semctl_main() where
> sma->sem_perm.deleted could be changed when cmd == GETALL.
>
> Sedat, could you try the attached patch to keep the ipc lock acquired
> (on top of the three patches you're already using) and let us know how
> it goes? We could also just have the RCU read lock instead of
> ->sem.perm.lock for GETALL, but lets play it safe for now.
>

I had to refresh your patch and it contain strange ^M charachters as well.

NOPE, the machine gets FROZEN (cold reboot).

- Sedat -

P.S.: I have attached all 4 patches against next-20130419 in case
someone wants to follow.

> Thanks,
> Davidlohr
>

Attachment: 0001-ipc-sem-untangle-RCU-locking-with-find_alloc_undo.patch
Description: Binary data

Attachment: 0002-ipc-sem-fix-lockdep-false-positive.patch
Description: Binary data

Attachment: 0003-ipc-sem-do-not-call-sem_lock-when-bogus-sma.patch
Description: Binary data

Attachment: 0004-ipc-sem-check-if-the-ipc-lock-has-been-already-taken.patch
Description: Binary data