Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

From: jmerkey
Date: Thu Aug 21 2008 - 10:34:46 EST


> jmerkey@xxxxxxxxxxxxxxxxxxxxx wrote:
>> It's simple to reproduce. Take away the volatile declaration for the
>> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
>> references and watch the thing lock up in SMP with multiple processors
>> in
>> the debugger each stuck with their own local copy of debug_lock.
>
> I'm having a quick look at mdb-2.6.27-rc4-ia32-08-20-08.patch at the
> moment. Speaking of debug_lock()...:
>
> You use spin_trylock_irqsave((spinlock_t *)&rlock->lock, rlock->flags)
> in there. Minor nit: The pointer type cast is unnecessary.
>
> Major problem: rlock->flags is wrong in this call. Use an on-stack
> flags variable for the initial spin_trylock_irqsave. Ditto in the
> following call of spin_trylock_irqsave.
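>
> Roughly (a sketch only; rlock and its .lock field are named as in your
> patch):
>
>   unsigned long flags;   /* on this CPU's stack, not in the shared rlock */
>
>   if (spin_trylock_irqsave(&rlock->lock, flags)) {
>           /* ... critical section ... */
>           spin_unlock_irqrestore(&rlock->lock, flags);
>   }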
>
> Next major problem with debug_lock() and debug_unlock(): The reference
> counting doesn't work. You need an atomic_t counter. Have a look at
> the struct kref accessors for example, or even make use of the kref API.
> Or if it isn't feasible to fix with atomic_t, add a second spinlock to
> rlock_t to ensure integrity of .count (and of the .processor if
> necessary).
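>
> A sketch of the atomic_t variant (the .count field name is taken from
> your rlock_t; the surrounding calls are only illustrative):
>
>   atomic_t count;                  /* in rlock_t, instead of a plain integer */
>
>   atomic_inc(&rlock->count);       /* take a reference */
>
>   if (atomic_dec_and_test(&rlock->count))
>           release_the_lock();      /* hypothetical helper: whatever your teardown is */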
>
> Furthermore, I have doubts about the loop which is entered by CPU B
> while CPU A holds the rlock. You are fully aware that atomic_read(a) &&
> !atomic_read(b) in its entirety is not atomic, I hope.
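>
> That is, in
>
>   if (atomic_read(&a) && !atomic_read(&b)) {
>           /* by the time we get here, either counter may already have
>              been changed by another CPU -- only each individual read
>              is atomic, not the test as a whole */
>   }
>
> there is a window between the two reads (and after them) in which the
> other CPU can modify a or b.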
>
> All this aside, I don't see *anything* in debug_lock and _unlock which
> would necessitate volatile. Well, volatile might have papered over some
> of these bugs.
>
> PS:
> Try to cut down on #if/#endif clutter. It should be possible to reduce
> them at least in .c files; .h are a different matter. For example,
> #if MDB_DEBUG_DEBUGGER
> DBGPrint("something");
> #endif
> can be trivially reduced to
> dbg_mdb_printk("something");
> where dbg_mdb_printk() is defined as an inline function which does
> nothing when MDB_DEBUG_DEBUGGER is false.
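>
> For example (a sketch; it assumes DBGPrint() takes printf-style
> arguments, so the enabled case has to be a macro in order to forward
> the varargs):
>
>   #if MDB_DEBUG_DEBUGGER
>   #define dbg_mdb_printk(fmt, ...)  DBGPrint(fmt, ##__VA_ARGS__)
>   #else
>   static inline void dbg_mdb_printk(const char *fmt, ...) { }
>   #endif
>
> The callers then compile unchanged either way, and the disabled variant
> costs nothing.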
>
> PS2:
> Why are there this many debug printks anyway?
> --
> Stefan Richter
> -=====-==--- =--- =-=-=
> http://arcgraph.de/sr/
>

The code in debug_lock works provided this memory location is actually
SHARED between the processors. The various race conditions you describe
are valid cases, but the only modifier of .count and .lock is the processor
that obtains the spinlock -- the rest are readers. The code works well, of
course, as long as this memory location really is shared between the
processors and the read/write operations on it are serialized.

Even in SMP, at various times it is necessary for the processors to
perform serializing operations. You cannot stay in checker-scoreboard land
for everything.

Jeff

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/