On Wed, 26 Jul 2000, David S. Miller wrote:
> Date: Tue, 25 Jul 2000 04:40:02 +0200 (CEST)
> From: Andrea Arcangeli <andrea@suse.de>
>
> This fixes an SMP race condition in the non-atomic version of
> br_read_lock (at least with the alpha semantics of rmb()) and removes
> an rmb() that is not necessary in the slow path.
>
> Your code does nothing more than change an rmb() to an mb(), and you
> claim this fixes a bug particularly on Alpha. Then if one goes to
I don't claim this fixes a bug on Alpha (I'm sorry, I was misleading with
"the alpha semantics of rmb"). What I meant is that Alpha defines wmb()
this way:

	WMB guarantees that writes to memory-like regions that precede the
	WMB are ordered before writes to memory-like regions that follow
	the WMB.
So my own definition of RMB becomes:

	RMB guarantees that reads to memory-like regions that precede the
	RMB are ordered before reads to memory-like regions that follow
	the RMB.
So this code:
	(*ctr)++;
	rmb();
	if (spin_is_locked(lock)) {
is obviously wrong. Doing the (*ctr)++ above the rmb() means nothing: the
CPU will increase the per-CPU counter but may flush the write buffer only
after checking spin_is_locked(). That breaks the ordering, and the br lock
then breaks badly.
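To make that concrete, here is roughly how the read-side fast path has to
look with the fix (a sketch, not a verbatim copy of include/linux/brlock.h;
"ctr" points to this CPU's reader counter, "lock" is the global write lock):

again:
	(*ctr)++;			/* mark this CPU as a reader */
	mb();				/* the _store_ to the counter must hit
					   memory before the _load_ done by
					   spin_is_locked(); rmb() orders only
					   loads against loads, so it cannot
					   hold the store back */
	if (spin_is_locked(lock)) {
		/* fallback path, see the sketch further down */
	}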
The write lock operation was safe only because spin_lock() always implies
an mb (not just rmb) barrier before continuing past the spin_lock().
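For reference, the write side does roughly this (again a sketch;
__brlock_read_count() is a made-up name for reading CPU i's counter):

	spin_lock(lock);		/* implies a full mb barrier */
	for (i = 0; i < smp_num_cpus; i++)
		while (__brlock_read_count(i))
			barrier();	/* wait until every reader that raced
					   with us has seen the lock and
					   backed off */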
> For the record, I believe it is correct on UltraSparc, and that cpu has
> the most fine-grained control of memory ordering desired in the
> rmb()/mb()/wmb() interfaces. It allows one to specify these operations
If the semantics of the sparc64 rmb() aren't defined as I described above,
I'd like to know.
> precisely. Therefore, Alpha must give the same or more strict
> semantics for these interfaces.
Of course, Alpha has to fall back to the asm mb in the rmb() case. No
argument about that.
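For reference, asm-alpha/system.h does roughly this (quoting from memory):

	#define mb()	__asm__ __volatile__("mb"  : : : "memory")
	#define rmb()	__asm__ __volatile__("mb"  : : : "memory")
	#define wmb()	__asm__ __volatile__("wmb" : : : "memory")

(There is no read-only barrier instruction, so rmb() ends up being the
full MB.)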
> Finally, referring to the other part of your change, I believe the
> rmb() removal is wrong. The counter decrement memory operations must
> not get moved past the memory read in the spinning loop.
Even if that were true, the rmb() was not enforcing it: rmb() will happily
allow the counter decrement to be flushed to memory after the memory read
in the spinning loop. That's why I said the rmb() there is useless anyway.
I don't see why we need any ordering in the fallback path. The
spin_is_locked() loop is just a slow path, like in the normal spinlocks.
Nothing needs to be atomic there. We're just waiting for the write lock to
go away, and we do the decrement to make sure it will go away eventually.
The counter decrement will be flushed eventually, thus no need for a wmb()
either. OTOH doing a wmb() (instead of the current rmb()) would have the
point of making the change visible to the other CPUs ASAP, but I don't
think it's worth it (the write lock case is not worth optimizing and I
prefer to save icache, as said).
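So the fallback path reduces to this (same sketch as above, "again:" being
the label at the (*ctr)++ in the fast path):

	(*ctr)--;			/* back off; this store only has to
					   become visible _eventually_, so
					   that the writer's drain loop
					   terminates; no wmb() or rmb()
					   needed */
	while (spin_is_locked(lock))
		barrier();		/* plain contention spin, just like a
					   normal spinlock's slow path */
	goto again;			/* retry the fast path */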
Andrea