Re: [RFC V2] test_bit before clear files_struct bits
From: Kirill A. Shutemov
Date: Tue Feb 10 2015 - 17:46:16 EST
On Tue, Feb 10, 2015 at 12:49:46PM -0800, Linus Torvalds wrote:
> On Tue, Feb 10, 2015 at 12:22 PM, Andrew Morton
> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > The patch is good but I'm still wondering if any CPUs can do this
> > speedup for us. The CPU has to pull in the target word to modify the
> > bit and what it *could* do is to avoid dirtying the cacheline if it
> > sees that the bit is already in the desired state.
> Sadly, no CPU I know of actually does this. Probably because it would
> take a bit more core resources, and conditional writes to memory are
> not normally part of an x86 core (it might be more natural for
> something like old-style ARM that has conditional writes).
> Also, even if the store were to be conditional, the cacheline would
> have been acquired in exclusive state, and in many cache protocols the
> state machine is from exclusive to dirty (since normally the only
> reason to get a cacheline for exclusive use is in order to write to
> it). So a "read, test, conditional write" ends up actually being more
> complicated than you'd think - because you *want* that
> exclusive->dirty state for the case where you really are going to
> change the bit, and to avoid extra cache protocol stages you don't
> generally want to read the cacheline into a shared read mode first
> (only to then have to turn it into exclusive/dirty as a second state)
That all sounds resonable.
But I still fail to understand why my micro-benchmark is faster with
branch before store comparing to plain store.
In this case we would not have intermidiate shared state, because we don't
have anybody else to share cache line with. So with branch we would have
the smae E->M and write-back to memory as without branch. But it doesn't
explain why branch makes code faster.
Kirill A. Shutemov
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/