dealing with barriers (was Re: [PATCH] firewire: fw-core: enforcewrite order when updating fw_device.generation)

From: Stefan Richter
Date: Thu Nov 01 2007 - 05:52:28 EST


Nick Piggin wrote:
> On Thursday 01 November 2007 12:49, Stefan Richter wrote:
>> fw_device.node_id and fw_device.generation are accessed without mutexes.
>> We have to ensure that all readers will get to see node_id updates
>> before generation updates.
>>
>
> Hi, a few points:

Thanks, this is appreciated.

> - can change it to use spinlocks instead? This would be most
> preferable.

It could be done, but I don't find it preferable for this purpose.
Although the struct members in question are basically part of fw-core's
API (kernel-internal API), there won't be a lot of places which access
these members.

(Besides, lockless algorithms are essential to FireWire driver code
everywhere where CPU and controller interact. So IMO these lockless
accesses don't constitute a whole new level of complexity in these drivers.)

> - if not, you need comments.

So far I tended to disagree every time when checkpatch.pl bugged me
about barriers without comments. But maybe that was because I had a
wrong idea of what kind of comment should go there. There would
certainly be no value in a comment which effectively says "trust me, I
know we need a barrier here". However, thinking about it again, it
occurs to me that I should add the following comments:

1.) At respective struct type definitions: A comment on how struct
members are to be accessed and why.

2.) At the occurrences of rmb() and wmb(): A comment on which
accesses the particular barrier divides. This is in order to get
the barriers properly updated (i.e. moved or removed) when
surrounding code is reordered, APIs reworked, or whatever.

Of course this division of the Why and the What only applies to accesses
like those in the patch --- i.e. API-like data types which aren't
abstracted by accessor functions --- while there may be other
requirements on comments for other usage types of barriers.

Or am I still wrong in my judgment of how barriers should be documented?

> - you also seem to be missing rmb()s now. I see a couple in the
> firewire directory, but nothing that seems to be ordering loads
> of these particular fields.

That's right. Of course the wmb()s don't make sense if the reader side
isn't fixed up accordingly. I did this for all places which I spotted
yesterday in the patches
"firewire: fw-core: react on bus resets while the config ROM is being
fetched", http://lkml.org/lkml/2007/10/31/464 (Hmm, this has an unclear
changelog.)
"firewire: fw-sbp2: enforce read order of device generation and node
ID", http://lkml.org/lkml/2007/10/31/465

I didn't fold all these patches into one because these two other patches
include also other changes to the respective read sides. I should have
pointed this out in this first patch.

Also, it could be that I have overlooked one more reader last night; I
will reexamine it.

> - use smp_*mb() if you are just ordering regular cacheable RAM
> accesses.

Hmm, right. Looks like the smp_ variants are indeed the right thing to
do here.

> Also, diffstat is a bit wrong... maybe you posted the wrong version?

Oops, this happened when I worked on what became the "fw-core: react on
bus resets..." patch. So the diffstat is wrong but...

>> Signed-off-by: Stefan Richter <stefanr@xxxxxxxxxxxxxxxxx>
>> ---
>> drivers/firewire/fw-device.c | 6 ++++++
>> drivers/firewire/fw-topology.c | 1 +
>> 2 files changed, 7 insertions(+)
>>
>> Index: linux/drivers/firewire/fw-device.c
>> ===================================================================
>> --- linux.orig/drivers/firewire/fw-device.c
>> +++ linux/drivers/firewire/fw-device.c
>> @@ -808,6 +813,7 @@ void fw_node_event(struct fw_card *card,
>>
>> device = node->data;
>> device->node_id = node->node_id;
>> + wmb();
>> device->generation = card->generation;
>> if (atomic_read(&device->state) == FW_DEVICE_RUNNING) {
>> PREPARE_DELAYED_WORK(&device->work, fw_device_update);
>> Index: linux/drivers/firewire/fw-topology.c
>> ===================================================================
>> --- linux.orig/drivers/firewire/fw-topology.c
>> +++ linux/drivers/firewire/fw-topology.c
>> @@ -518,6 +518,7 @@ fw_core_handle_bus_reset(struct fw_card
>> card->bm_retries = 0;
>>
>> card->node_id = node_id;
>> + wmb();
>> card->generation = generation;
>> card->reset_jiffies = jiffies;
>> schedule_delayed_work(&card->work, 0);

...the diff is what I intended. Anyway, maybe I should make it a habit
to send patches only between 10 AM and 10 PM local time. :-)
--
Stefan Richter
-=====-=-=== =-== ----=
http://arcgraph.de/sr/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/