Re: [PATCH v2 3/4] firewire: core: Prevent device_find_child() from modifying caller's match data
From: Zijun Hu
Date: Mon Aug 19 2024 - 07:41:39 EST
On 2024/8/19 16:58, Takashi Sakamoto wrote:
>
> Hi,
>
> On 2024/8/18 22:34, Zijun Hu wrote:
>> On 2024/8/17 17:57, Takashi Sakamoto wrote:
>>> ======== 8< --------
>>>
>>> From ceaa8a986ae07865eb3fec810de330e96b6d56e2 Mon Sep 17 00:00:00 2001
>>> From: Takashi Sakamoto <o-takashi@xxxxxxxxxxxxx>
>>> Date: Sat, 17 Aug 2024 17:52:53 +0900
>>> Subject: [PATCH] firewire: core: update fw_device outside of
>>> device_find_child()
>>>
>>> When detecting updates of bus topology, the data of fw_device is newly
>>> allocated and caches the content of configuration ROM from the
>>> corresponding node. Then, the tree of device is sought to find the
>>> previous data of fw_device corresponding to the node, since in IEEE 1394
>>> specification numeric node identifier could be changed dynamically every
>>> generation of bus topology. If it is found, the previous data is updated
>>> and reused, then the newly allocated data is going to be released.
>>>
>>> The above procedure is done in the call of device_find_child(), however
>>> that somewhat abuses the intent of the helper function, since the
>>> call would not only find but also update.
>>>
>>> This commit splits the update outside of the call.
>>> ---
>>> drivers/firewire/core-device.c | 109 ++++++++++++++++-----------------
>>> 1 file changed, 54 insertions(+), 55 deletions(-)
>>>
>>> diff --git a/drivers/firewire/core-device.c b/drivers/firewire/core-device.c
>>> index bc4c9e5a..62e8d839 100644
>>> --- a/drivers/firewire/core-device.c
>>> +++ b/drivers/firewire/core-device.c
>>> ...
>>> @@ -1038,6 +988,17 @@ int fw_device_set_broadcast_channel(struct device *dev, void *gen)
>>> return 0;
>>> }
>>>
>>> +static int compare_configuration_rom(struct device *dev, void *data)
>>> +{
>>> + const struct fw_device *old = fw_device(dev);
>>> + const u32 *config_rom = data;
>>> +
>>> + if (!is_fw_device(dev))
>>> + return 0;
>>> +
>>> + return !!memcmp(old->config_rom, config_rom, 6 * 4);
>>
>> !memcmp(old->config_rom, config_rom, 6 * 4) ?
>
> Indeed.
>
>> is this extra condition old->state == FW_DEVICE_GONE required ?
>>
>> namely, is it okay for below return ?
>> return !memcmp(old->config_rom, config_rom, 6 * 4) && old->state ==
>> FW_DEVICE_GONE
>
> If so, atomic_read() should be used; however, I avoid it since the access
> to the state member would then happen twice in the path that reuses the instance.
>
It sounds good not to append the extra condition.
>>> +}
>>> +
>>> static void fw_device_init(struct work_struct *work)
>>> {
>>> struct fw_device *device =
>>> @@ -1071,13 +1032,51 @@ static void fw_device_init(struct work_struct *work)
>>> return;
>>> }
>>>
>>> - revived_dev = device_find_child(card->device,
>>> - device, lookup_existing_device);
>>> + // If a device was pending for deletion because its node went away but its bus info block
>>> + // and root directory header matches that of a newly discovered device, revive the
>>> + // existing fw_device. The newly allocated fw_device becomes obsolete instead.
>>> + //
>>> + // serialize config_rom access.
>>> + scoped_guard(rwsem_read, &fw_device_rwsem) {
>>> + // TODO: The cast to 'void *' could be removed if Zijun Hu's work goes well.
>>
>> You may remove this TODO line, since I will simply remove the cast in the
>> other patch series shown below:
>> https://lore.kernel.org/all/20240811-const_dfc_done-v1-0-9d85e3f943cb@xxxxxxxxxxx/
>
> Of course, I won't apply this patch as is. It is just a mark to hold
> your attention.
>
>>> + revived_dev = device_find_child(card->device, (void *)device->config_rom,
>>> + compare_configuration_rom);
>>> + }
>>> if (revived_dev) {
>>> - put_device(revived_dev);
>>> - fw_device_release(&device->device);
>>> + struct fw_device *found = fw_device(revived_dev);
>>>
>>> - return;
>>> + // serialize node access
>>> + guard(spinlock_irq)(&card->lock);
>>> +
>>> + if (atomic_cmpxchg(&found->state,
>>> + FW_DEVICE_GONE,
>>> + FW_DEVICE_RUNNING) == FW_DEVICE_GONE) {
>>> + struct fw_node *current_node = device->node;
>>> + struct fw_node *obsolete_node = found->node;
>>> +
>>> + device->node = obsolete_node;
>>> + device->node->data = device;
>>> + found->node = current_node;
>>> + found->node->data = found;
>>> +
>>> + found->max_speed = device->max_speed;
>>> + found->node_id = current_node->node_id;
>>> + smp_wmb(); /* update node_id before generation */
>>> + found->generation = card->generation;
>>> + found->config_rom_retries = 0;
>>> + fw_notice(card, "rediscovered device %s\n", dev_name(revived_dev));
>>> +
>>> + found->workfn = fw_device_update;
>>> + fw_schedule_device_work(found, 0);
>>> +
>>> + if (current_node == card->root_node)
>>> + fw_schedule_bm_work(card, 0);
>>> +
>>> + put_device(revived_dev);
>>> + fw_device_release(&device->device);
>>> +
>>> + return;
>>> + }
>>
>> is it okay to put_device() here as well ?
>> put_device(revived_dev);
>
> Exactly. The call of put_device() should be done when the call of
> device_find_child() returns non-NULL value.
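
The reference rule stated above can be sketched in userspace. This is only a
model under stated assumptions: struct toy_obj, toy_find(), toy_put() and
toy_try_revive() are hypothetical names invented for illustration, not kernel
APIs. toy_find() stands in for device_find_child() handing the caller one
reference on success, and the point is that the reference is dropped on both
the "revived" and the "not revived" path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object; a stand-in for struct device. */
struct toy_obj {
	int refcount;
	int gone;	/* 1 models FW_DEVICE_GONE, 0 models any other state */
};

/* Models a successful lookup: the caller receives one extra reference. */
static struct toy_obj *toy_find(struct toy_obj *obj)
{
	if (!obj)
		return NULL;
	obj->refcount++;
	return obj;
}

/* Models put_device(): drop one reference. */
static void toy_put(struct toy_obj *obj)
{
	assert(obj->refcount > 0);
	obj->refcount--;
}

/* Returns 1 if the object was revived, 0 otherwise. */
static int toy_try_revive(struct toy_obj *candidate)
{
	struct toy_obj *found = toy_find(candidate);
	int revived = 0;

	if (!found)
		return 0;

	if (found->gone) {
		found->gone = 0;
		revived = 1;
	}

	/* Dropped whether or not the object was revived. */
	toy_put(found);
	return revived;
}

/* Helper for checking: reference count left after one revive attempt. */
static int toy_refcount_after_revive(int gone)
{
	struct toy_obj obj = { .refcount = 1, .gone = gone };

	toy_try_revive(&obj);
	return obj.refcount;
}
```

In this model the count returns to its starting value for both gone == 1 and
gone == 0, which is the property the patch as posted lacks on the path where
the cmpxchg does not succeed.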
>
> Additionally, I realize that calling fw_device_release() while
> holding card->lock causes a deadlock.
>
>>> }
>>>
>>> device_initialize(&device->device);
>
> Anyway, I'll post take 2 and work for its evaluation.
>
great
>
> Thanks
>
> Takashi Sakamoto
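
For reference, the reason !memcmp() is needed rather than !!memcmp() can be
shown with a small userspace sketch. It assumes only the usual match-callback
convention of device_find_child(): the first child for which the callback
returns non-zero is the result. Everything named toy_* below is hypothetical
and exists only for this illustration; it is not the firewire code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a child device carrying a cached config ROM. */
struct toy_dev {
	uint32_t config_rom[6];
};

typedef int (*toy_match_fn)(const struct toy_dev *dev, const void *data);

/* Returns the first child for which match() returns non-zero, else NULL. */
static const struct toy_dev *toy_find_child(const struct toy_dev *children,
					    size_t n, const void *data,
					    toy_match_fn match)
{
	assert(match != NULL);
	for (size_t i = 0; i < n; i++)
		if (match(&children[i], data))
			return &children[i];
	return NULL;
}

/* Correct: memcmp() returns 0 on equality, so negate it to report a match. */
static int toy_match_same_rom(const struct toy_dev *dev, const void *data)
{
	return !memcmp(dev->config_rom, data, 6 * 4);
}

/* Inverted: reports a match for every child whose ROM differs. */
static int toy_match_inverted(const struct toy_dev *dev, const void *data)
{
	return !!memcmp(dev->config_rom, data, 6 * 4);
}

/* Helper for checking: index of the child found, or -1 if none. */
static int toy_demo_index(int use_inverted)
{
	static const struct toy_dev children[2] = {
		{ { 1, 2, 3, 4, 5, 6 } },
		{ { 9, 8, 7, 6, 5, 4 } },
	};
	static const uint32_t wanted[6] = { 9, 8, 7, 6, 5, 4 };
	const struct toy_dev *found;

	found = toy_find_child(children, 2, wanted,
			       use_inverted ? toy_match_inverted
					    : toy_match_same_rom);
	return found ? (int)(found - children) : -1;
}
```

With the negated comparison the lookup returns the child whose ROM equals the
wanted one (index 1 here); with the double negation it returns the first child
whose ROM differs (index 0).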