Re: [PATCH 00/31] Remove use of i2c_match_id in HWMON

From: Guenter Roeck
Date: Thu Jun 06 2024 - 13:17:51 EST


On 4/18/24 06:42, Guenter Roeck wrote:
On Tue, Apr 16, 2024 at 12:08:50PM -0500, Andrew Davis wrote:
On 4/16/24 9:16 AM, Guenter Roeck wrote:
On Mon, Apr 08, 2024 at 04:49:43AM -0700, Guenter Roeck wrote:
On Wed, Apr 03, 2024 at 05:06:43PM -0500, Andrew Davis wrote:
On 4/3/24 4:30 PM, Guenter Roeck wrote:
On Wed, Apr 03, 2024 at 03:36:02PM -0500, Andrew Davis wrote:
Hello all,

Goal here is to remove the i2c_match_id() function from all drivers.
Using i2c_get_match_data() can simplify code and has some other
benefits described in the patches.


The return value from i2c_match_id() is typically an integer (chip ID)
starting with 0. Previously it has been claimed that this would be
unacceptable for i2c_get_match_data(), and chip IDs were changed to start
with 1. Commit ac0c26bae662 ("hwmon: (lm25066) Use i2c_get_match_data()")
is an example. Either this series is wrong, or the previous claim that
chip IDs (i.e., the content of .driver_data or .data) must not be 0 was
wrong. Which one is it ? I find it very confusing that the chip type for
some drivers now starts with 1 and for others with 0. Given that, I am not
inclined to accept this series unless it is explained in detail why the
chip type enum in, for example, drivers/hwmon/pmbus/lm25066.c has to start
with one but is ok to start with 0 for all drivers affected by this
series. Quite frankly, even if there is some kind of explanation, I am not
sure if I am going to accept it because future driver developers won't
know if they have to start chip types with 0 or 1.


i2c_get_match_data() has no issue with returning 0 when the driver_data
for the match is also 0 (as it will be when the chip type is 0 here).

The confusion might be that returning 0 is also considered a failure code.
This is a problem in general with returning errors in-band with data, and
that is nothing new as i2c_match_id() does the same thing.

Actually, i2c_match_id() is worse as most of these drivers take the result
from that and immediately dereference it. Meaning if i2c_match_id() ever did
failed to find a match, they would crash before this series. Luckily i2c_match_id()
can't fail to find a match as far as I can tell, and so for the same reason
neither can i2c_get_match_data(), which means if 0 is returned it is always
because the chip ID was actually 0.

At some point we should switch all the *_get_match_data() functions to
return an error code and put the match if found as a argument pointer.
Forcing everyone to changing the chip type to avoid 0 as done in
ac0c26bae662 is the wrong way to fix an issue like that.


That doesn't really answer my question. It does not explain why it was
necessary to change the chip ID base for other drivers from 0 to 1,
but not for the drivers in this series. I fail to see the difference,
and I have to assume that others looking into the code will have the
same problem.


Just to follow up: I am not going to apply this series until I understand
why the chip ID range had to be changed from 0.. to 1.. for other hardware
monitoring drivers (lm25066, nct6775) but not for the drivers changed
in this series. I have been telling people that chip IDs need to start
with 1 if i2c_get_match_data() is used. I'll need understand when and
why this is needed to be able to provide guidance to other developers.


I was hoping one of those patch authors that made those 0->1 changes
would speak up (+Rob), I can't know what their thinking was, only
offer my best guess as I did above.


I can see three possibilities.

- Chip IDs must start with 1 if i2c_get_match_data() is used, as I was told
previously. If so, this series is wrong.
- It is ok for chip IDs to start with 0. If so, what I have been told
previously is wrong, and the patches changing chip IDs to start with 1
can and should be partially reverted to avoid confusion.
- Chip IDs must sometimes, but not always, start with 1. If so, the
conditions will have to be documented to help driver developers decide
the valid starting value and to be able to determine if all the patches
in this series follow the rules.

Someone will have to step up and explain to me which one it is.


Unfortunately, that step-in never happened. Instead, I keep getting patches
similar to this series. So, I applied the series, and I'll submit
patches to update all drivers to have the chip ID enum start with 0,
and/or to remove the complexity introduced under the assumption that it must
not be 0 / NULL.

Guenter