Re: [PATCH] hwmon: Fix checkpatch errors in lm90 driver

From: Guenter Roeck
Date: Fri Aug 27 2010 - 12:49:56 EST


On Fri, Aug 27, 2010 at 11:24:03AM -0400, Jean Delvare wrote:
Hi Jean,

> Hi Guenter,
>
> On Fri, 27 Aug 2010 06:49:26 -0700, Guenter Roeck wrote:
> > Next question: lm90_update_device() currently does not return any errors.
> > In recent drivers, we pass i2c read errors up to userland. Before I introduce
> > the max6696 changes, does it make sense to add error checking/return
> > into the driver, similar to what I have done in the smm665 and jc42 drivers?
>
> So far, most hwmon driver authors decided to ignore such errors, or
> limited their handling to logging the issue, mainly because the caching
> mechanism makes handling of such errors tough. Now I admit that the
> approach you took in the jc42 driver is interesting. I never considered
> having a single error value being returned by the update function the
> way you did.
>
> This has the obvious drawback that transient I/O errors cause _all_
> sensor values to be unavailable, which is debatable, especially for a
> device with many features. It's hard to justify that all values of a
> full-featured hardware monitoring chip could be unavailable because,
> for example, one of the temperature sensors is unreliable. So this
> approach is fine for your small jc42 driver, but I don't think it can be
> generalized.
>
On the plus side, though, a transient failure only causes a single read
operation to fail, since I update neither the timestamp nor the valid flag
in the error case. As a result, the next read will again try to update
all values. So it isn't really that bad. The only real drawback of my
approach is that a transient read failure on one sensor register will
likely be reported while trying to read data for another sensor.

Of course, you are right that a permanent error on a single register will
cause all sensor read operations to fail, which isn't really desirable.
I have no idea if that can happen in the real world, though. It seems
unlikely that a failing sensor would cause an I2C operation failure.
But who knows - maybe it does happen with some chips.
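
To make the behaviour concrete, here is a minimal userspace sketch of the
pattern, not the actual jc42 code. All names (sketch_data, sketch_read_reg,
sketch_fail_reg, sketch_now, CACHE_TICKS) are made up for illustration, and
the bus is simulated instead of using real I2C calls.

```c
#include <errno.h>
#include <stdbool.h>

#define NUM_REGS    3
#define CACHE_TICKS 2	/* hypothetical cache lifetime, in abstract ticks */

/* Hypothetical stand-in for a hwmon driver's per-device data. */
struct sketch_data {
	bool valid;			/* set once a full update succeeded */
	unsigned long last_updated;	/* tick of the last full update */
	int regs[NUM_REGS];		/* cached register values */
};

static int sketch_fail_reg = -1;	/* simulated bad register, -1 = none */
static unsigned long sketch_now;	/* stand-in for jiffies */

/* Simulated register read: negative errno on failure, value otherwise. */
int sketch_read_reg(int reg)
{
	if (reg == sketch_fail_reg)
		return -EIO;
	return 40 + reg;
}

/*
 * Refresh the cache if it is stale. On any read error, return the error
 * and leave the timestamp and the valid flag alone, so that the next
 * caller tries the whole update again.
 */
int sketch_update_device(struct sketch_data *data)
{
	int i, val;

	if (data->valid && sketch_now - data->last_updated < CACHE_TICKS)
		return 0;

	for (i = 0; i < NUM_REGS; i++) {
		val = sketch_read_reg(i);
		if (val < 0)
			return val;
		data->regs[i] = val;
	}
	data->last_updated = sketch_now;
	data->valid = true;
	return 0;
}

/* Stand-in for a sysfs show function: one error fails every attribute. */
int sketch_show_temp(struct sketch_data *data, int reg, int *out)
{
	int err = sketch_update_device(data);

	if (err)
		return err;
	*out = data->regs[reg];
	return 0;
}
```

With sketch_fail_reg set to 1, a read of register 0 also returns -EIO, which
is the drawback discussed above. Once the failure goes away, the very next
read succeeds, because the cache was never marked valid.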

> In the general case, I think I am fine with pretty much anything which
> doesn't plain ignore error codes (as many drivers still do...) and
> doesn't block all readings on transient errors. This can mean returning
> 0 on error, or returning the previous last known value (definitely
> acceptable for transient errors, but not so for long-standing ones),

The basic reason for returning errors in the first place was that I was
asked to do so in review feedback for one of my drivers - specifically,
that I should not drop errors. So we would need some clear(er) guidelines
for new drivers if we want to go down that path.

> with or without logging. Or if you really want to pass error codes down
> to user-space, I think you have to rework the update() function and the
> per-device data structure altogether, to be able to store error codes
> in the data structure.
>
That seems a bit excessive, and it doesn't seem to be worth the effort
and added complexity.
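
For reference, storing error codes in the data structure might look roughly
like the userspace sketch below. This is only an illustration of the idea
under my own assumptions, not code from any existing driver; the names
(percache, percache_bus_read, percache_bad_reg) are hypothetical and the
bus is simulated.

```c
#include <errno.h>

#define PERCACHE_REGS 3

/* Hypothetical per-device cache with one error slot per register. */
struct percache {
	int value[PERCACHE_REGS];	/* last successfully read value */
	int err[PERCACHE_REGS];		/* 0, or negative errno of last read */
};

static int percache_bad_reg = -1;	/* simulated bad register, -1 = none */

/* Simulated register read: negative errno on failure, value otherwise. */
int percache_bus_read(int reg)
{
	return reg == percache_bad_reg ? -EIO : 40 + reg;
}

/* Read every register and record the outcome of each read separately. */
void percache_update_all(struct percache *c)
{
	int i, val;

	for (i = 0; i < PERCACHE_REGS; i++) {
		val = percache_bus_read(i);
		if (val < 0) {
			c->err[i] = val;	/* keep the old value */
		} else {
			c->value[i] = val;
			c->err[i] = 0;
		}
	}
}

/* Only the attribute backed by the failing register reports an error. */
int percache_show(const struct percache *c, int reg, int *out)
{
	if (c->err[reg])
		return c->err[reg];
	*out = c->value[reg];
	return 0;
}
```

This keeps the other sensors readable when one register fails, at the cost
of an extra array and an extra check in every show function.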

> A different (and complementary) approach is to repeat the failing
> command and see if it helps. The w83l785ts driver does exactly this. If
> we want to generalize this, it would probably make sense to implement
> it at the i2c-core level (i.e. add a "retries" i2c_client
> attribute.)
>
That still doesn't solve the permanent error case, though. The question
remains whether it is likely that a single i2c register would return a
permanent error while others still work.
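
As a rough illustration of the retry idea, here is a userspace sketch with a
simulated flaky bus. It is not the w83l785ts code, and the names
(flaky_read, read_with_retries, flaky_failures_left, flaky_attempts) are
made up.

```c
#include <errno.h>

static int flaky_failures_left;	/* simulated: reads that fail before success */
static int flaky_attempts;	/* number of bus reads issued so far */

/* Simulated register read that fails a configurable number of times. */
int flaky_read(int reg)
{
	flaky_attempts++;
	if (flaky_failures_left > 0) {
		flaky_failures_left--;
		return -EIO;
	}
	return 40 + reg;
}

/*
 * Read a register, repeating the command up to 'retries' extra times
 * on error. Returns the value, or the last negative errno.
 */
int read_with_retries(int reg, int retries)
{
	int val;

	do {
		val = flaky_read(reg);
	} while (val < 0 && retries-- > 0);

	return val;
}
```

Two transient failures are hidden by read_with_retries(reg, 2), but a
register that keeps failing still returns -EIO after three attempts, so the
permanent error case is unchanged.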

> I admit I have mostly been ignoring the issue so far, because it's not
> a big problem in practice (except on one board with the w83l785ts
> driver, thus the extra code in that driver), so adding complex or
> invasive code to deal with it isn't too appealing.
>
I'll take that as a hint and won't make any changes to the lm90 driver's
error handling.

Guenter