Re: [PATCH] hwmon: pmbus: Make reg check and clear faults functions return errors

From: Guenter Roeck
Date: Thu Sep 07 2017 - 22:17:19 EST


On 09/07/2017 07:06 PM, Andrew Jeffery wrote:
On Thu, 2017-09-07 at 18:26 -0700, Guenter Roeck wrote:
On 09/07/2017 06:02 PM, Andrew Jeffery wrote:
On Thu, 2017-09-07 at 17:27 -0700, Guenter Roeck wrote:
On 09/07/2017 08:22 AM, Andrew Jeffery wrote:
On Thu, 2017-09-07 at 06:40 -0700, Guenter Roeck wrote:
On 09/06/2017 04:32 PM, Andrew Jeffery wrote:

Guess I need to dig up my eval board and see if I can reproduce the problem.
Seems you are saying that the problem is always seen when issuing a sequence
of "clear faults" commands on multiple pages?

Yeah. We're also seeing bad behaviour under other command sequences,
which led to this hack of a work-around patch[1].

I'd be very interested in the results of testing against the eval board. I
don't have access to one and it seems Maxim have discontinued them.


Do you have a somewhat reliable means to reproduce the problem?

It seems we hit a bunch of problems by just continually
binding/unbinding the driver, if you don't apply that hacky oneshot
retry patch. We can hit problems (in our design?) with something like:

# cd /sys/bus/i2c/drivers/max31785; \
echo $addr > unbind; \
while echo $addr > bind; \
do echo $addr > unbind; echo -n .; done;

It should hit the issues covered by this patch, as probe relies on
operations that use the register checks.
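
For comparing runs across setups, the loop above can be bounded and made to
report how many bind/unbind cycles completed before the first failed bind.
A rough sketch, assuming a POSIX shell; the rebind_loop name and its three
arguments are made up for illustration and are not part of the driver or
the patch:

```sh
#!/bin/sh
# Hypothetical helper (not part of the driver or the patch):
#   rebind_loop DRIVER_DIR ADDR MAX
# Repeatedly binds and unbinds ADDR through the bind/unbind files in
# DRIVER_DIR, up to MAX times. Prints the number of completed cycles,
# or returns 1 as soon as a bind fails.
rebind_loop() {
    dir=$1
    addr=$2
    max=$3
    count=0

    # Start from an unbound state; ignore the error if already unbound.
    echo "$addr" > "$dir/unbind" || true

    while [ "$count" -lt "$max" ]; do
        if ! echo "$addr" > "$dir/bind"; then
            echo "bind failed after $count successful iterations" >&2
            return 1
        fi
        # As in the original loop, an unbind failure is not treated as fatal.
        echo "$addr" > "$dir/unbind"
        count=$((count + 1))
    done

    echo "$count"
}
```

Something like "rebind_loop /sys/bus/i2c/drivers/max31785 $addr 2000" would
then print 2000 if no bind failed along the way.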


Hmm ... I didn't use your driver but my prototype driver which also supports
temperature and voltage attributes, so if anything it should create more
stress on the chip.

I did add the temp and voltage attributes...

Any chance you can give mine a try? I don't know what I would have done
to invoke this kind of behaviour, so it would be useful to know whether
or not it happens with one driver but not the other.


Will do.

Thanks. For reference, here's a devicetree description:

https://github.com/openbmc/linux/blob/dev-4.10/arch/arm/boot/dts/aspeed-bmc-opp-witherspoon.dts#L283


I can't test with devicetree. x86 system.

2,100+ iterations with your driver, no failures.

Either it is because my chip is a MAX31785 (not the MAX31785A), or the
configuration makes a difference, or it is your hardware.

I'll try to connect a couple of fans next (so far I have tested without any) and try again.

Guenter