Re: [PATCH v1 2/4] hwmon: (sfctemp) Add StarFive JH71x0 temperature sensor
From: Hal Feng
Date: Thu Feb 09 2023 - 01:21:03 EST
On Wed, 8 Feb 2023 07:16:52 -0800, Guenter Roeck wrote:
> On 2/8/23 04:40, Hal Feng wrote:
>> On Mon, 6 Feb 2023 11:21:38 -0800, Guenter Roeck wrote:
>>> On 2/6/23 09:12, Hal Feng wrote:
>>>> On Tue, 3 Jan 2023 14:10:17 -0800, Guenter Roeck wrote:
>>>>> On Tue, Jan 03, 2023 at 09:31:43AM +0800, Hal Feng wrote:
>> [...]
>>>>>> diff --git a/drivers/hwmon/sfctemp.c b/drivers/hwmon/sfctemp.c
>>>>>> new file mode 100644
>>>>>> index 000000000000..e56716ad9587
>>>>>> --- /dev/null
>>>>>> +++ b/drivers/hwmon/sfctemp.c
>>>>>> @@ -0,0 +1,350 @@
>>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>>> +/*
>>>>>> + * Copyright (C) 2021 Emil Renner Berthing <kernel@xxxxxxxx>
>>>>>> + * Copyright (C) 2021 Samin Guo <samin.guo@xxxxxxxxxxxxxxxx>
>>>>>> + */
>>>>>> +#include <linux/clk.h>
>>>>>> +#include <linux/completion.h>
>>>>>> +#include <linux/delay.h>
>>>>>> +#include <linux/hwmon.h>
>>>>>> +#include <linux/interrupt.h>
>>>>>> +#include <linux/io.h>
>>>>>> +#include <linux/module.h>
>>>>>> +#include <linux/mutex.h>
>>>>>> +#include <linux/of.h>
>>>>>> +#include <linux/platform_device.h>
>>>>>> +#include <linux/reset.h>
>>>>>> +
>>>>>> +/*
>>>>>> + * TempSensor reset. The RSTN can be de-asserted once the analog core has
>>>>>> + * powered up. Trst(min 100ns)
>>>>>> + * 0:reset 1:de-assert
>>>>>> + */
>>>>>> +#define SFCTEMP_RSTN BIT(0)
>>>>>
>>>>> Missing include of linux/bits.h
>>>>
>>>> Will add it. Thanks.
>>>>
>>>>>
>>>>>> +
>>>>>> +/*
>>>>>> + * TempSensor analog core power down. The analog core will be powered up
>>>>>> + * Tpu(min 50us) after PD is de-asserted. RSTN should be held low until the
>>>>>> + * analog core is powered up.
>>>>>> + * 0:power up 1:power down
>>>>>> + */
>>>>>> +#define SFCTEMP_PD BIT(1)
>>>>>> +
>>>>>> +/*
>>>>>> + * TempSensor start conversion enable.
>>>>>> + * 0:disable 1:enable
>>>>>> + */
>>>>>> +#define SFCTEMP_RUN BIT(2)
>>>>>> +
>>>>>> +/*
>>>>>> + * TempSensor conversion value output.
>>>>>> + * Temp(C)=DOUT*Y/4094 - K
>>>>>> + */
>>>>>> +#define SFCTEMP_DOUT_POS 16
>>>>>> +#define SFCTEMP_DOUT_MSK GENMASK(27, 16)
>>>>>> +
>>>>>> +/* DOUT to Celcius conversion constants */
>>>>>> +#define SFCTEMP_Y1000 237500L
>>>>>> +#define SFCTEMP_Z 4094L
>>>>>> +#define SFCTEMP_K1000 81100L
>>>>>> +
>>>>>> +struct sfctemp {
>>>>>> + /* serialize access to hardware register and enabled below */
>>>>>> + struct mutex lock;
>>>>>> + struct completion conversion_done;
>>>>>> + void __iomem *regs;
>>>>>> + struct clk *clk_sense;
>>>>>> + struct clk *clk_bus;
>>>>>> + struct reset_control *rst_sense;
>>>>>> + struct reset_control *rst_bus;
>>>>>> + bool enabled;
>>>>>> +};
>>>>>> +
>>>>>> +static irqreturn_t sfctemp_isr(int irq, void *data)
>>>>>> +{
>>>>>> + struct sfctemp *sfctemp = data;
>>>>>> +
>>>>>> + complete(&sfctemp->conversion_done);
>>>>>> + return IRQ_HANDLED;
>>>>>> +}
>>>>>> +
>>>>>> +static void sfctemp_power_up(struct sfctemp *sfctemp)
>>>>>> +{
>>>>>> + /* make sure we're powered down first */
>>>>>> + writel(SFCTEMP_PD, sfctemp->regs);
>>>>>> + udelay(1);
>>>>>> +
>>>>>> + writel(0, sfctemp->regs);
>>>>>> + /* wait t_pu(50us) + t_rst(100ns) */
>>>>>> + usleep_range(60, 200);
>>>>>> +
>>>>>> + /* de-assert reset */
>>>>>> + writel(SFCTEMP_RSTN, sfctemp->regs);
>>>>>> + udelay(1); /* wait t_su(500ps) */
>>>>>> +}
>>>>>> +
>>>>>> +static void sfctemp_power_down(struct sfctemp *sfctemp)
>>>>>> +{
>>>>>> + writel(SFCTEMP_PD, sfctemp->regs);
>>>>>> +}
>>>>>> +
>>>>>> +static void sfctemp_run_single(struct sfctemp *sfctemp)
>>>>>> +{
>>>>>> + writel(SFCTEMP_RSTN | SFCTEMP_RUN, sfctemp->regs);
>>>>>> + udelay(1);
>>>>>> + writel(SFCTEMP_RSTN, sfctemp->regs);
>>>>>
>>>>> The datasheet (or, rather, programming manual) does not appear
>>>>> to be public, so I have to guess here.
>>>>>
>>>>> The code suggests that running a single conversion may be a choice,
>>>>> not a requirement. If it is indeed a choice, the reasoning needs to be
>>>>> explained since it adds a lot of complexity and dependencies to the
>>>>> driver (for example, interrupt support is only mandatory or even needed
>>>>> due to this choice). It also adds a significant delay to temperature
>>>>> read operations, which may have practical impact on thermal control
>>>>> software.
>>>>>
>>>>> If the chip only supports single temperature readings, that needs to be
>>>>> explained as well (and why SFCTEMP_RUN has to be reset in that case).
>>>>
>>>> The chip supports continuous conversion. When you set SFCTEMP_RUN, the
>>>> temperature raw data will be generated all the time. However, it will
>>>> also generate interrupts all the time when the conversion is finished,
>>>> because of the hardware limitation. So in this driver, we just support
>>>> the single conversion.
>>>>
>>>
>>> Sorry, I don't follow the logic. The interrupt is, for all practical
>>> purposes, useless because there are no limits and exceeding any such
>>> limits is therefore not supported. The only reason to have and enable
>>> to interrupt is because continuous mode is disabled.
>>>
>>> The code could be simplified a lot if interrupt support would be
>>> dropped and continuous mode would be enabled.
>>
>> If we enable continuous mode, which means SFCTEMP_RUN remains asserted,
>> the conversion finished interrupt will be raised after each sample
>> time (8.192 ms). Within a few minutes, a lot of interrupts are raised,
>> as showed below.
>>
>> # cat /proc/interrupts
>> CPU0 CPU1 CPU2 CPU3
>> 1: 0 0 0 0 SiFive PLIC 1 Edge ccache_ecc
>> 2: 1 0 0 0 SiFive PLIC 3 Edge ccache_ecc
>> 3: 1 0 0 0 SiFive PLIC 4 Edge ccache_ecc
>> 4: 0 0 0 0 SiFive PLIC 2 Edge ccache_ecc
>> 5: 1116 1670 411 1466 RISC-V INTC 5 Edge riscv-timer
>> 6: 32093 0 0 0 SiFive PLIC 81 Edge 120e0000.temperature-sensor
>> 10: 1233 0 0 0 SiFive PLIC 32 Edge ttyS0
>> IPI0: 117 62 123 117 Rescheduling interrupts
>> IPI1: 278 353 105 273 Function call interrupts
>> IPI2: 0 0 0 0 CPU stop interrupts
>> IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts
>> IPI4: 0 0 0 0 IRQ work interrupts
>> IPI5: 0 0 0 0 Timer broadcast interrupts
>>
>> If we enable continuous mode and drop the interrupt support in the
>> driver, the kernel will not know the interrupts but a lot of interrupts
>> are still raised in hardware. Can we do such like that?
>
> Why not ? It just stays raised. That happens a lot.
OK, I see. Thanks.
>
>> Without the interrupt support, the temperature we read may be the value
>> generated in the last cycle.
>
> That would be highly unusual and should be documented.
Because without the interrupt, there is no flag pointing out when the
conversion is finished, and the temperature we read is the data generated
before. With further consideration, the error is no more than one sample
time (8.192 ms) in continuous mode while the error is no less than one sample
time in single mode. In this respect, continuous mode is better than single
mode.
>
>
>>
>> I think the temperature has its value only when we read it, so we start
>
> "may be" ? "I think" ? That means you don't know ? Maybe test it, or ask
> the chip designers.
Sorry for my inexact statement. I means the temperature data is useful to us
only when we read it. If we don't read it, the temperature data will just be
discarded every sample time.
>
>> conversion only when we read the temperature. Further more, it will
>> consume more power if we enable continuous mode.
>>
>
> Usually that is not a concern, much less so than delaying each reader.
After some test, it's found that there is almost no difference between these
two modes in power consumption.
>
> Ultimately, sure, you can do whatever you want. I'll still accept the driver.
> I do expect you to explain your reasons (all of them) in the driver, though.
>
> If you don't _know_ if the temperature is updated in continuous mode,
> please state exactly that in the comments. Also explain how much power
> is saved by not running in continuous mode. I don't want anyone to come
> back later on and change the code because they don't know the reasons
> why it doesn't use continuous mode.
It's OK to run the continuous mode instead and remove the "- interrupts" entry
in "required:" of DT bindings. I will modify in the next version. Thank you
for your suggestions.
Best regards,
Hal