Re: [PATCH v8 2/4] drivers: hwmon: sophgo: Add SG2042 external hardware monitor support

From: Inochi Amaoto
Date: Wed Jul 31 2024 - 03:18:44 EST


On Wed, Jul 31, 2024 at 02:13:20PM GMT, Chen Wang wrote:
>
> On 2024/7/30 15:50, Inochi Amaoto wrote:
> [......]
> > +#define REG_CRITICAL_ACTIONS 0x65
> The name "REG_CRITICAL_ACTIONS" is ambiguous. I have confirmed the complete
> process with sophgo engineers: when the measured temperature exceeds the
> temperature set by REG_CRITICAL_TEMP, the processor is powered off and shut
> down. Once the temperature returns to the temperature set by
> REG_REPOWER_TEMP, the action set by REG_CRITICAL_ACTIONS (reboot or
> poweroff) decides whether the processor is powered on again or remains in
> the shutdown state.
>
> So based on the above description, I think it would be better to rename
> "REG_CRITICAL_ACTIONS" to "REG_REPOWER_ACTIONS". "REG_CRITICAL_ACTIONS"
> gives the first impression that it is used to set actions related to
> REG_CRITICAL_TEMP.
>
> It is also recommended to add the above description of temperature control
> and action settings to the code. Currently, sophgo has no clear
> documentation for this part, and adding it will help us understand
> how it works.
>
> Adding sophgo engineers Chunzhi and Haijiao, FYI.
>
> > +#define REG_CRITICAL_TEMP 0x66
> > +#define REG_REPOWER_TEMP 0x67
> > +
> > +#define CRITICAL_ACTION_REBOOT 1
> > +#define CRITICAL_ACTION_POWEROFF 2
>
> As I said above, the actions are not related to the critical temperature
> itself but to recovering from it, so I suggest giving them a better name.
>
> [......]
>
> > +static ssize_t critical_action_show(struct device *dev,
> [......]
> > +static ssize_t critical_action_store(struct device *dev,
>
> [......]
>
> For the same reason as above, "critical_action_xxx" is misleading.
>
> [......]
>

Thanks for the explanation. I just took the name from the driver of SG2042;
the actual behavior was beyond my knowledge.
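
To check that I read the flow correctly, here is a small userspace model
of it. This is only a sketch based on the description above, not MCU
firmware or driver code: struct mcu_model and mcu_model_step() are
hypothetical names, and treating "returns to REG_REPOWER_TEMP" as
"temperature <= repower_temp" is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Action values as defined in the patch under review. */
#define CRITICAL_ACTION_REBOOT   1
#define CRITICAL_ACTION_POWEROFF 2

/* Hypothetical model of the MCU settings; not a real driver structure. */
struct mcu_model {
	int critical_temp;  /* REG_CRITICAL_TEMP, degrees Celsius */
	int repower_temp;   /* REG_REPOWER_TEMP, degrees Celsius */
	int repower_action; /* REG_CRITICAL_ACTIONS */
	bool powered;       /* current power state of the processor */
	bool tripped;       /* set after an over-temperature shutdown */
};

/* Feed one temperature sample into the model and update its state. */
static void mcu_model_step(struct mcu_model *m, int temp)
{
	assert(m != NULL);

	if (m->powered) {
		/* Exceeding the critical temperature powers the processor off. */
		if (temp > m->critical_temp) {
			m->powered = false;
			m->tripped = true;
		}
		return;
	}

	/*
	 * After a thermal shutdown, the action is consulted only once the
	 * temperature is back at the repower threshold (assumed: <=).
	 * REBOOT powers on again, POWEROFF keeps the processor off.
	 */
	if (m->tripped && temp <= m->repower_temp &&
	    m->repower_action == CRITICAL_ACTION_REBOOT) {
		m->powered = true;
		m->tripped = false;
	}
}
```

With critical_temp = 100 and repower_temp = 80, a sample of 101 powers
the model off, and a later sample of 80 powers it back on only when the
action is CRITICAL_ACTION_REBOOT. In this model the action matters only
on the way back, which matches your point about the name.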

> > +static int sg2042_mcu_read_temp(struct device *dev,
> > + u32 attr, int channel,
> > + long *val)
> > +{
> > + struct sg2042_mcu_data *mcu = dev_get_drvdata(dev);
> > + int tmp;
> > + u8 reg;
> > +
> > + switch (attr) {
> > + case hwmon_temp_input:
> > + reg = channel ? REG_BOARD_TEMP : REG_SOC_TEMP;
> > + break;
> > + case hwmon_temp_crit:
> > + reg = REG_CRITICAL_TEMP;
> > + break;
> > + case hwmon_temp_crit_hyst:
> > + reg = REG_REPOWER_TEMP;
> > + break;
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +
> > + tmp = i2c_smbus_read_byte_data(mcu->client, reg);
> > + if (tmp < 0)
> > + return tmp;
> > + *val = tmp * 1000;
> > +
> > + return 0;
> > +}
> > +
> > +static int sg2042_mcu_read(struct device *dev,
> > + enum hwmon_sensor_types type,
> > + u32 attr, int channel, long *val)
> > +{
> > + return sg2042_mcu_read_temp(dev, attr, channel, val);
> > +}
> Can we merge sg2042_mcu_read and sg2042_mcu_read_temp?

Yes, they can be merged, but I think keeping the nested function is
clearer. gcc can also inline this function automatically, so there is
no performance penalty.
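
For comparison, the merged form would look roughly like the userspace
sketch below. Everything prefixed with sketch_ or SKETCH_ is
hypothetical: the register table is a mock with made-up values that
stands in for i2c_smbus_read_byte_data(), and the enums only mirror the
hwmon attributes used in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the hwmon temperature attributes. */
enum sketch_attr {
	SKETCH_TEMP_INPUT,
	SKETCH_TEMP_CRIT,
	SKETCH_TEMP_CRIT_HYST,
	SKETCH_ATTR_OTHER,
};

/* Hypothetical register indices into the mock table below. */
enum sketch_reg {
	SKETCH_REG_SOC_TEMP,
	SKETCH_REG_BOARD_TEMP,
	SKETCH_REG_CRITICAL_TEMP,
	SKETCH_REG_REPOWER_TEMP,
	SKETCH_REG_COUNT,
};

/* Mock register contents in degrees Celsius (made-up values). */
static int sketch_regs[SKETCH_REG_COUNT] = { 45, 38, 100, 80 };

/* Mock register read; a negative return would signal a bus error. */
static int sketch_read_reg(enum sketch_reg reg)
{
	return sketch_regs[reg];
}

/* Merged read: pick the register, read it, convert to millidegrees. */
static int sketch_mcu_read(enum sketch_attr attr, int channel, long *val)
{
	enum sketch_reg reg;
	int tmp;

	assert(val != NULL);

	switch (attr) {
	case SKETCH_TEMP_INPUT:
		reg = channel ? SKETCH_REG_BOARD_TEMP : SKETCH_REG_SOC_TEMP;
		break;
	case SKETCH_TEMP_CRIT:
		reg = SKETCH_REG_CRITICAL_TEMP;
		break;
	case SKETCH_TEMP_CRIT_HYST:
		reg = SKETCH_REG_REPOWER_TEMP;
		break;
	default:
		return -EOPNOTSUPP;
	}

	tmp = sketch_read_reg(reg);
	if (tmp < 0)
		return tmp;
	*val = tmp * 1000L; /* hwmon reports temperatures in millidegrees */

	return 0;
}
```

The body is the same as sg2042_mcu_read_temp(); the only thing lost is
the separate per-type entry point.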

> > +
> > +static int sg2042_mcu_write(struct device *dev,
> > + enum hwmon_sensor_types type,
> > + u32 attr, int channel, long val)
> > +{
> > + struct sg2042_mcu_data *mcu = dev_get_drvdata(dev);
> > + int temp = val / 1000;
> > + int hyst_temp, crit_temp;
> > + int ret;
> > + u8 reg;
> > +
> > + if (temp > MCU_POWER_MAX)
> > + temp = MCU_POWER_MAX;
> > +
> > + mutex_lock(&mcu->mutex);
> > +
> > + switch (attr) {
> > + case hwmon_temp_crit:
> > + hyst_temp = i2c_smbus_read_byte_data(mcu->client,
> > + REG_REPOWER_TEMP);
> > + if (hyst_temp < 0) {
> > + ret = -ENODEV;
> > + goto failed;
> > + }
> > +
> > + crit_temp = temp;
> > + reg = REG_CRITICAL_TEMP;
> > + break;
> > + case hwmon_temp_crit_hyst:
> > + crit_temp = i2c_smbus_read_byte_data(mcu->client,
> > + REG_CRITICAL_TEMP);
> > + if (crit_temp < 0) {
> > + ret = -ENODEV;
> > + goto failed;
> > + }
> > +
> > + hyst_temp = temp;
> > + reg = REG_REPOWER_TEMP;
> > + break;
> > + default:
> > + mutex_unlock(&mcu->mutex);
> > + return -EOPNOTSUPP;
> > + }
> > +
> It is recommended to add a comment explaining why we need to ensure that
> crit_temp is greater than or equal to hyst_temp. The check is needed only
> because the current MCU does not limit its input, which may let the user
> set an incorrect combination of crit_temp and hyst_temp.

Yeah, this is a good idea.
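
Something like the following is what I have in mind for the comment. It
is a standalone userspace sketch rather than the next version of the
patch: enum limit_kind and check_limits() are hypothetical names used
only to show the check together with its explanation.

```c
#include <assert.h>
#include <errno.h>

/* Which limit the caller is updating; hypothetical enum for this sketch. */
enum limit_kind { LIMIT_CRIT, LIMIT_CRIT_HYST };

/*
 * Validate a new limit against the one currently stored in the MCU.
 * Returns 0 if the resulting pair is acceptable, -EINVAL otherwise.
 */
static int check_limits(enum limit_kind kind, int new_temp,
			int cur_crit, int cur_hyst)
{
	int crit, hyst;

	assert(kind == LIMIT_CRIT || kind == LIMIT_CRIT_HYST);

	crit = (kind == LIMIT_CRIT) ? new_temp : cur_crit;
	hyst = (kind == LIMIT_CRIT_HYST) ? new_temp : cur_hyst;

	/*
	 * The MCU does not limit the values written to it, so the caller
	 * must make sure the critical temperature never ends up below the
	 * repower (hysteresis) temperature.
	 */
	if (crit < hyst)
		return -EINVAL;

	return 0;
}
```

For example, with the stored pair crit = 100 and hyst = 80, lowering
crit to 70 or raising hyst to 110 is rejected with -EINVAL, while
setting hyst equal to crit is still accepted.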

> > + if (crit_temp < hyst_temp) {
> > + ret = -EINVAL;
> > + goto failed;
> > + }
> > +
> > + ret = i2c_smbus_write_byte_data(mcu->client, reg, temp);
> > +
> > +failed:
> > + mutex_unlock(&mcu->mutex);
> > + return ret;
> > +}
> > +
> [......]