Re: [PATCH v8 2/4] drivers: hwmon: sophgo: Add SG2042 external hardware monitor support

From: Chen Wang
Date: Wed Jul 31 2024 - 02:15:10 EST



On 2024/7/30 15:50, Inochi Amaoto wrote:
[......]
+#define REG_CRITICAL_ACTIONS 0x65
The name "REG_CRITICAL_ACTIONS" is ambiguous. I have confirmed with sophgo engineers that the complete process is as follows: when the measured temperature exceeds the value set in REG_CRITICAL_TEMP, the processor is powered off. Once the temperature has returned to the value set in REG_REPOWER_TEMP, the action set in REG_CRITICAL_ACTIONS (reboot or poweroff) decides whether the processor is powered on again or stays shut down.

So based on the above description, I think it would be better to rename "REG_CRITICAL_ACTIONS" to "REG_REPOWER_ACTIONS". The name "REG_CRITICAL_ACTIONS" gives the first impression that it sets actions related to REG_CRITICAL_TEMP.

It is also recommended to add the above description of the temperature control and action settings as a comment in the code. Currently, sophgo has no clear documentation for this part, so adding it will help readers understand how it works.
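
To make the sequence concrete, here is a rough user-space model of it, not driver code. It assumes the MCU compares with ">" when heating up and "<=" when cooling down, which has not been confirmed. REPOWER_ACTION_* and struct mcu_model are hypothetical names; only the values 1 and 2 come from the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical names following the rename suggested in this review.
 * The values are those of CRITICAL_ACTION_* in the patch.
 */
#define REPOWER_ACTION_REBOOT	1
#define REPOWER_ACTION_POWEROFF	2

/* Simplified model of the MCU thermal policy (assumption, not firmware). */
struct mcu_model {
	int critical_temp;	/* REG_CRITICAL_TEMP, degrees Celsius */
	int repower_temp;	/* REG_REPOWER_TEMP, degrees Celsius */
	int repower_action;	/* REG_CRITICAL_ACTIONS in the patch */
	bool powered;		/* is the SoC currently powered? */
};

/* Feed one temperature sample into the model; returns the new power state. */
static bool mcu_model_step(struct mcu_model *m, int temp)
{
	if (m->powered) {
		/* Over-temperature: the SoC is powered off unconditionally. */
		if (temp > m->critical_temp)
			m->powered = false;
	} else if (temp <= m->repower_temp) {
		/* Cooled down: the action register decides what happens next. */
		if (m->repower_action == REPOWER_ACTION_REBOOT)
			m->powered = true;
	}

	return m->powered;
}
```

With the action set to reboot, the model powers back on once the sample drops to repower_temp; with poweroff, it stays off. That is the reason the action reads as a "repower" setting rather than a "critical" one.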

Adding sophgo engineers Chunzhi and Haijiao, FYI.

+#define REG_CRITICAL_TEMP 0x66
+#define REG_REPOWER_TEMP 0x67
+
+#define CRITICAL_ACTION_REBOOT 1
+#define CRITICAL_ACTION_POWEROFF 2

As I said above, these actions are not tied to the critical event itself but to recovering from it, so I suggest giving them better names.

[......]

+static ssize_t critical_action_show(struct device *dev,
[......]
+static ssize_t critical_action_store(struct device *dev,

[......]

For the same reason as above, "critical_action_xxx" is misleading.

[......]

+static int sg2042_mcu_read_temp(struct device *dev,
+ u32 attr, int channel,
+ long *val)
+{
+ struct sg2042_mcu_data *mcu = dev_get_drvdata(dev);
+ int tmp;
+ u8 reg;
+
+ switch (attr) {
+ case hwmon_temp_input:
+ reg = channel ? REG_BOARD_TEMP : REG_SOC_TEMP;
+ break;
+ case hwmon_temp_crit:
+ reg = REG_CRITICAL_TEMP;
+ break;
+ case hwmon_temp_crit_hyst:
+ reg = REG_REPOWER_TEMP;
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ tmp = i2c_smbus_read_byte_data(mcu->client, reg);
+ if (tmp < 0)
+ return tmp;
+ *val = tmp * 1000;
+
+ return 0;
+}
+
+static int sg2042_mcu_read(struct device *dev,
+ enum hwmon_sensor_types type,
+ u32 attr, int channel, long *val)
+{
+ return sg2042_mcu_read_temp(dev, attr, channel, val);
+}
Can we merge sg2042_mcu_read and sg2042_mcu_read_temp?
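
For illustration, a merged version might look like the sketch below. To keep it compilable outside the kernel, the types, dev_get_drvdata() and i2c_smbus_read_byte_data() are replaced by minimal stand-ins, and the REG_* names are indices into a fake register array, not the real addresses. Only the body of sg2042_mcu_read() is meant to carry over.

```c
#include <errno.h>
#include <stdint.h>

typedef uint8_t u8;
typedef uint32_t u32;

/* Stand-ins (assumptions): indices into a fake register file. */
enum { REG_SOC_TEMP, REG_BOARD_TEMP, REG_CRITICAL_TEMP, REG_REPOWER_TEMP,
       REG_COUNT };
/* Stand-ins for the hwmon enums; the values differ from the kernel's. */
enum hwmon_sensor_types { hwmon_temp };
enum { hwmon_temp_input, hwmon_temp_crit, hwmon_temp_crit_hyst };

struct i2c_client { int regs[REG_COUNT]; };
struct sg2042_mcu_data { struct i2c_client *client; };
struct device { void *drvdata; };

static void *dev_get_drvdata(const struct device *dev)
{
	return dev->drvdata;
}

static int i2c_smbus_read_byte_data(const struct i2c_client *client, u8 reg)
{
	return client->regs[reg];
}

/* Merged callback: the former sg2042_mcu_read_temp() body lives here. */
static int sg2042_mcu_read(struct device *dev,
			   enum hwmon_sensor_types type,
			   u32 attr, int channel, long *val)
{
	struct sg2042_mcu_data *mcu = dev_get_drvdata(dev);
	int tmp;
	u8 reg;

	(void)type;	/* only temperature channels are handled here */

	switch (attr) {
	case hwmon_temp_input:
		reg = channel ? REG_BOARD_TEMP : REG_SOC_TEMP;
		break;
	case hwmon_temp_crit:
		reg = REG_CRITICAL_TEMP;
		break;
	case hwmon_temp_crit_hyst:
		reg = REG_REPOWER_TEMP;
		break;
	default:
		return -EOPNOTSUPP;
	}

	tmp = i2c_smbus_read_byte_data(mcu->client, reg);
	if (tmp < 0)
		return tmp;
	*val = tmp * 1000;	/* degrees Celsius to millidegrees */

	return 0;
}
```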
+
+static int sg2042_mcu_write(struct device *dev,
+ enum hwmon_sensor_types type,
+ u32 attr, int channel, long val)
+{
+ struct sg2042_mcu_data *mcu = dev_get_drvdata(dev);
+ int temp = val / 1000;
+ int hyst_temp, crit_temp;
+ int ret;
+ u8 reg;
+
+ if (temp > MCU_POWER_MAX)
+ temp = MCU_POWER_MAX;
+
+ mutex_lock(&mcu->mutex);
+
+ switch (attr) {
+ case hwmon_temp_crit:
+ hyst_temp = i2c_smbus_read_byte_data(mcu->client,
+ REG_REPOWER_TEMP);
+ if (hyst_temp < 0) {
+ ret = -ENODEV;
+ goto failed;
+ }
+
+ crit_temp = temp;
+ reg = REG_CRITICAL_TEMP;
+ break;
+ case hwmon_temp_crit_hyst:
+ crit_temp = i2c_smbus_read_byte_data(mcu->client,
+ REG_CRITICAL_TEMP);
+ if (crit_temp < 0) {
+ ret = -ENODEV;
+ goto failed;
+ }
+
+ hyst_temp = temp;
+ reg = REG_REPOWER_TEMP;
+ break;
+ default:
+ mutex_unlock(&mcu->mutex);
+ return -EOPNOTSUPP;
+ }
+
It is recommended to add a comment explaining why we need to ensure that crit_temp is greater than or equal to hyst_temp. The check is needed only because the current MCU does not validate its input, so a user could otherwise set an inconsistent crit_temp and hyst_temp.
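
Something along these lines would do. The helper name is hypothetical and the check could just as well stay inline with the same comment above it; the point is only the wording of the comment.

```c
#include <errno.h>

/*
 * The MCU stores whatever value is written to REG_CRITICAL_TEMP and
 * REG_REPOWER_TEMP without validating it. Reject a pair in which the
 * repower (hysteresis) threshold is above the critical threshold, so
 * that user space cannot leave the two thresholds inconsistent.
 *
 * Hypothetical helper, not part of the patch.
 */
static int sg2042_mcu_check_thresholds(int crit_temp, int hyst_temp)
{
	if (crit_temp < hyst_temp)
		return -EINVAL;

	return 0;
}
```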
+ if (crit_temp < hyst_temp) {
+ ret = -EINVAL;
+ goto failed;
+ }
+
+ ret = i2c_smbus_write_byte_data(mcu->client, reg, temp);
+
+failed:
+ mutex_unlock(&mcu->mutex);
+ return ret;
+}
+
[......]