Re: [PATCH v1 1/4] hwmon: (acpi_power_meter) Fix using uninitialized variables

From: Guenter Roeck
Date: Wed Dec 18 2024 - 22:50:26 EST


On 12/18/24 19:45, lihuisong (C) wrote:

在 2024/12/12 11:00, lihuisong (C) 写道:

在 2024/12/12 9:51, Guenter Roeck 写道:
On 11/26/24 19:43, lihuisong (C) wrote:
Hi Guenter,

How about the modification as below? But driver doesn't know what the time is to set resource->power_alarm to false.

It's a start, but incomplete because power_alarm must be reset.

See below.

在 2024/11/27 0:19, Guenter Roeck 写道:
On 11/25/24 23:03, lihuisong (C) wrote:

在 2024/11/26 12:04, Guenter Roeck 写道:
On 11/25/24 17:56, lihuisong (C) wrote:
Hi Guente,

Thanks for your timely review.

在 2024/11/26 0:03, Guenter Roeck 写道:
On 11/25/24 01:34, Huisong Li wrote:
The 'power1_alarm' attribute uses the 'power' and 'cap' in the
acpi_power_meter_resource structure. However, these two fields are just
updated when user query 'power' and 'cap' attribute, or hardware enforced
limit. If user directly query the 'power1_alarm' attribute without queryng
above two attributes, driver will use the uninitialized variables to judge.
In addition, the 'power1_alarm' attribute needs to update power and cap to
show the real state.

Signed-off-by: Huisong Li <lihuisong@xxxxxxxxxx>
---
  drivers/hwmon/acpi_power_meter.c | 10 ++++++++++
  1 file changed, 10 insertions(+)

diff --git a/drivers/hwmon/acpi_power_meter.c b/drivers/hwmon/acpi_power_meter.c
index 2f1c9d97ad21..4c3314e35d30 100644
--- a/drivers/hwmon/acpi_power_meter.c
+++ b/drivers/hwmon/acpi_power_meter.c
@@ -396,6 +396,9 @@ static ssize_t show_val(struct device *dev,
      struct acpi_device *acpi_dev = to_acpi_device(dev);
      struct acpi_power_meter_resource *resource = acpi_dev->driver_data;
      u64 val = 0;
+    int ret;
+
+    guard(mutex)(&resource->lock);
        switch (attr->index) {
      case 0:
@@ -423,6 +426,13 @@ static ssize_t show_val(struct device *dev,
              val = 0;
          break;
      case 6:
+        ret = update_meter(resource);
+        if (ret)
+            return ret;
+        ret = update_cap(resource);
+        if (ret)
+            return ret;
+
          if (resource->power > resource->cap)
              val = 1;
          else


While technically correct, the implementation of this attribute defeats its
purpose. It is supposed to reflect the current status as reported by the
hardware. A real fix would be to use the associated notification to set or
reset a status flag, and to report the current value of that flag as reported
by the hardware.
I know what you mean.
The Notify(power_meter, 0x83) is supposed to meet your proposal IIUC.
It's good, but it depands on hardware support notification.

If there is no notification support, the attribute should not even exist,
unless there is a means to retrieve its value from ACPI (the status itself,
not by comparing temperature values).
Currently, the 'power1_alarm' attribute is created just when platform support the power meter meassurement(bit0 of the supported capabilities in _PMC).
And it doesn't see if the platform support notifications.
 From the current implementation of this driver, this sysfs can also reflect the status by comparing power and cap,
which is good to the platform that support hardware limit from some out-of-band mechanism but doesn't support any notification.


The point is that this can also be done from userspace. Hardware monitoring drivers
are supposed to provide hardware attributes, not software attributes derived from it.

So this 'power1_alarm' attribute can be exposed when platform supports hardware enforced limit and notifcations when the hardware limit is enforced, right?
If so, we have to change the condition that driver creates this sysfs interface.

This isn't about enforcing anything, it is about reporting an alarm
if the power consumed exceeds the maximum configured.

-->

index 2f1c9d97ad21..b436ebd863e6
--- a/drivers/hwmon/acpi_power_meter.c
+++ b/drivers/hwmon/acpi_power_meter.c
@@ -84,6 +84,7 @@ struct acpi_power_meter_resource {
         u64             power;
         u64             cap;
         u64             avg_interval;
+       bool            power_alarm;
         int                     sensors_valid;
         unsigned long           sensors_last_updated;
         struct sensor_device_attribute  sensors[NUM_SENSORS];
@@ -396,6 +397,9 @@ static ssize_t show_val(struct device *dev,
         struct acpi_device *acpi_dev = to_acpi_device(dev);
         struct acpi_power_meter_resource *resource = acpi_dev->driver_data;
         u64 val = 0;
+       int ret;
+
+       guard(mutex)(&resource->lock);

         switch (attr->index) {
         case 0:
@@ -423,10 +427,21 @@ static ssize_t show_val(struct device *dev,
                         val = 0;
                 break;
         case 6:
-               if (resource->power > resource->cap)
-                       val = 1;
-               else
-                       val = 0;
+               /* report alarm status based on the notification if support. */
+               if (resource->caps.flags & POWER_METER_CAN_NOTIFY) {
+                       val = resource->power_alarm;
+               } else {
+                       ret = update_meter(resource);
+                       if (ret)
+                               return ret;
+                       ret = update_cap(resource);
+                       if (ret)
+                               return ret;
+                       if (resource->power > resource->cap)
+                               val = 1;
+                       else
+                               val = 0;
+               }

It would have to be something like

        ret = update_meter(resource);
        if (ret)
            return ret;

        val = resource->power_alarm || resource->power > resource->cap;
        /* clear alarm if no longer active */
        resource->power_alarm &= resource->power > resource->cap;

This ensures that alarms are cached if supported, and that cached values are
reported at once. It is far from perfect but the best I can think of since
there is no notification that the alarm is cleared.

Indeed, since there is no notification that the alarm is cleared, driver have to compare 'power' and 'cap' to clear it anyway.
If platform support notify to OSPM, driver also need to update 'power' to show this alarm status.
In this case, no need to update 'cap' which can be updated by nofity 0x82 event, right? But this also depands on the initialization of the "resource->cap" the probe phase needs to add.
For the platform doesn't support notify, driver have to update 'cap' and 'power' to show this status, right?

But considering above two cases, directly to update 'power' and 'cap' is simple to handle this without more switch case.
what do you think, Guenter?

Hi Guenter,

What do you think? Looking forward to your reply.😁


This is getting too complicated for a reply after a casual glance at the driver,
and I don't currently have time for a deeper look into the driver, sorry.

Guenter

/Huisong Li


                 break;
         case 7:
         case 8:
@@ -853,6 +868,7 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
                 sysfs_notify(&device->dev.kobj, NULL, POWER_AVG_INTERVAL_NAME);
                 break;
         case METER_NOTIFY_CAPPING:
+               resource->power_alarm = true;
                 sysfs_notify(&device->dev.kobj, NULL, POWER_ALARM_NAME);
                 dev_info(&device->dev, "Capping in progress.\n");
                 break;

.



.

.