Re: [PATCH 0/5] scsi: ufs: ufs device as a temperature sensor

From: Julian Calaby
Date: Thu Feb 06 2020 - 19:48:07 EST


Hi Avri,

On Fri, Feb 7, 2020 at 6:32 AM Avri Altman <Avri.Altman@xxxxxxx> wrote:
>
> Hi Julian,
>
> >
> >
> > Hi Avri,
> >
> > On Fri, Feb 7, 2020 at 12:41 AM Avri Altman <Avri.Altman@xxxxxxx> wrote:
> > >
> > > >
> > > > Hi Avri,
> > > >
> > > > On Thu, Feb 6, 2020 at 11:08 PM Avri Altman <Avri.Altman@xxxxxxx>
> > > > wrote:
> > > > >
> > > > >
> > > > > >
> > > > > > Hi Avi,
> > > > > >
> > > > > > On Thu, Feb 6, 2020 at 9:48 PM Avi Shchislowski
> > > > > > <Avi.Shchislowski@xxxxxxx> wrote:
> > > > > > >
> > > > > > > As it become evident that the hwmon is not a viable option to
> > > > implement
> > > > > > ufs thermal notification, I would appreciate some concrete comments
> > of
> > > > this
> > > > > > series.
> > > > > >
> > > > > > That isn't my reading of this thread.
> > > > > >
> > > > > > You have two options:
> > > > > > 1. extend drivetemp if that makes sense for this particular application.
> > > > > > 2. follow the model of other devices that happen to have a built-in
> > > > > > temperature sensor and expose the hwmon compatible attributes as
> > a
> > > > > > subdevice
> > > > > >
> > > > > > It appears that option 1 isn't viable, so what about option 2?
> > > > > This will require to export the ufs device management commands,
> > > > > Which is privet to the ufs driver.
> > > > >
> > > > > This is not a viable option as well, because it will allow unrestricted
> > access
> > > > > (Including format etc.) to the storage device.
> > > > >
> > > > > Sorry for not making it clearer before.
> > > >
> > > > I should have clarified further: I meant having the UFS device
> > > > register a HWMON driver using this API:
> > > > https://www.kernel.org/doc/html/latest/hwmon/hwmon-kernel-
> > api.html
> > > >
> > > > *Not* writing a separate HWMON driver that uses some private
> > interface.
> > > Ok.
> > > Just one last question:
> > > The ufs spec requires to be able to react upon an exception event from the
> > device.
> > > The thermal core provides an api in the form of
> > thermal_notify_framework().
> > > What would be the hwmon equivalent for that?
> >
> > My understanding is that HWMON is just a standardised way to report
> > hardware sensor data to userspace. There are "alarm" files that can be
> > used to report fault conditions, so any action taken would have to be
> > either managed by userspace or configured using thermal zones
> > configured in the hardware's devicetree.
> Those "alarms" are implemented as part of the modules under drivers/hwmon/ isn't it?
> We already established that this is not an option for the ufs driver.

The HWMON API I pointed you to is a way for _any_ driver to implement
the necessary bits and pieces to report temperatures, alarms, etc.

It is _not_ restricted to modules under drivers/hwmon/ - you can
implement all parts of the HWMON interface from any driver anywhere.

Which means that all that is required to report alarms is to simply
implement support for that particular type of file.

> > thermal_notify_framework() is a way to notify the "other side" of a
> > thermal zone to do something to reduce the temperature of that zone.
> > E.g. spin up a fan or switch to a lower-power state to cool a CPU.
> > Looking at your code, you're only implementing the "sensor" side of
> > the thermal zone functionality, so your calls to
> > thermal_notify_framework() won't do anything.
> Right. The thermal core allows to react to such notifications,
> Provided that the thermal zone device has a governor defined,
> And/or notify ops etc.

Yes, but you don't define a cooling device, so there's nothing to
react to those notifications.

> Should the current patches implement those callbacks or not,
> Can be discussed during their review process.
> But the important thing is that the thermal core support it in an intuitive and simple way,
> While the hwmon doesn't.
>
> We are indifference with respect of which subsystem to use.
> The thermal core was our first choice because we bumped into it,
> Looking for a way to raise thermal exceptions.
> It provides the functionality we need, and other devices uses it,
> Why the insistence not to use it?

Other devices using the thermal zone subsystem implement both "sides"
of the zone: a sensor that monitors the device and cooling device(s)
which should be able to react to that. E.g. lowering the clock speed
of a CPU, slowing the transmission speed of a WiFi card, etc.

Your implementation doesn't do that.

Thanks,

--
Julian Calaby

Email: julian.calaby@xxxxxxxxx
Profile: http://www.google.com/profiles/julian.calaby/