Re: [PATCH v4] nvme: Add hardware monitoring support

From: Guenter Roeck
Date: Tue Nov 05 2019 - 13:10:29 EST


On Wed, Nov 06, 2019 at 12:44:56AM +0900, Akinobu Mita wrote:
> 2019å11æ6æ(æ) 0:38 Akinobu Mita <akinobu.mita@xxxxxxxxx>:
> >
> > 2019å11æ2æ(å) 23:55 Guenter Roeck <linux@xxxxxxxxxxxx>:
> > > diff --git a/drivers/nvme/host/nvme-hwmon.c b/drivers/nvme/host/nvme-hwmon.c
> > > new file mode 100644
> > > index 000000000000..28b4b7f43bb0
> > > --- /dev/null
> > > +++ b/drivers/nvme/host/nvme-hwmon.c
> > > @@ -0,0 +1,181 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * NVM Express hardware monitoring support
> > > + * Copyright (c) 2019, Guenter Roeck
> > > + */
> > > +
> > > +#include <linux/hwmon.h>
> > > +#include <asm/unaligned.h>
> > > +
> > > +#include "nvme.h"
> > > +
> > > +struct nvme_hwmon_data {
> > > + struct nvme_ctrl *ctrl;
> > > + struct nvme_smart_log log;
> > > + struct mutex read_lock;
> > > +};
> > > +
> > > +static int nvme_hwmon_get_smart_log(struct nvme_hwmon_data *data)
> > > +{
> > > + int ret;
> > > +
> > > + ret = nvme_get_log(data->ctrl, NVME_NSID_ALL, NVME_LOG_SMART, 0,
> > > + &data->log, sizeof(data->log), 0);
> > > +
> > > + return ret <= 0 ? ret : -EIO;
> > > +}
> > > +
> > > +static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> > > + u32 attr, int channel, long *val)
> > > +{
> > > + struct nvme_hwmon_data *data = dev_get_drvdata(dev);
> > > + struct nvme_smart_log *log = &data->log;
> > > + int temp;
> > > + int err;
> > > +
> > > + /*
> > > + * First handle attributes which don't require us to read
> > > + * the smart log.
> > > + */
> > > + switch (attr) {
> > > + case hwmon_temp_max:
> > > + *val = (data->ctrl->wctemp - 273) * 1000;
> > > + return 0;
> > > + case hwmon_temp_crit:
> > > + *val = (data->ctrl->cctemp - 273) * 1000;
> >
> > This attribute should be 'hwmon_temp_max_alarm' rather than
> > 'hwmon_temp_crit_alarm'?
>

Yes, you are correct. Actually, re-reading the specification,
the bit may also be set for under-temperature alarms, so I'll
use the more generic hwmon_temp_alarm.

Thanks,
Guenter

> Oops, I misquoted the code.
>
> This comment should be addressed to the code below:
>
> + case hwmon_temp_crit_alarm:
> + *val = !!(log->critical_warning & NVME_SMART_CRIT_TEMPERATURE);
> + break;