Re: [PATCH 1/1] hwmon: Driver for temperature sensors on SATA drives

From: Martin K. Petersen
Date: Tue Dec 17 2019 - 22:40:04 EST



Guenter,

> If there are 100 physical drives, you would actually want to see the
> temperature of each drive separately, as one of them might be
> overheating due to some internal failure.

Yep. However, for "big boxes" you'll typically get that information from
SAF-TE or SES enclosure services and not from the drive itself.

SES allows you to monitor power supplies, drive bays, hot swap events,
thermals, etc. We have a SES driver in SCSI that exposes all these
things in sysfs. It is not currently tied into hwmon.
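
For illustration only, a minimal userland sketch of walking that sysfs
tree. It assumes the enclosure class lives under /sys/class/enclosure and
that each component directory carries "type" and "status" attribute files;
the function name is hypothetical, and the layout should be checked
against the running kernel before relying on it.

```python
import os

def list_enclosure_components(root="/sys/class/enclosure"):
    """Return {enclosure: {component: {"type": ..., "status": ...}}}.

    Assumption: each enclosure directory contains one subdirectory per
    component, with plain-text "type" and "status" attribute files.
    """
    result = {}
    if not os.path.isdir(root):
        return result  # no enclosure class on this system
    for encl in sorted(os.listdir(root)):
        encl_path = os.path.join(root, encl)
        if not os.path.isdir(encl_path):
            continue
        components = {}
        for comp in sorted(os.listdir(encl_path)):
            comp_path = os.path.join(encl_path, comp)
            if not os.path.isdir(comp_path):
                continue
            attrs = {}
            for name in ("type", "status"):
                try:
                    with open(os.path.join(comp_path, name)) as f:
                        attrs[name] = f.read().strip()
                except OSError:
                    pass  # attribute missing or unreadable
            if attrs:
                # Directories without either attribute (e.g. "power")
                # are not treated as components.
                components[comp] = attrs
        result[encl] = components
    return result
```

Nothing here goes through hwmon; it only reads whatever the enclosure
class happens to export.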

> If the storage array is represented to the system as single huge
> physical drive, which is then split into logical entities not related
> to physical drives, I guess that would represent a problem for system
> management overall.

Yep. That's why there's dedicated plumbing in smartmontools to handle
various RAID controller interfaces for accessing physical drive
information. It's typically highly vendor-specific.

> I would not mind to tie the hardware monitoring device to something
> else than the scsi device if the scsi device does not always have a
> physical representation. Is there a way to determine if a scsi device
> is virtual or real ?

Not really. The target is usually a pretty good approximation, although
some arrays introduce virtual targets because of limited LUN (scsi_device)
numbering capabilities. However, arrays generally don't support per-LUN
temperature because it makes no sense.

I'm trying to gauge how much of a pain potentially redundant sensors
would be for userland monitoring tooling vs. how many oddball devices we'd
not be able to support if we were to use scsi_target as parent (or
restrict the sensor binding to LUN 0).
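
To make the tradeoff concrete, here is a rough sketch of the "LUN 0 only"
policy as a userland tool might model it. It assumes devices are named by
their host:channel:target:lun address; the helper names are hypothetical
and this is not how the driver itself would do the binding.

```python
from collections import defaultdict

def parse_hctl(name):
    """Split an 'H:C:T:L' SCSI address string into four integers."""
    host, channel, target, lun = (int(x) for x in name.split(":"))
    return host, channel, target, lun

def sensors_per_target(devices):
    """Pick at most one sensor-bearing device per target.

    Returns {(host, channel, target): device_name or None}. A target
    maps to None when it exposes no LUN 0, i.e. it would get no sensor
    under a strict "bind to LUN 0 only" rule.
    """
    by_target = defaultdict(list)
    for name in devices:
        host, channel, target, lun = parse_hctl(name)
        by_target[(host, channel, target)].append((lun, name))
    chosen = {}
    for key, luns in by_target.items():
        lun0 = [name for lun, name in luns if lun == 0]
        chosen[key] = lun0[0] if lun0 else None
    return chosen
```

Targets that map to None are the oddball devices that would lose their
sensor under this rule; the LUNs skipped on other targets are the
redundant sensors that would otherwise show up in userland.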

--
Martin K. Petersen Oracle Linux Engineering