Re: [PATCH v2 1/8] [PATCH 1/8] drivers/peci: Add support for PECI bus driver core

From: Jae Hyun Yoo
Date: Wed Mar 07 2018 - 14:03:33 EST


Hi Julia,

Thanks for sharing your time on reviewing it. Please see my inline answers.

Jae

On 3/6/2018 7:19 PM, Julia Cartwright wrote:
On Wed, Feb 21, 2018 at 08:15:59AM -0800, Jae Hyun Yoo wrote:
This commit adds driver implementation for PECI bus into linux
driver framework.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@xxxxxxxxxxxxxxx>
---
[..]
+static int peci_locked_xfer(struct peci_adapter *adapter,
+ struct peci_xfer_msg *msg,
+ bool do_retry,
+ bool has_aw_fcs)

_locked generally means that this function is invoked with some critical
lock held, what lock does the caller need to acquire before invoking
this function?


I intended to show that this function has a mutex locking inside for serialization of PECI data transactions from multiple callers, but as you commented out below, the mutex protection scope should be adjusted to make that covers the peci_scan_cmd_mask() function too. I'll rewrite the mutex protection scope then this function will be in the locked scope.

+{
+ ktime_t start, end;
+ s64 elapsed_ms;
+ int rc = 0;
+
+ if (!adapter->xfer) {

Is this really an optional feature of an adapter? If this is not
optional, then this check should be in place when the adapter is
registered, not here. (And it should WARN_ON(), because it's a driver
developer error).


I agree with you. I'll move this code into the peci_register_adapter() function.

+ dev_dbg(&adapter->dev, "PECI level transfers not supported\n");
+ return -ENODEV;
+ }
+
+ if (in_atomic() || irqs_disabled()) {

As Andrew mentioned, this is broken.

You don't even need a might_sleep(). The locking functions you use here
will already include a might_sleep() w/ CONFIG_DEBUG_ATOMIC_SLEEP.


Thanks for letting me know that. I'll drop that checking code and might_sleep() too.

+ rt_mutex_trylock(&adapter->bus_lock);
+ if (!rc)
+ return -EAGAIN; /* PECI activity is ongoing */
+ } else {
+ rt_mutex_lock(&adapter->bus_lock);
+ }
+
+ if (do_retry)
+ start = ktime_get();
+
+ do {
+ rc = adapter->xfer(adapter, msg);
+
+ if (!do_retry)
+ break;
+
+ /* Per the PECI spec, need to retry commands that return 0x8x */
+ if (!(!rc && ((msg->rx_buf[0] & DEV_PECI_CC_RETRY_ERR_MASK) ==
+ DEV_PECI_CC_TIMEOUT)))
+ break;

This is pretty difficult to parse. Can you split it into two different
conditions?


Sure. I'll split it out.

+
+ /* Set the retry bit to indicate a retry attempt */
+ msg->tx_buf[1] |= DEV_PECI_RETRY_BIT;

Are you sure this bit is to be set in the _second_ byte of tx_buf?


Yes, I'm pretty sure. The first byte contains a PECI command value and the second byte contains 'HostID[7:1] & Retry[0]' value.

+
+ /* Recalculate the AW FCS if it has one */
+ if (has_aw_fcs)
+ msg->tx_buf[msg->tx_len - 1] = 0x80 ^
+ peci_aw_fcs((u8 *)msg,
+ 2 + msg->tx_len);
+
+ /* Retry for at least 250ms before returning an error */
+ end = ktime_get();
+ elapsed_ms = ktime_to_ms(ktime_sub(end, start));
+ if (elapsed_ms >= DEV_PECI_RETRY_TIME_MS) {
+ dev_dbg(&adapter->dev, "Timeout retrying xfer!\n");
+ break;
+ }
+ } while (true);
+
+ rt_mutex_unlock(&adapter->bus_lock);
+
+ return rc;
+}
+
+static int peci_xfer(struct peci_adapter *adapter, struct peci_xfer_msg *msg)
+{
+ return peci_locked_xfer(adapter, msg, false, false);
+}
+
+static int peci_xfer_with_retries(struct peci_adapter *adapter,
+ struct peci_xfer_msg *msg,
+ bool has_aw_fcs)
+{
+ return peci_locked_xfer(adapter, msg, true, has_aw_fcs);
+}
+
+static int peci_scan_cmd_mask(struct peci_adapter *adapter)
+{
+ struct peci_xfer_msg msg;
+ u32 dib;
+ int rc = 0;
+
+ /* Update command mask just once */
+ if (adapter->cmd_mask & BIT(PECI_CMD_PING))
+ return 0;
+
+ msg.addr = PECI_BASE_ADDR;
+ msg.tx_len = GET_DIB_WR_LEN;
+ msg.rx_len = GET_DIB_RD_LEN;
+ msg.tx_buf[0] = GET_DIB_PECI_CMD;
+
+ rc = peci_xfer(adapter, &msg);
+ if (rc < 0) {
+ dev_dbg(&adapter->dev, "PECI xfer error, rc : %d\n", rc);
+ return rc;
+ }
+
+ dib = msg.rx_buf[0] | (msg.rx_buf[1] << 8) |
+ (msg.rx_buf[2] << 16) | (msg.rx_buf[3] << 24);
+
+ /* Check special case for Get DIB command */
+ if (dib == 0x00) {
+ dev_dbg(&adapter->dev, "DIB read as 0x00\n");
+ return -1;
+ }
+
+ if (!rc) {

You should change this to:

if (rc) {
dev_dbg(&adapter->dev, "Error reading DIB, rc : %d\n", rc);
return rc;
}

And then leave the happy path below unindented.


Agreed. That would be neater. Will rewrite it. Thanks!

+ /**
+ * setting up the supporting commands based on minor rev#
+ * see PECI Spec Table 3-1
+ */
+ dib = (dib >> 8) & 0xF;
+
+ if (dib >= 0x1) {
+ adapter->cmd_mask |= BIT(PECI_CMD_RD_PKG_CFG);
+ adapter->cmd_mask |= BIT(PECI_CMD_WR_PKG_CFG);
+ }
+
+ if (dib >= 0x2)
+ adapter->cmd_mask |= BIT(PECI_CMD_RD_IA_MSR);
+
+ if (dib >= 0x3) {
+ adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG_LOCAL);
+ adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG_LOCAL);
+ }
+
+ if (dib >= 0x4)
+ adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG);
+
+ if (dib >= 0x5)
+ adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG);
+
+ if (dib >= 0x6)
+ adapter->cmd_mask |= BIT(PECI_CMD_WR_IA_MSR);
+
+ adapter->cmd_mask |= BIT(PECI_CMD_GET_TEMP);
+ adapter->cmd_mask |= BIT(PECI_CMD_GET_DIB);
+ adapter->cmd_mask |= BIT(PECI_CMD_PING);

These cmd_mask updates are not done with any locking in mind. Is this
intentional? Or: is synchronization not necessary because this is
always done during enumeration prior to exposing the adapter to users?


Thanks for the pointing it out. This function should be done in a locked scope as you said. I'll adjust mutex protection scope to make that covers this function as well.

+ } else {
+ dev_dbg(&adapter->dev, "Error reading DIB, rc : %d\n", rc);
+ }
+
+ return rc;
+}
+
+static int peci_cmd_support(struct peci_adapter *adapter, enum peci_cmd cmd)
+{
+ if (!(adapter->cmd_mask & BIT(PECI_CMD_PING)) &&
+ peci_scan_cmd_mask(adapter) < 0) {
+ dev_dbg(&adapter->dev, "Failed to scan command mask\n");
+ return -EIO;
+ }
+
+ if (!(adapter->cmd_mask & BIT(cmd))) {
+ dev_dbg(&adapter->dev, "Command %d is not supported\n", cmd);
+ return -EINVAL;
+ }

It would be nicer if you did this check prior to dispatching to the
various subfunctions (peci_ioctl_ping, peci_ioctl_get_dib, etc.). In
that way, these functions could just assume the adapter supports them.


Agreed. I'll drop all individual calls from subfunctions and will call it from peci_command().

[..]
+static int peci_register_adapter(struct peci_adapter *adapter)
+{
+ int res = -EINVAL;
+
+ /* Can't register until after driver model init */
+ if (WARN_ON(!is_registered)) {

Is this solving a problem you actually ran into?


Generally, an adapter driver registration will be happened after the PECI bus registration because peci_init uses postcore_initcall, but in case of incorrect implementation of an adapter driver which uses
a preceding postcore_initcall or a core_initcall as its module init, then an adapter registration would be prior to bus registration. This code is an exceptional case handling for that to warn the incorrect adapter driver implementation.

[.. skipped review due to fatigue ..]

+++ b/include/linux/peci.h
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Intel Corporation
+
+#ifndef __LINUX_PECI_H
+#define __LINUX_PECI_H
+
+#include <linux/cdev.h>
+#include <linux/device.h>
+#include <linux/peci-ioctl.h>
+#include <linux/rtmutex.h>
+
+#define PECI_BUFFER_SIZE 32
+#define PECI_NAME_SIZE 32
+
+struct peci_xfer_msg {
+ u8 addr;
+ u8 tx_len;
+ u8 rx_len;
+ u8 tx_buf[PECI_BUFFER_SIZE];
+ u8 rx_buf[PECI_BUFFER_SIZE];
+} __attribute__((__packed__));

The packed attribute has historically caused gcc to emit atrocious code,
as it seems to assume packed implies members might not be naturally
aligned. Seeing as you're only working with u8s in this case, though,
this shouldn't be a problem.


It should be a packed struct because it is also being used for CRC8 calculation which is treating it as a contiguous byte array.

+struct peci_board_info {
+ char type[PECI_NAME_SIZE];
+ u8 addr; /* CPU client address */
+ struct device_node *of_node;
+};
+
+struct peci_adapter {
+ struct module *owner;
+ struct rt_mutex bus_lock;

Why an rt_mutex, instead of a regular mutex. Do you explicitly need PI
in mainline?


Currently this implementation has only a temperature monitoring sideband feature but other sideband features such as CPU error detection and crash dump will be implemented later, and those additional sideband features should have higher priority than the temperature monitoring feature so it is the reason why I used an rt_mutex.

+ struct device dev;
+ struct cdev cdev;
+ int nr;
+ char name[PECI_NAME_SIZE];
+ int (*xfer)(struct peci_adapter *adapter,
+ struct peci_xfer_msg *msg);
+ uint cmd_mask;
+};
+
+#define to_peci_adapter(d) container_of(d, struct peci_adapter, dev)

You can also do this with a static inline, which provides a marginally
better error when screwed up.


Agreed. That would be more helpful for debugging in debug build. I'll rewrite the macro to a static inline like below:

static inline struct peci_adapter *to_peci_adapter(void *d)
{
return container_of(d, struct peci_adapter, dev);
}

Julia