Re: [REGRESSION] hwmon: (applesmc) avoid overlong udelay()

From: Brad Campbell
Date: Mon Nov 02 2020 - 19:17:51 EST


On 6/10/20 6:02 pm, Andreas Kemnade wrote:
On Thu, 1 Oct 2020 21:07:51 -0700
Guenter Roeck <linux@xxxxxxxxxxxx> wrote:

On 10/1/20 3:22 PM, Andreas Kemnade wrote:
On Wed, 30 Sep 2020 22:00:09 +0200
Arnd Bergmann <arnd@xxxxxxxx> wrote:
On Wed, Sep 30, 2020 at 6:44 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:

On Wed, Sep 30, 2020 at 10:54:42AM +0200, Andreas Kemnade wrote:
Hi,

after the $subject patch I get lots of errors like this:

For reference, this refers to commit fff2d0f701e6 ("hwmon: (applesmc)
avoid overlong udelay()").
[ 120.378614] applesmc: send_byte(0x00, 0x0300) fail: 0x40
[ 120.378621] applesmc: LKSB: write data fail
[ 120.512782] applesmc: send_byte(0x00, 0x0300) fail: 0x40
[ 120.512787] applesmc: LKSB: write data fail

CPU sticks at low speed and no fan is turning on.
Reverting this patch on top of 5.9-rc6 solves this problem.

Some information from dmidecode:

Base Board Information
Manufacturer: Apple Inc.
Product Name: Mac-7DF21CB3ED6977E5
Version: MacBookAir6,2

Handle 0x0020, DMI type 11, 5 bytes OEM Strings String 1: Apple ROM Version. Model: …,
Handle 0x0020, DMI type 11, 5 bytes
OEM Strings
String 1: Apple ROM Version. Model: MBA61. EFI Version: 122.0.0
String 2: .0.0. Built by: root@saumon. Date: Wed Jun 10 18:
String 3: 10:36 PDT 2020. Revision: 122 (B&I). ROM Version: F000_B
String 4: 00. Build Type: Official Build, Release. Compiler: Appl
String 5: e clang version 3.0 (tags/Apple/clang-211.10.1) (based on LLVM
String 6: 3.0svn).

Writing to things in /sys/devices/platform/applesmc.768 gives also the
said errors.
But writing 1 to fan1_maunal and 5000 to fan1_output turns the fan on
despite error messages.
Not really sure what to do here. I could revert the patch, but then we'd gain
clang compile failures. Arnd, any idea ?

It seems that either I made a mistake in the conversion and it sleeps for
less time than before, or my assumption was wrong that converting a delay to
a sleep is safe here.

The error message indicates that the write fails, not the read, so that
is what I'd look at first. Right away I can see that the maximum time to
retry is only half of what it used to be, as we used to wait for
0x10, 0x20, 0x40, 0x80, ..., 0x20000 microseconds for a total of
0x3fff0 microseconds (262ms), while my patch went with the 131ms
total delay based on the comment saying "/* wait up to 128 ms for a
status change. */".
Yes, that is also what I read from the code. I just thought there must
be something simple, which just needs a short look from another pair of
eyes.
Since there is sleeping wait, I see no reason the timeout couldn't
be extended a lot, e.g. to a second, as in

#define APPLESMC_MAX_WAIT 0x100000

If that doesn't work, I'd try using mdelay() in place of
usleep_range(), such as

mdelay(DIV_ROUND_UP(us, USEC_PER_MSEC)));

This adds back a really nasty latency, but it should avoid the
compile-time problem.

Andreas, can you try those two things? (one at a time,
not both)

Ok, I tried. None of them works. I rechecked my work and created real
git commits out of them and CONFIG_LOCALVERSION_AUTO is also set so
the usual stupid things are rules out.
In detail:
On top of 5.9-rc6 + *reverted* patch:
diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index fd99c9df8a00..2a9bd7f2b71b 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -45,7 +45,7 @@
/* wait up to 128 ms for a status change. */
#define APPLESMC_MIN_WAIT 0x0010
#define APPLESMC_RETRY_WAIT 0x0100
-#define APPLESMC_MAX_WAIT 0x20000
+#define APPLESMC_MAX_WAIT 0x8000
#define APPLESMC_READ_CMD 0x10
#define APPLESMC_WRITE_CMD 0x11

Oh man, that code is so badlys broken.

send_byte() repeats sending the data if it was not immediately successful.
That is done for both data and commands. Effectively that happens if
the command is not immediately accepted. However, send_argument()
clearly assumes that each data byte is sent exactly once. Sending
it more than once will mess up the key that is supposed to be sent.
The Apple SMC emulation code in qemu confirms that data bytes can not
be written more than once.

Of course, theoretically it may be that the first data byte was not
accepted (after all, the ACK bit is not set), but the ACK bit is
not checked again after udelay(APPLESMC_RETRY_WAIT), so it may
well have been set in the 256 uS between its check and re-writing
the data.

In other words, this entire code only works accidentally to start with.

If you like, you could play around with the code and find out if and
when exactly bit 1 (busy) is set, if and when bit 2 (ack) is set, and
if and when any other bit is set. We could also try to read port 0x31e
(the error port). Maybe the we can figure out what the error actually
is. But then I don't really know what we could do with that information.

Smoe research results: the second data byte seems to cause problems, not the
command byte.

Other than that, the only useful idea I have is something crazy like
if (us < 10000)
udelay(us);
else
mdelay(DIV_ROUND_CLOSEST(udelay, 1000));
in the hope that clang doesn't convert that back into a
compile-time constant and udelay().

Overall it seems like the apple protocol may expect to receive data
bytes faster than 1ms apart, because that is the only real difference
between the original code and the new code using mdelay().

Yes, that explanation makes sense. If I am trying something like that, only
the last byte requires more than APPLESMC_MIN_WAIT. I have seen max. 256us.
So we could probably even use msleep for us > 1000 and udelay for anything below.

Regards,
Andreas


G'day Andreas,

I've examined the code in VirtualSMC and I'm not convinced we were not waiting on the wrong bits.

#define SMC_STATUS_AWAITING_DATA BIT0 ///< Ready to read data.
#define SMC_STATUS_IB_CLOSED BIT1 /// A write is pending.
#define SMC_STATUS_BUSY BIT2 ///< Busy processing a command.
#define SMC_STATUS_GOT_COMMAND BIT3 ///< The last input was a command.
#define SMC_STATUS_UKN_0x16 BIT4
#define SMC_STATUS_KEY_DONE BIT5
#define SMC_STATUS_READY BIT6 // Ready to work
#define SMC_STATUS_UKN_0x80 BIT7 // error

Any chance you could try this patch? It's ugly, hacked together and currently fairly undocumented, but if it works I'll figure out how to clean it up (suggestions welcome).
It works on my MacbookPro 11,1.

I've also attached a script I adapted from https://github.com/floe/smc_util.git that I use to hammer the SMC with reads.
The behavior pre 5.9 and post this patch is the same.

Regards,
Brad
diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index a18887990f4a..2500cc7ebca5 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -151,65 +151,58 @@ static unsigned int key_at_index;
static struct workqueue_struct *applesmc_led_wq;

/*
- * wait_read - Wait for a byte to appear on SMC port. Callers must
- * hold applesmc_lock.
+ * Wait for specific status bits on the SMC
*/
-static int wait_read(void)
+
+static int wait_status(u8 val, u8 mask)
{
- unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC;
u8 status;
int us;
+ unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC;

for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) {
- usleep_range(us, us * 16);
status = inb(APPLESMC_CMD_PORT);
- /* read: wait for smc to settle */
- if (status & 0x01)
+ if ((status & mask) == val)
return 0;
- /* timeout: give up */
if (time_after(jiffies, end))
break;
- }
-
- pr_warn("wait_read() fail: 0x%02x\n", status);
+ usleep_range(us, us * 16);
+ }
+ pr_warn("wait_status timeout: 0x%02x\n", status);
return -EIO;
}
-
+
/*
* send_byte - Write to SMC port, retrying when necessary. Callers
* must hold applesmc_lock.
*/
-static int send_byte(u8 cmd, u16 port)
-{
- u8 status;
- int us;
- unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC;

+static int send_byte_data(u8 cmd, u16 port, bool skip)
+{
+ u8 wstat=0x44;
+ if (skip)
+ wstat=0x40;
+ if (wait_status(0x44, 0xF6))
+ goto fail;
outb(cmd, port);
- for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) {
- usleep_range(us, us * 16);
- status = inb(APPLESMC_CMD_PORT);
- /* write: wait for smc to settle */
- if (status & 0x02)
- continue;
- /* ready: cmd accepted, return */
- if (status & 0x04)
- return 0;
- /* timeout: give up */
- if (time_after(jiffies, end))
- break;
- /* busy: long wait and resend */
- udelay(APPLESMC_RETRY_WAIT);
- outb(cmd, port);
- }
-
- pr_warn("send_byte(0x%02x, 0x%04x) fail: 0x%02x\n", cmd, port, status);
+ if (! wait_status(wstat, 0xFE))
+ return 0;
+fail:
+ pr_warn("send_byte_data(0x%02x, 0x%04x) fail\n", cmd, APPLESMC_CMD_PORT);
return -EIO;
}

static int send_command(u8 cmd)
{
- return send_byte(cmd, APPLESMC_CMD_PORT);
+ if (wait_status(0x40, 0xF2)) { pr_warn("send_command fail 1\n");
+ goto fail;}
+ outb(cmd, APPLESMC_CMD_PORT);
+ if (wait_status(0x4C, 0xFF)) { pr_warn("send_command fail 2\n");
+ goto fail;}
+ return 0;
+fail:
+ pr_warn("send_cmd(0x%02x, 0x%04x) fail\n", cmd, APPLESMC_CMD_PORT);
+ return -EIO;
}

static int send_argument(const char *key)
@@ -217,7 +210,7 @@ static int send_argument(const char *key)
int i;

for (i = 0; i < 4; i++)
- if (send_byte(key[i], APPLESMC_DATA_PORT))
+ if (send_byte_data(key[i], APPLESMC_DATA_PORT,false))
return -EIO;
return 0;
}
@@ -233,13 +226,13 @@ static int read_smc(u8 cmd, const char *key, u8 *buffer, u8 len)
}

/* This has no effect on newer (2012) SMCs */
- if (send_byte(len, APPLESMC_DATA_PORT)) {
+ if (send_byte_data(len, APPLESMC_DATA_PORT,false)) {
pr_warn("%.4s: read len fail\n", key);
return -EIO;
}

for (i = 0; i < len; i++) {
- if (wait_read()) {
+ if (wait_status(0x45,0xFF)) {
pr_warn("%.4s: read data[%d] fail\n", key, i);
return -EIO;
}
@@ -269,14 +262,14 @@ static int write_smc(u8 cmd, const char *key, const u8 *buffer, u8 len)
return -EIO;
}

- if (send_byte(len, APPLESMC_DATA_PORT)) {
+ if (send_byte_data(len, APPLESMC_DATA_PORT,false)) {
pr_warn("%.4s: write len fail\n", key);
return -EIO;
}

for (i = 0; i < len; i++) {
- if (send_byte(buffer[i], APPLESMC_DATA_PORT)) {
- pr_warn("%s: write data fail\n", key);
+ if (send_byte_data(buffer[i], APPLESMC_DATA_PORT,(i==(len-1)))) {
+ pr_warn("%s: write data fail at %i\n", key, i);
return -EIO;
}
}

Attachment: smc_dump_linux.sh
Description: application/shellscript