On Thu, 1 Oct 2020 21:07:51 -0700
Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
On 10/1/20 3:22 PM, Andreas Kemnade wrote:
Some research results: the second data byte seems to cause problems, not the
On Wed, 30 Sep 2020 22:00:09 +0200
Arnd Bergmann <arnd@xxxxxxxx> wrote:
On Wed, Sep 30, 2020 at 6:44 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
Yes, that is also what I read from the code. I just thought there must
On Wed, Sep 30, 2020 at 10:54:42AM +0200, Andreas Kemnade wrote:
Hi,
after the $subject patch I get lots of errors like this:
For reference, this refers to commit fff2d0f701e6 ("hwmon: (applesmc)
avoid overlong udelay()").
[ 120.378614] applesmc: send_byte(0x00, 0x0300) fail: 0x40
Not really sure what to do here. I could revert the patch, but then we'd gain
[ 120.378621] applesmc: LKSB: write data fail
[ 120.512782] applesmc: send_byte(0x00, 0x0300) fail: 0x40
[ 120.512787] applesmc: LKSB: write data fail
CPU sticks at low speed and no fan is turning on.
Reverting this patch on top of 5.9-rc6 solves this problem.
Some information from dmidecode:
Base Board Information
Manufacturer: Apple Inc.
Product Name: Mac-7DF21CB3ED6977E5
Version: MacBookAir6,2
Handle 0x0020, DMI type 11, 5 bytes
OEM Strings
String 1: Apple ROM Version. Model: MBA61. EFI Version: 122.0.0
String 2: .0.0. Built by: root@saumon. Date: Wed Jun 10 18:
String 3: 10:36 PDT 2020. Revision: 122 (B&I). ROM Version: F000_B
String 4: 00. Build Type: Official Build, Release. Compiler: Appl
String 5: e clang version 3.0 (tags/Apple/clang-211.10.1) (based on LLVM
String 6: 3.0svn).
Writing to attributes in /sys/devices/platform/applesmc.768 also produces
these errors.
But writing 1 to fan1_manual and 5000 to fan1_output turns the fan on
despite the error messages.
clang compile failures. Arnd, any idea ?
It seems that either I made a mistake in the conversion and it sleeps for
less time than before, or my assumption was wrong that converting a delay to
a sleep is safe here.
The error message indicates that the write fails, not the read, so that
is what I'd look at first. Right away I can see that the maximum time to
retry is only half of what it used to be, as we used to wait for
0x10, 0x20, 0x40, 0x80, ..., 0x20000 microseconds for a total of
0x3fff0 microseconds (262ms), while my patch went with the 131ms
total delay based on the comment saying "/* wait up to 128 ms for a
status change. */".
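The two totals quoted above can be checked with a few lines of C. This is only a sketch: total_backoff_us() is a hypothetical helper, not part of applesmc.c, and the 0x10000 upper bound used for the patched loop is an assumption inferred from the "half" and "131ms" remarks, not read from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sum of a doubling backoff: min_us, 2 * min_us, ..., up to and
 * including max_us. Hypothetical helper for checking the arithmetic
 * in the thread; it is not a function from the driver.
 */
static uint64_t total_backoff_us(uint64_t min_us, uint64_t max_us)
{
	uint64_t total = 0;
	uint64_t us;

	/* "us != 0" stops the loop if min_us is 0 or the shift overflows. */
	for (us = min_us; us != 0 && us <= max_us; us <<= 1)
		total += us;
	return total;
}
```

With these definitions, total_backoff_us(0x10, 0x20000) is 0x3fff0 (262128 us, about 262 ms), and total_backoff_us(0x10, 0x10000) is 0x1fff0 (131056 us, about 131 ms), matching the figures in the text.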
be something simple, which just needs a short look from another pair of
eyes.
Since this is now a sleeping wait, I see no reason the timeout couldn't
be extended a lot, e.g. to a second, as in
#define APPLESMC_MAX_WAIT 0x100000
If that doesn't work, I'd try using mdelay() in place of
usleep_range(), such as
mdelay(DIV_ROUND_UP(us, USEC_PER_MSEC));
This adds back a really nasty latency, but it should avoid the
compile-time problem.
Andreas, can you try those two things? (one at a time,
not both)
Ok, I tried. Neither of them works. I rechecked my work and created real
git commits out of them and CONFIG_LOCALVERSION_AUTO is also set so
the usual stupid things are ruled out.
In detail:
On top of 5.9-rc6 + *reverted* patch:
diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index fd99c9df8a00..2a9bd7f2b71b 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -45,7 +45,7 @@
/* wait up to 128 ms for a status change. */
#define APPLESMC_MIN_WAIT 0x0010
#define APPLESMC_RETRY_WAIT 0x0100
-#define APPLESMC_MAX_WAIT 0x20000
+#define APPLESMC_MAX_WAIT 0x8000
#define APPLESMC_READ_CMD 0x10
#define APPLESMC_WRITE_CMD 0x11
Oh man, that code is so badly broken.
send_byte() repeats sending the data if it was not immediately successful.
That is done for both data and commands. Effectively that happens if
the command is not immediately accepted. However, send_argument()
clearly assumes that each data byte is sent exactly once. Sending
it more than once will mess up the key that is supposed to be sent.
The Apple SMC emulation code in qemu confirms that data bytes can not
be written more than once.
Of course, theoretically it may be that the first data byte was not
accepted (after all, the ACK bit is not set), but the ACK bit is
not checked again after udelay(APPLESMC_RETRY_WAIT), so it may
well have been set in the 256 uS between its check and re-writing
the data.
In other words, this entire code only works accidentally to start with.
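The re-write hazard described above can be illustrated with a toy model in plain C. Everything here is hypothetical: struct toy_smc, the "ACK becomes visible after N polls" behaviour, and both sender functions are assumptions made for illustration, not the real SMC protocol, the driver code, or the qemu emulation. The only property modelled is that each data write is latched as a new key byte.

```c
#include <assert.h>
#include <string.h>

/* Toy device (hypothetical): every data write is appended to the key,
 * and the ACK for a write only shows up after ack_latency polls. */
struct toy_smc {
	char key[32];
	size_t len;
	int ack_latency;	/* polls needed before the ACK is visible */
	int polls_left;
};

static void toy_write_data(struct toy_smc *smc, char byte)
{
	if (smc->len < sizeof(smc->key) - 1)
		smc->key[smc->len++] = byte;
	smc->key[smc->len] = '\0';
	smc->polls_left = smc->ack_latency;
}

/* Returns 1 once the ACK for the most recent write is visible. */
static int toy_poll_ack(struct toy_smc *smc)
{
	if (smc->polls_left > 0) {
		smc->polls_left--;
		return 0;
	}
	return 1;
}

/* Retry by writing the byte again when the ACK is not seen at once. */
static void send_key_rewrite(struct toy_smc *smc, const char *key)
{
	for (; *key; key++) {
		int tries;

		for (tries = 0; tries < 4; tries++) {
			toy_write_data(smc, *key);
			if (toy_poll_ack(smc))
				break;
		}
	}
}

/* Write each byte exactly once, then only poll for the ACK. */
static void send_key_once(struct toy_smc *smc, const char *key)
{
	for (; *key; key++) {
		int polls;

		toy_write_data(smc, *key);
		for (polls = 0; polls < 16; polls++)
			if (toy_poll_ack(smc))
				break;
	}
}
```

In this model, sending "LKSB" with send_key_rewrite() to a device whose ack_latency is 1 leaves a key with repeated bytes, while send_key_once() leaves exactly "LKSB". With ack_latency 0 both senders agree, which is consistent with the remark that the existing code only works when the byte is accepted immediately.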
If you like, you could play around with the code and find out if and
when exactly bit 1 (busy) is set, if and when bit 2 (ack) is set, and
if and when any other bit is set. We could also try to read port 0x31e
(the error port). Maybe then we can figure out what the error actually
is. But then I don't really know what we could do with that information.
command byte.
Other than that, the only useful idea I have is something crazy like
if (us < 10000)
udelay(us);
else
mdelay(DIV_ROUND_CLOSEST(us, 1000));
in the hope that clang doesn't convert that back into a
compile-time constant and udelay().
Overall it seems like the apple protocol may expect to receive data
bytes faster than 1ms apart, because that is the only real difference
between the original code and the new code using mdelay().
Yes, that explanation makes sense. If I am trying something like that, only
the last byte requires more than APPLESMC_MIN_WAIT. I have seen max. 256us.
So we could probably even use msleep for us > 1000 and udelay for anything below.
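The split suggested above can be sketched in userspace C to see which steps of the doubling backoff would still busy-wait. The names pick_wait(), busy_steps() and SLEEP_THRESHOLD_US are hypothetical; in the driver the two branches would correspond to udelay() and msleep(), and the 1000 us threshold is simply the value mentioned in the text, not a measured one.

```c
#include <assert.h>

enum wait_kind { WAIT_BUSY, WAIT_SLEEP };

/* Assumed threshold, taken from the "us > 1000" suggestion. */
#define SLEEP_THRESHOLD_US 1000u

/* Decide how a single wait of "us" microseconds would be performed. */
static enum wait_kind pick_wait(unsigned int us)
{
	return us > SLEEP_THRESHOLD_US ? WAIT_SLEEP : WAIT_BUSY;
}

/* Count the steps of min_us, 2 * min_us, ..., max_us that busy-wait. */
static unsigned int busy_steps(unsigned int min_us, unsigned int max_us)
{
	unsigned int us, n = 0;

	/* "us != 0" guards against min_us == 0 and shift overflow. */
	for (us = min_us; us != 0 && us <= max_us; us <<= 1)
		if (pick_wait(us) == WAIT_BUSY)
			n++;
	return n;
}
```

Under these assumptions the steps from 0x10 through 0x200 (16 to 512 us) busy-wait, so a status change that arrives within the observed 256 us never reaches the sleeping path; only the steps from 0x400 upward would sleep.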
Regards,
Andreas
Attachment:
smc_dump_linux.sh
Description: application/shellscript