Re: [PATCH v3] applesmc: Re-work SMC comms

From: Guenter Roeck
Date: Mon Nov 09 2020 - 23:56:23 EST


On Tue, Nov 10, 2020 at 01:04:04PM +1100, Brad Campbell wrote:
> On 9/11/20 3:06 am, Guenter Roeck wrote:
> > On 11/8/20 2:14 AM, Henrik Rydberg wrote:
> >> On Sun, Nov 08, 2020 at 09:35:28AM +0100, Henrik Rydberg wrote:
> >>> Hi Brad,
> >>>
> >>> On 2020-11-08 02:00, Brad Campbell wrote:
> >>>> G'day Henrik,
> >>>>
> >>>> I noticed you'd also loosened up the requirement for SMC_STATUS_BUSY in read_smc(). I assume
> >>>> that causes problems on the early Macbook. This is revised on the one sent earlier.
> >>>> If you could test this on your Air1,1 it'd be appreciated.
> >>>
> >>> No, I managed to screw up the patch; you can see that I carefully added the
> >>> same treatment for the read argument, being unsure if the BUSY state would
> >>> remain during the AVAILABLE data phase. I can check that again, but
> >>> unfortunately the patch in this email shows the same problem.
> >>>
> >>> I think it may be worthwhile to rethink the behavior of wait_status() here.
> >>> If one machine shows no change after a certain status bit change, then
> >>> perhaps the others share that behavior, and we are waiting in vain. Just
> >>> imagine how many years of cpu that is, combined. ;-)
> >>
> >> Here is a modification along that line.
> >>
> >
> > Please resend this patch as stand-alone v4 patch. If sent like it was sent here,
> > it doesn't make it into patchwork, and is thus not only difficult to apply but
> > may get lost, and it is all but impossible to find and apply all tags.
> > Also, prior to Henrik's Signed-off-by: there should be a one-line explanation
> > of the changes made.
> >
> > Thanks,
> > Guenter
> >
> >> Compared to your latest version, this one has wait_status() return the
> >> actual status on success. Instead of waiting for BUSY, it waits for
> >> the other status bits, and checks BUSY afterwards. So as not to wait
> >> unnecessarily, the udelay() is placed together with the single
> >> outb(). The return value of send_byte_data() is augmented with
> >> -EAGAIN, which is then used in send_command() to create the resend
> >> loop.
> >>
> >> I reach 41 reads per second on the MBA1,1 with this version, which is
> >> getting close to the performance prior to the problems.
> >>
>
> Can I get an opinion on this wait statement please?
>
> The Apple driver uses a loop with a million (1,000,000) retries spaced with a 10 us delay.
>
> In my testing on 2 machines, we don't busy wait more than about 2 loops.
> Replacing a small udelay with the usleep_range kills performance.
> With the following (do 10 fast checks before we start sleeping) I nearly triple the performance
> of the driver on my laptop, and double it on my iMac. This is on an otherwise unmodified version of
> Henrik's v4 submission.
>
> Yes, given the timeouts I know it's a ridiculous loop condition.
>
> static int wait_status(u8 val, u8 mask)
> {
> 	unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC;
> 	u8 status;
> 	int i;
>
> 	for (i = 1; i < 1000000; i++) {

The minimum wait time is 10 us, or 16 us after the first 10
attempts. 1000000 * 10 us = 10 seconds. I mean, it would make
some sense to limit the loop to APPLESMC_MAX_WAIT /
APPLESMC_MIN_WAIT iterations, but why 1,000,000?

> 		status = inb(APPLESMC_CMD_PORT);
> 		if ((status & mask) == val)
> 			return status;
> 		/* timeout: give up */
> 		if (time_after(jiffies, end))
> 			break;
> 		if (i < 10)
> 			udelay(10);
> 		else
> 			usleep_range(APPLESMC_MIN_WAIT, APPLESMC_MIN_WAIT * 16);

The original code had the exponential wait time increase.
I don't really see the point of changing that. I'd suggest
keeping the exponential increase but changing the code to
something like

	if (us < APPLESMC_MIN_WAIT * 4)
		udelay(us);
	else
		usleep_range(us, us * 16);

Effectively that means the first wait would be 16 us,
followed by 32 us, followed by increasingly larger sleep
times. I don't know the relevance of APPLESMC_MIN_WAIT
being set to 16, but if you want to start with smaller
wait times you could reduce it to 8. If you are concerned
about excessively large sleep times you could reduce
the span from us..us*16 to, say, us..us*4 or us..us*2.
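
To make the resulting schedule concrete, here is a rough userspace
sketch that only tabulates which delay primitive each step would use.
It is not driver code: plan_backoff(), struct wait_step and the
SKETCH_* constants are hypothetical names for illustration, the
doubling loop from min_us up to (but excluding) max_us is an assumed
loop shape, and the 131072 us upper bound is an assumed value. Only
the starting value of 16 and the "* 4" and "* 16" factors come from
the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* 16 comes from the discussion; the upper bound is an assumption. */
#define SKETCH_MIN_WAIT_US 16u
#define SKETCH_MAX_WAIT_US 131072u

/* One step of the planned schedule (hypothetical helper type). */
struct wait_step {
	unsigned int us;      /* delay requested for this step */
	unsigned int max_us;  /* upper bound a sleep would be given */
	bool busy_wait;       /* true: udelay()-style, false: usleep_range()-style */
};

/*
 * Tabulate the schedule: the delay doubles on every step, short delays
 * (below min_us * 4) are marked as busy waits, and longer ones as
 * sleeps with a range of us..us*16. Returns the number of steps written.
 */
size_t plan_backoff(unsigned int min_us, unsigned int max_us,
		    struct wait_step *steps, size_t cap)
{
	size_t n = 0;
	unsigned int us;

	assert(min_us > 0);

	for (us = min_us; us < max_us && n < cap; us <<= 1) {
		steps[n].us = us;
		steps[n].busy_wait = us < min_us * 4;
		steps[n].max_us = steps[n].busy_wait ? us : us * 16;
		n++;
	}
	return n;
}
```

Calling plan_backoff(SKETCH_MIN_WAIT_US, SKETCH_MAX_WAIT_US, ...) under
these assumptions marks the 16 us and 32 us steps as busy waits and
every step from 64 us onward as a sleep. Narrowing the span to us..us*4
or us..us*2 would only change the max_us computation.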

Thanks,
Guenter

> 	}
> 	return -EIO;
> }
>
> Regards,
> Brad