Re: [2.6.28-rc2] EeePC ACPI errors & exceptions

From: Zhao Yakui
Date: Wed Oct 29 2008 - 21:26:20 EST


On Wed, 2008-10-29 at 17:29 +0800, Alexey Starikovskiy wrote:
> Not a problem, just find the root cause. Or shut up.
Maybe you did not read the explanation I wrote.
>
> Zhao Yakui wrote:
> > On Tue, 2008-10-28 at 13:46 -0700, Alexey Starikovskiy wrote:
> >
> >> Hi Darren,
> >>
> >> Please check if the patch
> >> http://marc.info/?l=linux-acpi&m=122516784917952&w=4
> >> helps.
> >>
> > In the attached patch, msleep is replaced by udelay again.
> > In the following commit, udelay was replaced by msleep:
> > >commit 1b7fc5aae8867046f8d3d45808309d5b7f2e036a
> > >Author: Alexey Starikovskiy <astarikovskiy@xxxxxxx>
> > >Date: Fri Jun 6 11:49:33 2008 -0400
> > >ACPI: EC: Use msleep instead of udelay while waiting for event
> >
> > Now that the problem has happened again, udelay is being restored
> > before the root cause is known.
> > Maybe we should find the root cause of the problem and change the
> > work flow of the EC driver. It is inappropriate to make a change
> > and then revert it whenever the problem shows up again.
> >
> > At the same time, after msleep is replaced by udelay, the CPU will
> > do nothing but loop during an EC transaction on some laptops (in the
> > function ec_poll). If 100 EC transactions are done, the CPU will do
> > nothing but loop for at least 100*2*100 microseconds (20 ms). In that
> > case performance may be affected.
> >
> > After the following commit was merged, the EC transaction is
> > executed in EC GPE interrupt context on most laptops. Maybe that is
> > easier. But on some laptops it cannot be done in EC GPE interrupt
> > context, so it falls back to EC polling mode (implemented by the
> > function ec_poll).
> > >commit 7c6db4e050601f359081fde418ca6dc4fc2d0011
> > >Author: Alexey Starikovskiy <astarikovskiy@xxxxxxx>
> > >Date: Thu Sep 25 21:00:31 2008 +0400
> > >ACPI: EC: do transaction from interrupt context
> >
The following is the detailed explanation of why this issue happens. In
fact, I raised this issue after you sent your patch, but it was
ignored. (AE_TIME may be returned by the EC driver, but not because
the EC controller fails to update its status in time. Instead, it is
because the host has no opportunity to issue the following EC command
sequence.)
>
> > Why is AE_TIME sometimes returned by ec_poll?
> >
> > static int ec_poll(struct acpi_ec *ec)
> > {
> > 	unsigned long delay = jiffies + msecs_to_jiffies(ACPI_EC_DELAY);
> >
> > 	msleep(1);
> > 	/*
> > 	 * The current jiffies may already be past the deadline after
> > 	 * msleep(1). In that case -ETIME is returned and the EC
> > 	 * transaction cannot finish. IMO this is not reasonable, because
> > 	 * the OS never had the opportunity to issue the following EC
> > 	 * command sequence.
> > 	 */
> > 	while (time_before(jiffies, delay)) {
> > 		gpe_transaction(ec, acpi_ec_read_status(ec));
> > 		msleep(1);
> > 		if (ec_transaction_done(ec))
> > 			return 0;
> > 		/*
> > 		 * The following case may also occur: the EC transaction is
> > 		 * not finished after msleep(1), but the current jiffies is
> > 		 * already past the deadline, so -ETIME is returned. IMO
> > 		 * this is also not reasonable.
> > 		 */
> > 	}
> > 	return -ETIME;
> > }
> > Also, msleep is implemented with schedule_timeout. On Linux, a
> > process that is woken up by an event is not necessarily scheduled
> > immediately. So the current jiffies may already be past the
> > timeout deadline when msleep(1) returns.
> > Replacing msleep with udelay reduces the likelihood of this issue,
> > but it may still occur if the task is preempted at the
> > corresponding place.
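
To make the scheduling effect concrete, here is a minimal user-space
sketch (not kernel code). fake_jiffies, fake_msleep, the oversleep
parameter and both poll functions are hypothetical names used only for
illustration; only the shape of poll_deadline_first follows the ec_poll
quoted above. It compares checking the deadline before doing any work
with doing one step of work before looking at the clock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for jiffies and the EC state; not kernel code. */
static unsigned long fake_jiffies;
static int steps_left;	/* host-side transaction steps still to issue */

/* Reset the simulated clock and the number of pending steps. */
static void reset_sim(int steps)
{
	assert(steps >= 0);
	fake_jiffies = 0;
	steps_left = steps;
}

/* Simulated msleep(1): the task is woken 'oversleep' ticks late. */
static void fake_msleep(unsigned long oversleep)
{
	fake_jiffies += 1 + oversleep;
}

/* One host-side step of the transaction. */
static void advance_transaction(void)
{
	if (steps_left > 0)
		steps_left--;
}

static bool transaction_done(void)
{
	return steps_left == 0;
}

/* Same shape as the quoted ec_poll(): the clock is checked before any work. */
static int poll_deadline_first(unsigned long timeout, unsigned long oversleep)
{
	unsigned long deadline = fake_jiffies + timeout;

	fake_msleep(oversleep);
	while (fake_jiffies < deadline) {
		advance_transaction();
		fake_msleep(oversleep);
		if (transaction_done())
			return 0;
	}
	return -1;	/* stands in for -ETIME */
}

/* Variant: issue a step and test completion before looking at the clock. */
static int poll_work_first(unsigned long timeout, unsigned long oversleep)
{
	unsigned long deadline = fake_jiffies + timeout;

	for (;;) {
		advance_transaction();
		if (transaction_done())
			return 0;
		if (fake_jiffies >= deadline)
			return -1;	/* stands in for -ETIME */
		fake_msleep(oversleep);
	}
}
```

With one step pending, a timeout of 5 ticks and a wakeup that is 10
ticks late, poll_deadline_first reports a timeout without issuing a
single step, while poll_work_first completes. This only narrows the
window: a transaction that needs several steps can still time out if
every wakeup is late.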
> >
> > In the above case -ETIME is returned by ec_poll, but not because
> > the EC controller fails to update its status in time.
> > Instead, it is because the host has no opportunity to issue the
> > following operations in the current work flow, in which the whole
> > EC transaction is done in one big loop.
> >
> > Maybe a better solution is to divide the EC transaction explicitly
> > into several distinct phases.
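
One possible reading of "several phases", as a rough user-space sketch
only. The phase names, struct ec_txn, and the ibf_clear/obf_set inputs
are assumptions made for illustration and are not taken from
drivers/acpi/ec.c. Each call advances the transaction by at most one
step, and the time spent is tracked per phase, so a timeout can be
attributed to the phase that actually stalled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical phases; the names are illustrative only. */
enum ec_phase {
	PHASE_SEND_COMMAND,
	PHASE_SEND_DATA,
	PHASE_READ_DATA,
	PHASE_DONE,
};

struct ec_txn {
	enum ec_phase phase;
	size_t wr_left;			/* bytes still to write */
	size_t rd_left;			/* bytes still to read */
	unsigned long phase_start;	/* tick at which this phase began */
};

static void enter_phase(struct ec_txn *t, enum ec_phase next,
			unsigned long now)
{
	t->phase = next;
	t->phase_start = now;
}

/* Phase that follows once all bytes have been written. */
static enum ec_phase phase_after_write(const struct ec_txn *t)
{
	return t->rd_left ? PHASE_READ_DATA : PHASE_DONE;
}

/*
 * Advance by at most one step. ibf_clear: the EC can accept a byte;
 * obf_set: the EC has a byte ready. Returns true once the transaction
 * is complete.
 */
static bool ec_txn_step(struct ec_txn *t, bool ibf_clear, bool obf_set,
			unsigned long now)
{
	switch (t->phase) {
	case PHASE_SEND_COMMAND:
		if (ibf_clear)
			enter_phase(t, t->wr_left ? PHASE_SEND_DATA
						  : phase_after_write(t), now);
		break;
	case PHASE_SEND_DATA:
		if (ibf_clear) {
			assert(t->wr_left > 0);
			if (--t->wr_left == 0)
				enter_phase(t, phase_after_write(t), now);
		}
		break;
	case PHASE_READ_DATA:
		if (obf_set) {
			assert(t->rd_left > 0);
			if (--t->rd_left == 0)
				enter_phase(t, PHASE_DONE, now);
		}
		break;
	case PHASE_DONE:
		break;
	}
	return t->phase == PHASE_DONE;
}

/* True if the current phase has been pending for more than 'limit' ticks. */
static bool ec_txn_phase_timed_out(const struct ec_txn *t, unsigned long now,
				   unsigned long limit)
{
	return t->phase != PHASE_DONE && now - t->phase_start > limit;
}
```

A caller would invoke ec_txn_step each time it gets to run, from either
polling or interrupt context, and check ec_txn_phase_timed_out
afterwards. A late wakeup can still make one phase look slow, but the
clock restarts on every phase change instead of covering the whole
transaction.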
> >
> > Maybe my analysis is not correct. If so, please correct me.
> > Comments are welcome.
> >
> > thanks.
> >
> >
> >
> >
> >> Thanks,
> >> Alex.
> >>
> >>
> >>
> >
> >
>
