Re: [PATCH 1/3] acpi: Rip out EC_FLAGS_QUERY_PENDING (prevent racecondition)
From: Alan Jenkins
Date: Sat Jul 19 2008 - 16:41:31 EST
Alexey Starikovskiy wrote:
> Alan,
Thanks for your robust response. Sorry for not discussing this more in
the first place. I wanted to write the code that fixes my system first,
and post it as soon as possible. I think I missed a flag to say "this
post is not an immediate request for submission".
> I don't think this is very good idea -- now every interrupt will add a
> new entry to workqueue,
> and given the rate of this interrupt, it could be become quite long.
>
I confess I gave up too quickly on the synchronisation issues and
resorted to brute force. Patch #1 is preventing a narrow race:
- acpi_ec_write_cmd() (QUERY)
- EC writes 0 (no event) to input buffer
- new event arrives, triggering a GPE interrupt which is ignored because
QUERY_PENDING is set.
- QUERY_PENDING is cleared (but too late).
Now we ignored an event, which will be delayed until the next GPE interrupt.
The original reason for patch #1 was to stop patch #2 driving me
insane. If you apply the second patch on it's own, you break
EC_FLAGS_QUERY_PENDING. It gets cleared as soon as the _first_ query
command is submitted. It works, but it means the flag no longer has a
clearly defined meaning. I tried to fix that by only clearing it once
the loop has finished - and then realised I was widening a pre-existing
race condition.
As you say, I took the easy way out and pushed it all into the
workqueue. So the workqueue ends up as a buffer of GPE occurrences. I
did look at reducing the memory usage while avoiding race conditions,
but I couldn't find a reasonable solution. But I looked at it again now
and I have a better solution:
...
/* moved here from acpi_ec_transaction_unlocked() */
clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);
while (acpi_ec_query(ec, &value) == 0)
acpi_ec_gpe_run_handler(ec, value);
}
The PENDING flag then reflects whether or not there's an item on the
workqueue. Does that make sense?
> Your second patch is redundant if you add queue entry for each
> interrupt, and as it does not
> require any investments into memory, I like it more.
>
It's not quite redundant with the first patch. We still have GPE
polling mode - patch #1 doesn't affect that. In polling mode, it's
essential to query all the pending events - otherwise, if they arrive
more frequently than the polling interval then you will inevitably drop
events.
Patch #2 is also required to fix my buggy hardware. My laptop's EC
buffers multiple events, but clears SCI_EVT after every query. This
caused problems in polling mode; with multiple events between polling
intervals only one gets queried - and after a while the buffer overflows
and it breaks completely.
> Also, it is very cool to rip of things you don't care for, but that
> essentially reverts a fix done for 9998,
> so at least, you should ask these people if you broke their setups.
>
I assume you're referring to the "would benefit from wider testing"
patch #3. Thanks for identifying the bugzilla entry - I had difficulty
separating the different entries on GPEs. I'm optimistic that we can
fix all these crazy buffering EC's without having to disable GPE interrupts.
The reason I like my proposed fix is that it makes the code simple
enough that I can understand it, without making any assumptions about
how many interrupts arrive per GPE. The EC can be broken in lots of
ways, so long as:
1. We receive interrupts when one or more GPE's are pending.
2. We don't get a constant interrupt storm.
I don't think I missed anything. Is there anything else I should check
before I try to get testing?
Thanks
Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/