Re: [PATCH v4 2/2] ACPI / button: Add document for ACPI control method lid device restrictions

From: Benjamin Tissoires
Date: Fri Jul 22 2016 - 05:08:53 EST


On Fri, Jul 22, 2016 at 10:47 AM, Zheng, Lv <lv.zheng@xxxxxxxxx> wrote:
> Hi,
>
>> From: Benjamin Tissoires [mailto:benjamin.tissoires@xxxxxxxxx]
>> Subject: Re: [PATCH v4 2/2] ACPI / button: Add document for ACPI control
>> method lid device restrictions
>>
>> On Fri, Jul 22, 2016 at 6:37 AM, Dmitry Torokhov
>> <dmitry.torokhov@xxxxxxxxx> wrote:
>> > Hi Lv,
>> >
>> > On Fri, Jul 22, 2016 at 12:24:50AM +0000, Zheng, Lv wrote:
>> >> Hi, Dmitry
>> >>
>> >>
>> >> Thanks for the review.
>> >>
>> >> > From: Dmitry Torokhov [mailto:dmitry.torokhov@xxxxxxxxx]
>> >> > Subject: Re: [PATCH v4 2/2] ACPI / button: Add document for ACPI
>> control
>> >> > method lid device restrictions
>> >> >
>> >> > On Tue, Jul 19, 2016 at 04:11:21PM +0800, Lv Zheng wrote:
>> >> > > This patch adds documentation for the usage model of the control
>> >> > method lid
>> >> > > device.
>> >> > >
>> >> > > Signed-off-by: Lv Zheng <lv.zheng@xxxxxxxxx>
>> >> > > Cc: Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx>
>> >> > > Cc: Benjamin Tissoires <benjamin.tissoires@xxxxxxxxx>
>> >> > > Cc: Bastien Nocera: <hadess@xxxxxxxxxx>
>> >> > > Cc: linux-input@xxxxxxxxxxxxxxx
>> >> > > ---
>> >> > > Documentation/acpi/acpi-lid.txt | 89
>> >> > +++++++++++++++++++++++++++++++++++++++
>> >> > > 1 file changed, 89 insertions(+)
>> >> > > create mode 100644 Documentation/acpi/acpi-lid.txt
>> >> > >
>> >> > > diff --git a/Documentation/acpi/acpi-lid.txt
>> b/Documentation/acpi/acpi-
>> >> > lid.txt
>> >> > > new file mode 100644
>> >> > > index 0000000..2addedc
>> >> > > --- /dev/null
>> >> > > +++ b/Documentation/acpi/acpi-lid.txt
>> >> > > @@ -0,0 +1,89 @@
>> >> > > +Usage Model of the ACPI Control Method Lid Device
>> >> > > +
>> >> > > +Copyright (C) 2016, Intel Corporation
>> >> > > +Author: Lv Zheng <lv.zheng@xxxxxxxxx>
>> >> > > +
>> >> > > +
>> >> > > +Abstract:
>> >> > > +
>> >> > > +Platforms containing lids convey lid state (open/close) to OSPMs
>> using
>> >> > a
>> >> > > +control method lid device. To implement this, the AML tables issue
>> >> > > +Notify(lid_device, 0x80) to notify the OSPMs whenever the lid
>> state has
>> >> > > +changed. The _LID control method for the lid device must be
>> >> > implemented to
>> >> > > +report the "current" state of the lid as either "opened" or "closed".
>> >> > > +
>> >> > > +This document describes the restrictions and the expections of the
>> >> > Linux
>> >> > > +ACPI lid device driver.
>> >> > > +
>> >> > > +
>> >> > > +1. Restrictions of the returning value of the _LID control method
>> >> > > +
>> >> > > +The _LID control method is described to return the "current" lid
>> state.
>> >> > > +However the word of "current" has ambiguity, many AML tables
>> return
>> >> > the lid
>> >> >
>> >> > Can this be fixed in the next ACPI revision?
>> >> [Lv Zheng]
>> >> Even this is fixed in the ACPI specification, there are platforms already
>> doing this.
>> >> Especially platforms from Microsoft.
>> >> So the de-facto standard OS won't care about the change.
>> >> And we can still see such platforms.
>> >>
>> >> Here is an example from Surface 3:
>> >>
>> >> DefinitionBlock ("dsdt.aml", "DSDT", 2, "ALASKA", "A M I ", 0x01072009)
>> >> {
>> >> Scope (_SB)
>> >> {
>> >> Device (PCI0)
>> >> {
>> >> Name (_HID, EisaId ("PNP0A08")) // _HID: Hardware ID
>> >> Name (_CID, EisaId ("PNP0A03")) // _CID: Compatible ID
>> >> Device (SPI1)
>> >> {
>> >> Name (_HID, "8086228E") // _HID: Hardware ID
>> >> Device (NTRG)
>> >> {
>> >> Name (_HID, "MSHW0037") // _HID: Hardware ID
>> >> }
>> >> }
>> >> }
>> >>
>> >> Device (LID)
>> >> {
>> >> Name (_HID, EisaId ("PNP0C0D"))
>> >> Name (LIDB, Zero)
>> >
>> > Start with lid closed? In any case might be wrong.
>>
>> Actually the initial value doesn't matter if the gpiochip triggers the
>> _EC4 at boot, which it should
>> (https://github.com/hadess/fedora-surface3-
>> kernel/commit/13200f81662c1c0b58137947c3e6c000fe62a2ba,
>> still unsubmitted)
>>
>> >
>> >> Method (_LID, 0, NotSerialized)
>> >> {
>> >> Return (LIDB)
>> >
>> > So "_LID" returns the last state read by "_EC4". "_EC4" is
>> > edge-triggered and will be evaluated every time gpio changes state.
>>
>> That's assuming the change happens while the system is on. If you go
>> into suspend by closing the LID. Open it while on suspend and then hit
>> the power button given that the system doesn't wake up when the lid is
>> opened, the edge change was made while the system is asleep, and we
>> are screwed (from an ACPI point of view). See my next comment for a
>> solution.
>>
> [Lv Zheng]
> I actually not sure if polling can fix all issues.
> For example.
> If a platform reporting "close" after resuming.
> Then polling _LID will always return "close".
> And the userspace can still get the "close" not "open".
> So it seems polling is not working for such platforms (cached value, initial close).
> Surface 3 is not the only platform caching an initial close value.
> There are 2 traditional platforms listed in the patch description.
>
>> >
>> >> }
>> >> }
>> >>
>> >> Device (GPO0)
>> >> {
>> >> Name (_HID, "INT33FF") // _HID: Hardware ID
>> >> OperationRegion (GPOR, GeneralPurposeIo, Zero, One)
>> >> Field (GPOR, ByteAcc, NoLock, Preserve)
>> >> {
>> >> Connection (
>> >> GpioIo (Shared, PullNone, 0x0000, 0x0000,
>> IoRestrictionNone,
>> >> "\\_SB.GPO0", 0x00, ResourceConsumer, ,
>> >> )
>> >> { // Pin list
>> >> 0x004C
>> >> }
>> >> ),
>> >> HELD, 1
>> >
>> > Is it possible to read state of this GPIO from userspace on startup to
>> > correct the initial state?
>> >
>> >> }
>> >> Method (_E4C, 0, Serialized)
>> >> {
>> >> If (LEqual(HELD, One))
>> >> {
>> >> Store(One, ^^LID.LIDB)
>> >>
>> >> There is no "open" event generated by "Surface 3".
>> >
>> > Right so we update the state correctly, we just forgot to send the
>> > notification. Nothing that polling can't fix.
>>
>> Actually, I have a better (though more hackish) way of avoiding polling:
>> https://github.com/hadess/fedora-surface3-
>> kernel/blob/5e5775b9bdc308d665064387e0b144ee48e7b243/0002-WIP-
>> add-custom-surface3-platform-device-for-controll.patch
>>
>> Given that the notification is forwarded to the touchscreen anyway, we
>> can unregister the generic (and buggy) acpi button driver for the LID
>> and create our own based on this specific DSDT.
>> We can also make sure the LID state is also correct because of the WMI
>> method which allows to read the actual value of the GPIO connected to
>> the cover without using the cached (and most of the time wrong) acpi
>> LID.LIDB value.
>>
>> I still yet have to submit this, but with this patch, but we can
>> consider the Surface 3 as working and not an issue anymore.
>>
> [Lv Zheng]
> That could make surface 3 dependent on WMI driver, not ACPI button driver.
> Will this affect other buttons?
> For example, power button/sleep button.

TLDR: no, there is no impact on other buttons.

There are 2 reasons why the impact is limited:
- the patch only removes the input node that contains the LID, and it
is the only one event in the input node
- power/sleep, volume +/- are not handled by ACPI as this is a reduced
platform and these buttons are not notified by ACPI. So we need an
adaptation of the GPIO button array for it to be working (patch
already submitted but I found a non-acpi platform issue, and then not
enough time to fix and send an updated version).

>
> Our approach is to make ACPI button driver working.
> Though this may lead to ABI changes.

Yes, I know you want to fix ACPI button for future non working
tablets/laptops. This is why I gave my rev-by in this series.

>
>> >
>> >>
>> >> }
>> >> Else
>> >> {
>> >> Store(Zero, ^^LID.LIDB)
>> >> Notify (LID, 0x80)
>> >>
>> >> There is only "close" event generated by "Surface 3".
>> >> Thus they are not paired.
>> >>
>> >> }
>> >> Notify (^^PCI0.SPI1.NTRG, One) // Device Check
>> >> }
>> >> }
>> >> }
>> >> }
>> >>
>> >> >
>> >> > > +state upon the last lid notification instead of returning the lid state
>> >> > > +upon the last _LID evaluation. There won't be difference when the
>> _LID
>> >> > > +control method is evaluated during the runtime, the problem is its
>> >> > initial
>> >> > > +returning value. When the AML tables implement this control
>> method
>> >> > with
>> >> > > +cached value, the initial returning value is likely not reliable. There
>> are
>> >> > > +simply so many examples always retuning "closed" as initial lid
>> state.
>> >> > > +
>> >> > > +2. Restrictions of the lid state change notifications
>> >> > > +
>> >> > > +There are many AML tables never notifying when the lid device
>> state is
>> >> > > +changed to "opened". Thus the "opened" notification is not
>> guaranteed.
>> >> > > +
>> >> > > +But it is guaranteed that the AML tables always notify "closed"
>> when
>> >> > the
>> >> > > +lid state is changed to "closed". The "closed" notification is
>> normally
>> >> > > +used to trigger some system power saving operations on Windows.
>> >> > Since it is
>> >> > > +fully tested, the "closed" notification is reliable for all AML tables.
>> >> > > +
>> >> > > +3. Expections for the userspace users of the ACPI lid device driver
>> >> > > +
>> >> > > +The ACPI button driver exports the lid state to the userspace via
>> the
>> >> > > +following file:
>> >> > > + /proc/acpi/button/lid/LID0/state
>> >> > > +This file actually calls the _LID control method described above.
>> And
>> >> > given
>> >> > > +the previous explanation, it is not reliable enough on some
>> platforms.
>> >> > So
>> >> > > +it is advised for the userspace program to not to solely rely on this
>> file
>> >> > > +to determine the actual lid state.
>> >> > > +
>> >> > > +The ACPI button driver emits 2 kinds of events to the user space:
>> >> > > + SW_LID
>> >> > > + When the lid state/event is reliable, the userspace can behave
>> >> > > + according to this input switch event.
>> >> > > + This is a mode prepared for backward compatiblity.
>> >> > > + KEY_EVENT_OPEN/KEY_EVENT_CLOSE
>> >> > > + When the lid state/event is not reliable, the userspace should
>> behave
>> >> > > + according to these 2 input key events.
>> >> > > + New userspace programs may only be prepared for the input key
>> >> > events.
>> >> >
>> >> > No, absolutely not. If some x86 vendors managed to mess up their
>> >> > firmware implementations that does not mean that everyone now
>> has to
>> >> > abandon working perfectly well for them SW_LID events and rush to
>> >> > switch
>> >> > to a brand new event.
>> >> [Lv Zheng]
>> >> However there is no clear wording in the ACPI specification asking the
>> vendors to achieve paired lid events.
>> >>
>> >> >
>> >> > Apparently were are a few issues, main is that some systems not
>> reporting
>> >> > "open" event. This can be dealt with by userspace "writing" to the
>> >> > lid's evdev device EV_SW/SW_LID/0 event upon system resume (and
>> >> > startup)
>> >> > for selected systems. This will mean that if system wakes up not
>> because
>> >> > LID is open we'll incorrectly assume that it is, but we can either add
>> >> > more smarts to the process emitting SW_LID event or simply say
>> "well,
>> >> > tough, the hardware is crappy" and bug vendor to see if they can fix
>> the
>> >> > issue (if not for current firmware them for next).
>> >> [Lv Zheng]
>> >> The problem is there is no vendor actually caring about fixing this
>> "issue".
>> >> Because Windows works well with their firmware.
>> >> Then finally becomes a big table customization business for our team.
>> >
>> > Well, OK. But you do not expect that we will redo up and down the stack
>> > lid handling just because MS messed up DSDT on Surface 3? No, let them
>> > know (they now care about Linux, right?) so Surface 4 works and quirk
>> > the behavior for Surface 3.
>> >
>>
>> From what I understood, it was more than just the Surface 3. Other
>> laptops were having issues and Lv's team gave up on fixing those
>> machines.
>>
>> >>
>> >> >
>> >> > As an additional workaround, we can toggle the LID switch off and on
>> >> > when we get notification, much like your proposed patch does for the
>> key
>> >> > events.
>>
>> I really don't like this approach. The problem being that we will fix
>> the notifications to user space, but nothing will tell userspace that
>> the LID state is known to be wrong.
>> OTOH, I already agreed for a hwdb in userspace so I guess this point is
>> moot.
>>
>> Having both events (one SW for reliable HW, always correct, and one
>> KEY for unreliable HW) allows userspace to make a clear distinction
>> between the working and non working events and they can continue to
>> keep using the polling of the SW node without extra addition.
>>
> [Lv Zheng]
> I think this solution is good and fair for all of the vendors. :-)
>
>> Anyway, if the kernel doesn't want to (or can't) fix the actual issue
>> (by making sure the DSDT is reliable), userspace needs to be changed
>> so any solution will be acceptable.
> [Lv Zheng]
> I think the answer is "can't".
> If we introduced too many workarounds into acpi button driver,
> in order to make something working while the platform firmware doesn't expect it to be working,
> then we'll start to worry about breaking good laptops.

Then you just need to amend the documentation to say that the fallback
of the KEY events is not the "future" but a way to get events on some
reduced platforms and it will not be the default.
Please make sure userspace knows that the default is the good SW_LID,
and some particular cases will need to be handled through the KEY
events, not the other way around.

[few thoughts later]

How about:
- you send only one patch with the SW_LID ON/OFF or OFF/ON when we
receive the notification on buggy platform
- in the same patch, you add the documentation saying that on most
platforms, LID is reliable but some don't provide a reliable LID
state, but you guarantee to send an event when the state changes
- in userspace, we add the hwdb which says "on this particular
platform, don't rely on the actual state, but wait for events" -> this
basically removes the polling on these platforms.

Bastien, Dmitry?

I still don't like relying on userspace to actually set the SW_LID
back to open on resume, as we should not rely on some userspace
program to set the value (but if logind really wants it, it's up to
them).

Cheers,
Benjamin

>
>>
>> >> [Lv Zheng]
>> >> I think this is doable, I'll refresh my patchset to address your this
>> comment.
>> >> By inserting open/close events when next close/open event arrives
>> after a certain period,
>> >> this may fix some issues for the old programs.
>> >> Where user may be required to open/close lid twice to trigger 2nd
>> suspend.
>> >>
>> >> However, this still cannot fix the problems like "Surface 3".
>> >> We'll still need a new usage model for such platforms (no open event).
>> >
>> > No, for surface 3 you simply need to add polling of "_LID" method to the
>> > button driver.
>> >
>> > What are the other devices that mess up lid handling?
>> >
>>
>> I also would be interested in knowing how much issues you are facing
>> compared to the average number of "good" laptops. IIRC, you talked
>> about 3 (counting the Surface 3), but I believe you had more in mind.
>
> [Lv Zheng]
> Yes.
> However they happened before I started to look at the lid issues.
> I think Rui has several such experiences.
> +Rui.
>
> Thanks and best regards
> -Lv