RE: [PATCH v4 2/2] ACPI / button: Add document for ACPI control method lid device restrictions

From: Zheng, Lv
Date: Fri Jul 22 2016 - 05:39:22 EST


Hi, Benjamin

> From: Benjamin Tissoires [mailto:benjamin.tissoires@xxxxxxxxx]
> Subject: Re: [PATCH v4 2/2] ACPI / button: Add document for ACPI control
> method lid device restrictions
>
> On Fri, Jul 22, 2016 at 10:47 AM, Zheng, Lv <lv.zheng@xxxxxxxxx> wrote:
> > Hi,
> >
> >> From: Benjamin Tissoires [mailto:benjamin.tissoires@xxxxxxxxx]
> >> Subject: Re: [PATCH v4 2/2] ACPI / button: Add document for ACPI
> control
> >> method lid device restrictions
> >>
> >> On Fri, Jul 22, 2016 at 6:37 AM, Dmitry Torokhov
> >> <dmitry.torokhov@xxxxxxxxx> wrote:
> >> > Hi Lv,
> >> >
> >> > On Fri, Jul 22, 2016 at 12:24:50AM +0000, Zheng, Lv wrote:
> >> >> Hi, Dmitry
> >> >>
> >> >>
> >> >> Thanks for the review.
> >> >>
> >> >> > From: Dmitry Torokhov [mailto:dmitry.torokhov@xxxxxxxxx]
> >> >> > Subject: Re: [PATCH v4 2/2] ACPI / button: Add document for ACPI
> >> control
> >> >> > method lid device restrictions
> >> >> >
> >> >> > On Tue, Jul 19, 2016 at 04:11:21PM +0800, Lv Zheng wrote:
> >> >> > > This patch adds documentation for the usage model of the
> control
> >> >> > method lid
> >> >> > > device.
> >> >> > >
> >> >> > > Signed-off-by: Lv Zheng <lv.zheng@xxxxxxxxx>
> >> >> > > Cc: Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx>
> >> >> > > Cc: Benjamin Tissoires <benjamin.tissoires@xxxxxxxxx>
> >> >> > > Cc: Bastien Nocera: <hadess@xxxxxxxxxx>
> >> >> > > Cc: linux-input@xxxxxxxxxxxxxxx
> >> >> > > ---
> >> >> > > Documentation/acpi/acpi-lid.txt | 89
> >> >> > +++++++++++++++++++++++++++++++++++++++
> >> >> > > 1 file changed, 89 insertions(+)
> >> >> > > create mode 100644 Documentation/acpi/acpi-lid.txt
> >> >> > >
> >> >> > > diff --git a/Documentation/acpi/acpi-lid.txt
> >> b/Documentation/acpi/acpi-
> >> >> > lid.txt
> >> >> > > new file mode 100644
> >> >> > > index 0000000..2addedc
> >> >> > > --- /dev/null
> >> >> > > +++ b/Documentation/acpi/acpi-lid.txt
> >> >> > > @@ -0,0 +1,89 @@
> >> >> > > +Usage Model of the ACPI Control Method Lid Device
> >> >> > > +
> >> >> > > +Copyright (C) 2016, Intel Corporation
> >> >> > > +Author: Lv Zheng <lv.zheng@xxxxxxxxx>
> >> >> > > +
> >> >> > > +
> >> >> > > +Abstract:
> >> >> > > +
> >> >> > > +Platforms containing lids convey lid state (open/close) to
> OSPMs
> >> using
> >> >> > a
> >> >> > > +control method lid device. To implement this, the AML tables
> issue
> >> >> > > +Notify(lid_device, 0x80) to notify the OSPMs whenever the lid
> >> state has
> >> >> > > +changed. The _LID control method for the lid device must be
> >> >> > implemented to
> >> >> > > +report the "current" state of the lid as either "opened" or
> "closed".
> >> >> > > +
> >> >> > > +This document describes the restrictions and the expections of
> the
> >> >> > Linux
> >> >> > > +ACPI lid device driver.
> >> >> > > +
> >> >> > > +
> >> >> > > +1. Restrictions of the returning value of the _LID control
> method
> >> >> > > +
> >> >> > > +The _LID control method is described to return the "current" lid
> >> state.
> >> >> > > +However the word of "current" has ambiguity, many AML
> tables
> >> return
> >> >> > the lid
> >> >> >
> >> >> > Can this be fixed in the next ACPI revision?
> >> >> [Lv Zheng]
> >> >> Even this is fixed in the ACPI specification, there are platforms
> already
> >> doing this.
> >> >> Especially platforms from Microsoft.
> >> >> So the de-facto standard OS won't care about the change.
> >> >> And we can still see such platforms.
> >> >>
> >> >> Here is an example from Surface 3:
> >> >>
> >> >> DefinitionBlock ("dsdt.aml", "DSDT", 2, "ALASKA", "A M I ",
> 0x01072009)
> >> >> {
> >> >> Scope (_SB)
> >> >> {
> >> >> Device (PCI0)
> >> >> {
> >> >> Name (_HID, EisaId ("PNP0A08")) // _HID: Hardware ID
> >> >> Name (_CID, EisaId ("PNP0A03")) // _CID: Compatible ID
> >> >> Device (SPI1)
> >> >> {
> >> >> Name (_HID, "8086228E") // _HID: Hardware ID
> >> >> Device (NTRG)
> >> >> {
> >> >> Name (_HID, "MSHW0037") // _HID: Hardware ID
> >> >> }
> >> >> }
> >> >> }
> >> >>
> >> >> Device (LID)
> >> >> {
> >> >> Name (_HID, EisaId ("PNP0C0D"))
> >> >> Name (LIDB, Zero)
> >> >
> >> > Start with lid closed? In any case might be wrong.
> >>
> >> Actually the initial value doesn't matter if the gpiochip triggers the
> >> _EC4 at boot, which it should
> >> (https://github.com/hadess/fedora-surface3-
> >> kernel/commit/13200f81662c1c0b58137947c3e6c000fe62a2ba,
> >> still unsubmitted)
> >>
> >> >
> >> >> Method (_LID, 0, NotSerialized)
> >> >> {
> >> >> Return (LIDB)
> >> >
> >> > So "_LID" returns the last state read by "_EC4". "_EC4" is
> >> > edge-triggered and will be evaluated every time gpio changes state.
> >>
> >> That's assuming the change happens while the system is on. If you go
> >> into suspend by closing the LID. Open it while on suspend and then hit
> >> the power button given that the system doesn't wake up when the lid
> is
> >> opened, the edge change was made while the system is asleep, and we
> >> are screwed (from an ACPI point of view). See my next comment for a
> >> solution.
> >>
> > [Lv Zheng]
> > I actually not sure if polling can fix all issues.
> > For example.
> > If a platform reporting "close" after resuming.
> > Then polling _LID will always return "close".
> > And the userspace can still get the "close" not "open".
> > So it seems polling is not working for such platforms (cached value,
> initial close).
> > Surface 3 is not the only platform caching an initial close value.
> > There are 2 traditional platforms listed in the patch description.
> >
> >> >
> >> >> }
> >> >> }
> >> >>
> >> >> Device (GPO0)
> >> >> {
> >> >> Name (_HID, "INT33FF") // _HID: Hardware ID
> >> >> OperationRegion (GPOR, GeneralPurposeIo, Zero, One)
> >> >> Field (GPOR, ByteAcc, NoLock, Preserve)
> >> >> {
> >> >> Connection (
> >> >> GpioIo (Shared, PullNone, 0x0000, 0x0000,
> >> IoRestrictionNone,
> >> >> "\\_SB.GPO0", 0x00, ResourceConsumer, ,
> >> >> )
> >> >> { // Pin list
> >> >> 0x004C
> >> >> }
> >> >> ),
> >> >> HELD, 1
> >> >
> >> > Is it possible to read state of this GPIO from userspace on startup to
> >> > correct the initial state?
> >> >
> >> >> }
> >> >> Method (_E4C, 0, Serialized)
> >> >> {
> >> >> If (LEqual(HELD, One))
> >> >> {
> >> >> Store(One, ^^LID.LIDB)
> >> >>
> >> >> There is no "open" event generated by "Surface 3".
> >> >
> >> > Right so we update the state correctly, we just forgot to send the
> >> > notification. Nothing that polling can't fix.
> >>
> >> Actually, I have a better (though more hackish) way of avoiding polling:
> >> https://github.com/hadess/fedora-surface3-
> >> kernel/blob/5e5775b9bdc308d665064387e0b144ee48e7b243/0002-
> WIP-
> >> add-custom-surface3-platform-device-for-controll.patch
> >>
> >> Given that the notification is forwarded to the touchscreen anyway, we
> >> can unregister the generic (and buggy) acpi button driver for the LID
> >> and create our own based on this specific DSDT.
> >> We can also make sure the LID state is also correct because of the WMI
> >> method which allows to read the actual value of the GPIO connected to
> >> the cover without using the cached (and most of the time wrong) acpi
> >> LID.LIDB value.
> >>
> >> I still yet have to submit this, but with this patch, but we can
> >> consider the Surface 3 as working and not an issue anymore.
> >>
> > [Lv Zheng]
> > That could make surface 3 dependent on WMI driver, not ACPI button
> driver.
> > Will this affect other buttons?
> > For example, power button/sleep button.
>
> TLDR: no, there is no impact on other buttons.
>
> There are 2 reasons why the impact is limited:
> - the patch only removes the input node that contains the LID, and it
> is the only one event in the input node
> - power/sleep, volume +/- are not handled by ACPI as this is a reduced
> platform and these buttons are not notified by ACPI. So we need an
> adaptation of the GPIO button array for it to be working (patch
> already submitted but I found a non-acpi platform issue, and then not
> enough time to fix and send an updated version).
>
> >
> > Our approach is to make ACPI button driver working.
> > Though this may lead to ABI changes.
>
> Yes, I know you want to fix ACPI button for future non working
> tablets/laptops. This is why I gave my rev-by in this series.
>
> >
> >> >
> >> >>
> >> >> }
> >> >> Else
> >> >> {
> >> >> Store(Zero, ^^LID.LIDB)
> >> >> Notify (LID, 0x80)
> >> >>
> >> >> There is only "close" event generated by "Surface 3".
> >> >> Thus they are not paired.
> >> >>
> >> >> }
> >> >> Notify (^^PCI0.SPI1.NTRG, One) // Device Check
> >> >> }
> >> >> }
> >> >> }
> >> >> }
> >> >>
> >> >> >
> >> >> > > +state upon the last lid notification instead of returning the lid
> state
> >> >> > > +upon the last _LID evaluation. There won't be difference when
> the
> >> _LID
> >> >> > > +control method is evaluated during the runtime, the problem is
> its
> >> >> > initial
> >> >> > > +returning value. When the AML tables implement this control
> >> method
> >> >> > with
> >> >> > > +cached value, the initial returning value is likely not reliable.
> There
> >> are
> >> >> > > +simply so many examples always retuning "closed" as initial lid
> >> state.
> >> >> > > +
> >> >> > > +2. Restrictions of the lid state change notifications
> >> >> > > +
> >> >> > > +There are many AML tables never notifying when the lid device
> >> state is
> >> >> > > +changed to "opened". Thus the "opened" notification is not
> >> guaranteed.
> >> >> > > +
> >> >> > > +But it is guaranteed that the AML tables always notify "closed"
> >> when
> >> >> > the
> >> >> > > +lid state is changed to "closed". The "closed" notification is
> >> normally
> >> >> > > +used to trigger some system power saving operations on
> Windows.
> >> >> > Since it is
> >> >> > > +fully tested, the "closed" notification is reliable for all AML
> tables.
> >> >> > > +
> >> >> > > +3. Expections for the userspace users of the ACPI lid device
> driver
> >> >> > > +
> >> >> > > +The ACPI button driver exports the lid state to the userspace
> via
> >> the
> >> >> > > +following file:
> >> >> > > + /proc/acpi/button/lid/LID0/state
> >> >> > > +This file actually calls the _LID control method described above.
> >> And
> >> >> > given
> >> >> > > +the previous explanation, it is not reliable enough on some
> >> platforms.
> >> >> > So
> >> >> > > +it is advised for the userspace program to not to solely rely on
> this
> >> file
> >> >> > > +to determine the actual lid state.
> >> >> > > +
> >> >> > > +The ACPI button driver emits 2 kinds of events to the user
> space:
> >> >> > > + SW_LID
> >> >> > > + When the lid state/event is reliable, the userspace can behave
> >> >> > > + according to this input switch event.
> >> >> > > + This is a mode prepared for backward compatiblity.
> >> >> > > + KEY_EVENT_OPEN/KEY_EVENT_CLOSE
> >> >> > > + When the lid state/event is not reliable, the userspace should
> >> behave
> >> >> > > + according to these 2 input key events.
> >> >> > > + New userspace programs may only be prepared for the input
> key
> >> >> > events.
> >> >> >
> >> >> > No, absolutely not. If some x86 vendors managed to mess up their
> >> >> > firmware implementations that does not mean that everyone now
> >> has to
> >> >> > abandon working perfectly well for them SW_LID events and rush
> to
> >> >> > switch
> >> >> > to a brand new event.
> >> >> [Lv Zheng]
> >> >> However there is no clear wording in the ACPI specification asking
> the
> >> vendors to achieve paired lid events.
> >> >>
> >> >> >
> >> >> > Apparently were are a few issues, main is that some systems not
> >> reporting
> >> >> > "open" event. This can be dealt with by userspace "writing" to the
> >> >> > lid's evdev device EV_SW/SW_LID/0 event upon system resume
> (and
> >> >> > startup)
> >> >> > for selected systems. This will mean that if system wakes up not
> >> because
> >> >> > LID is open we'll incorrectly assume that it is, but we can either
> add
> >> >> > more smarts to the process emitting SW_LID event or simply say
> >> "well,
> >> >> > tough, the hardware is crappy" and bug vendor to see if they can
> fix
> >> the
> >> >> > issue (if not for current firmware them for next).
> >> >> [Lv Zheng]
> >> >> The problem is there is no vendor actually caring about fixing this
> >> "issue".
> >> >> Because Windows works well with their firmware.
> >> >> Then finally becomes a big table customization business for our
> team.
> >> >
> >> > Well, OK. But you do not expect that we will redo up and down the
> stack
> >> > lid handling just because MS messed up DSDT on Surface 3? No, let
> them
> >> > know (they now care about Linux, right?) so Surface 4 works and
> quirk
> >> > the behavior for Surface 3.
> >> >
> >>
> >> From what I understood, it was more than just the Surface 3. Other
> >> laptops were having issues and Lv's team gave up on fixing those
> >> machines.
> >>
> >> >>
> >> >> >
> >> >> > As an additional workaround, we can toggle the LID switch off and
> on
> >> >> > when we get notification, much like your proposed patch does for
> the
> >> key
> >> >> > events.
> >>
> >> I really don't like this approach. The problem being that we will fix
> >> the notifications to user space, but nothing will tell userspace that
> >> the LID state is known to be wrong.
> >> OTOH, I already agreed for a hwdb in userspace so I guess this point is
> >> moot.
> >>
> >> Having both events (one SW for reliable HW, always correct, and one
> >> KEY for unreliable HW) allows userspace to make a clear distinction
> >> between the working and non working events and they can continue to
> >> keep using the polling of the SW node without extra addition.
> >>
> > [Lv Zheng]
> > I think this solution is good and fair for all of the vendors. :-)
> >
> >> Anyway, if the kernel doesn't want to (or can't) fix the actual issue
> >> (by making sure the DSDT is reliable), userspace needs to be changed
> >> so any solution will be acceptable.
> > [Lv Zheng]
> > I think the answer is "can't".
> > If we introduced too many workarounds into acpi button driver,
> > in order to make something working while the platform firmware
> doesn't expect it to be working,
> > then we'll start to worry about breaking good laptops.
>
> Then you just need to amend the documentation to say that the fallback
> of the KEY events is not the "future" but a way to get events on some
> reduced platforms and it will not be the default.
[Lv Zheng]
OK.

> Please make sure userspace knows that the default is the good SW_LID,
> and some particular cases will need to be handled through the KEY
> events, not the other way around.
[Lv Zheng]
However, we were thinking that user space should just switch to use the key events when the lid events are from ACPI button driver.
So you mean I need to change this to say that the key events should only be used for special hardware.
Right?

>
> [few thoughts later]
>
> How about:
> - you send only one patch with the SW_LID ON/OFF or OFF/ON when we
> receive the notification on buggy platform
> - in the same patch, you add the documentation saying that on most
> platforms, LID is reliable but some don't provide a reliable LID
> state, but you guarantee to send an event when the state changes

[Lv Zheng]
If I understand correctly, you mean I should improve the documentation.
Making the SW_LID the expected Linux behavior.
But allowing KEY_LID_XXX as a quirk mechanism to handle old platforms.

If so, I think I only need to refresh the document.
Right?

Cheers,
Lv

> - in userspace, we add the hwdb which says "on this particular
> platform, don't rely on the actual state, but wait for events" -> this
> basically removes the polling on these platforms.
>
> Bastien, Dmitry?
>
> I still don't like relying on userspace to actually set the SW_LID
> back to open on resume, as we should not rely on some userspace
> program to set the value (but if logind really wants it, it's up to
> them).
>
> Cheers,
> Benjamin
>
> >
> >>
> >> >> [Lv Zheng]
> >> >> I think this is doable, I'll refresh my patchset to address your this
> >> comment.
> >> >> By inserting open/close events when next close/open event arrives
> >> after a certain period,
> >> >> this may fix some issues for the old programs.
> >> >> Where user may be required to open/close lid twice to trigger 2nd
> >> suspend.
> >> >>
> >> >> However, this still cannot fix the problems like "Surface 3".
> >> >> We'll still need a new usage model for such platforms (no open
> event).
> >> >
> >> > No, for surface 3 you simply need to add polling of "_LID" method to
> the
> >> > button driver.
> >> >
> >> > What are the other devices that mess up lid handling?
> >> >
> >>
> >> I also would be interested in knowing how much issues you are facing
> >> compared to the average number of "good" laptops. IIRC, you talked
> >> about 3 (counting the Surface 3), but I believe you had more in mind.
> >
> > [Lv Zheng]
> > Yes.
> > However they happened before I started to look at the lid issues.
> > I think Rui has several such experiences.
> > +Rui.
> >
> > Thanks and best regards
> > -Lv