RE: [PATCH] ACPI / button: Avoid using broken _LID on Surface tablet

From: Zheng, Lv
Date: Thu Mar 03 2016 - 22:37:23 EST


Hi,

> From: Bastien Nocera [mailto:hadess@xxxxxxxxxx]
> Subject: Re: [PATCH] ACPI / button: Avoid using broken _LID on Surface tablet
>
> On Mon, 2016-02-22 at 07:24 +0000, Zheng, Lv wrote:
> > [Lv Zheng]
> > According to the knowledge of various BIOS _LID implementation, Linux
> > button driver doesn't seem to be working wrong.
> > It in fact works correctly and is able to handle all BIOS _LID
> > implementations.
> >
> > So this looks to me like a user space bug.
>
> I don't think it is one, given the kernel API for that functionality.
> If, instead of exposing the lid as an input switch, the kernel only
> ever sent an event saying "the lid is now closed" then we wouldn't have
> that problem, as we'd likely expect the lid to be opened when starting
> the machine (if present).
[Lv Zheng]
Let me show you several proofs of Windows behaviors.

1. Have you ever seen a screen on Windows that can tell you the status of LID.
If you have seen such screen, what the status is when the system is booted.
I have never seen such a UI, so Windows in fact cares no current LID status.

2. Have you ever seen a screen on Windows that allow users to specify open LID actions?
The answer is no.
On traditional x86 platforms, system is woken up via wake GPEs.
So the open event in fact is handled by BIOS and Windows cares nothing about it.

3. Have you ever seen a screen on Windows that allow users to specify close LID actions?
Yes, I have, it is in power options.
So Windows in fact only cares about the LID close notification and will only check LID status when it receives this notification.

The above facts can make clear about what BIOS will implement for the ACPI LID.

And from this point of view, in kernel ACPI LID driver, only this commit is wrong:
commit 23de5d9ef2a4bbc4f733f58311bcb7cf6239c813
Author: Alexey Starikovskiy <astarikovskiy@xxxxxxx>
Subject: ACPI: button: send initial lid state after add and resume
This seems to be a false root caused bug fix:
https://bugzilla.novell.com/show_bug.cgi?id=326814
I guess the original bug was in the EC driver or somewhere else.

But to our surprises, after reverting this commit.
We can see:
1. Kernel doesn't trigger any event to the userspace.
2. But we still can see that Surface is automatically suspended every several seconds.

According to the Bugzilla investigation, the suspend was triggered by systemd.
I thus just concluded that this could be a userspace bug.
But maybe this is a systemd behavior triggered by some kernel events that are not captured by our debugging patches.
Let the reporter and the developer do more investigation there.

I just raised my concern about this fix.
The fix brought by this patch is apparently wrong according to the above conclusions.

>
> > That the user space doesn't contain a proper mode to handle ACPI BIOS
> > _LID implementations.
> >
> > Why should kernel work this gap around?
> > The gap is:
> > The user space requires lid device to act in the way it wants.
>
> In the way that it's always worked up until now, rather.
[Lv Zheng]
This is not persuasive.
Things can appear to be working with workarounds applied.
During my career, I have seen plenty of such cases.
And Iâve proven that there are many such kind of wrongly root caused fixes in ACPI subsystem in this community during the past years.

>
> > While the ACPI BIOS only provides lid behavior that is working for
> > Windows power-saving settings.
> > IMO,
> > 1. Windows Only requires LID close notifications to trigger power-
> > save action and only evaluates _LID after receiving a LID
> > notification,
> > ÂÂÂ BIOSen only ensure _LID returning value is correct after a
> > notification.
>
> In which case, expecting the lid to always be opened on startup would
> probably be a fair assumption, no?
>
[Lv Zheng]
The answer should be no here.

If we make system default LID status as open on boot/resume, then what about this case?

According to our tests, we can reboot/resume surface with LID closed into a running Windows.
So if we force the LID status to be opened after the reboot/resume, then this also seems to be a conflict against the real status.
What the benefit to force such a wrong status, given that we are all not obsessives...

> > 2. But Linux user space requires all LID notifications to be
> > instantly/correctly reported and wants to know the exact LID states
> > whenever it reads the states from the sysfs.
> > It doesn't seem to be possible for the kernel to work between BIOSen
> > and the user space to fill such a gap.
>
> If you quirk the kernel lid handling to always be opened on startup,
> and we waited until the first event to correct it if necessary, seems
> like the easiest way to go.
>
> But that brings me the question of how Windows (and then Linux) behave
> when you've booted your laptop and closed the lid straight away, before
> any driver in the OS had the chance to be around to see the
> notification?
>
[Lv Zheng]
If we booted the laptop with LID opened, and close it right before the Windows is fully booted.
We can see the laptop booted into a running Windows.

> It also brings the question of how Windows will know that the lid is
> still closed when coming out of suspend by pressing the power button,
> which can happen depending on the hardware design (it's certainly
> possible to press the power button with the lid closed on the Surface
> 3, and likely other Surfaces).
>
[Lv Zheng]
If we booted/resumed the laptop with LID closed by pressing the power button.
We can see the laptop booted/resumed into a running Windows.

> I'm not happy about the "fix" for this problem either, but blaming
> user-space for this is harsh, given the API exported by the kernel and
> how pretty much every laptop worked.
>
[Lv Zheng]
It worked for so many years on top of traditional x86 laptops where the LID open event is handled by the BIOS to trigger a wake GPE.
Now we are talking about runtime idle systems where resuming from the s2idle invokes no BIOS code.
The new case is still working on Windows.
Which forces us to re-consider the kernel API of the ACPI LID.

Cheers
-Lv