RE: [RFC PATCH 1/2] ACPI / button: Send "open" state after boot/resume

From: Zheng, Lv
Date: Tue May 17 2016 - 21:26:00 EST


Hi, Rafael

Thanks for the review.

> From: rjwysocki@xxxxxxxxx [mailto:rjwysocki@xxxxxxxxx] On Behalf Of
> Rafael J. Wysocki
> Sent: Wednesday, May 18, 2016 7:37 AM
> To: Zheng, Lv <lv.zheng@xxxxxxxxx>
> Cc: Wysocki, Rafael J <rafael.j.wysocki@xxxxxxxxx>; Rafael J. Wysocki
> <rjw@xxxxxxxxxxxxx>; Brown, Len <len.brown@xxxxxxxxx>; Lv Zheng
> <zetalog@xxxxxxxxx>; Linux Kernel Mailing List <linux-
> kernel@xxxxxxxxxxxxxxx>; ACPI Devel Maling List <linux-acpi@xxxxxxxxxxxxxxx>;
> Bastien Nocera: <hadess@xxxxxxxxxx>
> Subject: Re: [RFC PATCH 1/2] ACPI / button: Send "open" state after
> boot/resume
>
> On Tue, May 17, 2016 at 10:27 AM, Lv Zheng <lv.zheng@xxxxxxxxx> wrote:
> > (This patch hasn't been tested, it's sent for comments)
>
> I have a couple of concerns (see below).
>
> > Linux userspace (systemd-logind) keeps on rechecking lid state when the
> > lid state is closed. If it failed to update the lid state to open after
> > boot/resume, it could suspend the system. But as:
> > 1. Traditional ACPI platform may not generate the lid open event after
> > resuming as the open event is actually handled by the BIOS and the system
> > is then resumed from a FACS vector.
> > 2. The _LID control method's initial returning value is not reliable. The
> > _LID control method is described to return the "current" lid state,
> > however the word of "current" has ambiguity, many BIOSen return lid
> > state upon last lid notification while the developers may think the
> > BIOSen should always return the lid state upon last _LID evaluation.
> > There won't be difference when we evaluate _LID during the runtime, the
> > problem is the initial returning value of this function. When the BIOSen
> > implement this control method with cached value, the initial returning
> > value is likely not reliable.
> > Thus there is no mean for the ACPI lid driver to provide such an event
> > conveying correct current lid state. When there is no such an event or the
> > event conveys wrong result, false suspending can be examined.
> >
> > The root cause of the issue is systemd itself, it could handle the ACPI
> > control method lid device by implementing a special option like
> > LidSwitchLevelTriggered=False when it detected the ACPI lid device. However
> > there is no explicit documentation clarified the ambiguity, we need to
> > work it around in the kernel before systemd changing its mind.
>
> The above doesn't explain how the issue is addressed here.
[Lv Zheng]
The story is a bit long.
We can see several issues that some platform suspends right after boot/resume.
We noticed that on that platforms, _LID is always implemented with cached lid state returned.
And it's initial returning value may be "closed" after boot/resume.

It appears the acpi_lid_send_state() sent after boot/resume is the culprit to report the wrong lid state to the userspace.
But to our surprise, after delete the 2 lines, reporters still can see suspends after boot/resume.
That's because of systemd implementation.
It contains code logic that:
When the lid state is closed, a re-checking mechanism is installed.
So if we do not send any notification after boot/resume and the old lid state is "closed".
systemd determines to suspend in the re-checking mechanism.


>
> > Link 1: https://lkml.org/2016/3/7/460
> > Link 2: https://github.com/systemd/systemd/issues/2087
> > Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=89211
> > https://bugzilla.kernel.org/show_bug.cgi?id=106151
> > https://bugzilla.kernel.org/show_bug.cgi?id=106941
> > Signed-off-by: Lv Zheng <lv.zheng@xxxxxxxxx>
> > Cc: Bastien Nocera: <hadess@xxxxxxxxxx>
> > ---
> > drivers/acpi/button.c | 63
> +++++++++++++++++++++++++++++++++++++++++++++----
> > 1 file changed, 59 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> > index 5c3b091..bb14ca5 100644
> > --- a/drivers/acpi/button.c
> > +++ b/drivers/acpi/button.c
> > @@ -53,6 +53,10 @@
> > #define ACPI_BUTTON_DEVICE_NAME_LID "Lid Switch"
> > #define ACPI_BUTTON_TYPE_LID 0x05
> >
> > +#define ACPI_BUTTON_LID_INIT_IGNORE 0x00
> > +#define ACPI_BUTTON_LID_INIT_OPEN 0x01
> > +#define ACPI_BUTTON_LID_INIT_METHOD 0x02
> > +
> > #define _COMPONENT ACPI_BUTTON_COMPONENT
> > ACPI_MODULE_NAME("button");
> >
> > @@ -105,6 +109,7 @@ struct acpi_button {
> >
> > static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
> > static struct acpi_device *lid_device;
> > +static u8 lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> >
> > /* --------------------------------------------------------------------------
> > FS Interface (/proc)
> > @@ -246,7 +251,8 @@ int acpi_lid_open(void)
> > }
> > EXPORT_SYMBOL(acpi_lid_open);
> >
> > -static int acpi_lid_send_state(struct acpi_device *device)
> > +static int acpi_lid_send_state(struct acpi_device *device,
> > + bool notify_init_state)
> > {
> > struct acpi_button *button = acpi_driver_data(device);
> > unsigned long long state;
> > @@ -257,6 +263,10 @@ static int acpi_lid_send_state(struct acpi_device
> *device)
> > if (ACPI_FAILURE(status))
> > return -ENODEV;
> >
> > + if (notify_init_state &&
> > + lid_init_state == ACPI_BUTTON_LID_INIT_OPEN)
> > + state = 1;
> > +
>
> Why do we need to complicate this function?
>
> Can't we have a separate function for sending the fake "lid open" event?
[Lv Zheng]

Yes, we can.
But I put the code here for reasons.

I intentionally kept the _LID evaluation right after boot/resume.
Because I validated Windows behavior.
It seems Windows evaluates _LID right after boot.
So I kept _LID evaluated right after boot to prevent compliance issues.

>
> > /* input layer checks if event is redundant */
> > input_report_switch(button->input, SW_LID, !state);
> > input_sync(button->input);
> > @@ -278,6 +288,13 @@ static int acpi_lid_send_state(struct acpi_device
> *device)
> > return ret;
> > }
> >
> > +static int acpi_lid_send_init_state(struct acpi_device *device)
> > +{
> > + if (lid_init_state != ACPI_BUTTON_LID_INIT_IGNORE)
> > + return acpi_lid_send_state(device, true);
> > + return 0;
> > +}
> > +
> > static void acpi_button_notify(struct acpi_device *device, u32 event)
> > {
> > struct acpi_button *button = acpi_driver_data(device);
> > @@ -290,7 +307,7 @@ static void acpi_button_notify(struct acpi_device
> *device, u32 event)
> > case ACPI_BUTTON_NOTIFY_STATUS:
> > input = button->input;
> > if (button->type == ACPI_BUTTON_TYPE_LID) {
> > - acpi_lid_send_state(device);
> > + acpi_lid_send_state(device, false);
>
> I wouldn't change this code at all.
>
> > } else {
> > int keycode;
> >
> > @@ -335,7 +352,7 @@ static int acpi_button_resume(struct device *dev)
> >
> > button->suspended = false;
> > if (button->type == ACPI_BUTTON_TYPE_LID)
> > - return acpi_lid_send_state(device);
> > + return acpi_lid_send_init_state(device);
> > return 0;
> > }
> > #endif
> > @@ -416,7 +433,7 @@ static int acpi_button_add(struct acpi_device *device)
> > if (error)
> > goto err_remove_fs;
> > if (button->type == ACPI_BUTTON_TYPE_LID) {
> > - acpi_lid_send_state(device);
> > + acpi_lid_send_init_state(device);
> > /*
> > * This assumes there's only one lid device, or if there are
> > * more we only care about the last one...
> > @@ -446,4 +463,42 @@ static int acpi_button_remove(struct acpi_device
> *device)
> > return 0;
> > }
> >
> > +static int param_set_lid_init_state(const char *val, struct kernel_param *kp)
> > +{
> > + int result = 0;
> > +
> > + if (!strncmp(val, "open", sizeof("open") - 1)) {
> > + lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> > + pr_info("Notify initial lid state as open\n");
> > + } else if (!strncmp(val, "method", sizeof("method") - 1)) {
> > + lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> > + pr_info("Notify initial lid state with _LID return value\n");
> > + } else if (!strncmp(val, "ignore", sizeof("ignore") - 1)) {
> > + lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
> > + pr_info("Do not notify initial lid state\n");
> > + } else
> > + result = -EINVAL;
> > + return result;
> > +}
> > +
> > +static int param_get_lid_init_state(char *buffer, struct kernel_param *kp)
> > +{
> > + switch (lid_init_state) {
> > + case ACPI_BUTTON_LID_INIT_OPEN:
> > + return sprintf(buffer, "open");
> > + case ACPI_BUTTON_LID_INIT_METHOD:
> > + return sprintf(buffer, "method");
> > + case ACPI_BUTTON_LID_INIT_IGNORE:
> > + return sprintf(buffer, "ignore");
> > + default:
> > + return sprintf(buffer, "invalid");
> > + }
> > + return 0;
> > +}
> > +
> > +module_param_call(lid_init_state,
> > + param_set_lid_init_state, param_get_lid_init_state,
> > + NULL, 0644);
> > +MODULE_PARM_DESC(lid_init_state, "Behavior for reporting LID initial
> state");
> > +
>
> I'm not seeing a particular value in having this command line switch
> to be honest. Apparently, the issue can be worked around from user
> space in any case, so why do we need one more way to work around it?
>
> > module_acpi_driver(acpi_button_driver);
> > --
>
> The main concern is general, though. Evidently, we send fake lid
> input events to user space on init and resume. I don't think this is
> a good idea, because it may confuse systems like Chrome that want to
> implement "dark resume" scenarios and may rely on input events to
> decide whether to start a UI or suspend again.
>
> Thus it might be better to simply drop the sending of those fake input
> events from the code.
[Lv Zheng]
If we did this right now, many other userspace could be broken.
So we prepared the options to allow users to choose.

Thanks and best regards
-Lv