RE: [RFC PATCH 1/2] ACPI / button: Send "open" state after boot/resume

From: Zheng, Lv
Date: Wed May 18 2016 - 21:51:08 EST


Hi,

> From: rjwysocki@xxxxxxxxx [mailto:rjwysocki@xxxxxxxxx] On Behalf Of
> Rafael J. Wysocki
> Subject: Re: [RFC PATCH 1/2] ACPI / button: Send "open" state after
> boot/resume
>
> On Wed, May 18, 2016 at 3:25 AM, Zheng, Lv <lv.zheng@xxxxxxxxx> wrote:
> > Hi, Rafael
> >
> > Thanks for the review.
> >
> >> From: rjwysocki@xxxxxxxxx [mailto:rjwysocki@xxxxxxxxx] On Behalf Of
> >> Rafael J. Wysocki
> >> Sent: Wednesday, May 18, 2016 7:37 AM
> >> To: Zheng, Lv <lv.zheng@xxxxxxxxx>
> >> Cc: Wysocki, Rafael J <rafael.j.wysocki@xxxxxxxxx>; Rafael J. Wysocki
> >> <rjw@xxxxxxxxxxxxx>; Brown, Len <len.brown@xxxxxxxxx>; Lv Zheng
> >> <zetalog@xxxxxxxxx>; Linux Kernel Mailing List <linux-
> >> kernel@xxxxxxxxxxxxxxx>; ACPI Devel Maling List <linux-
> acpi@xxxxxxxxxxxxxxx>;
> >> Bastien Nocera: <hadess@xxxxxxxxxx>
> >> Subject: Re: [RFC PATCH 1/2] ACPI / button: Send "open" state after
> >> boot/resume
> >>
> >> On Tue, May 17, 2016 at 10:27 AM, Lv Zheng <lv.zheng@xxxxxxxxx> wrote:
> >> > (This patch hasn't been tested, it's sent for comments)
> >>
> >> I have a couple of concerns (see below).
> >>
> >> > Linux userspace (systemd-logind) keeps on rechecking lid state when the
> >> > lid state is closed. If it failed to update the lid state to open after
> >> > boot/resume, it could suspend the system. But as:
> >> > 1. Traditional ACPI platform may not generate the lid open event after
> >> > resuming as the open event is actually handled by the BIOS and the
> system
> >> > is then resumed from a FACS vector.
> >> > 2. The _LID control method's initial returning value is not reliable. The
> >> > _LID control method is described to return the "current" lid state,
> >> > however the word of "current" has ambiguity, many BIOSen return lid
> >> > state upon last lid notification while the developers may think the
> >> > BIOSen should always return the lid state upon last _LID evaluation.
> >> > There won't be difference when we evaluate _LID during the runtime,
> the
> >> > problem is the initial returning value of this function. When the BIOSen
> >> > implement this control method with cached value, the initial returning
> >> > value is likely not reliable.
> >> > Thus there is no mean for the ACPI lid driver to provide such an event
> >> > conveying correct current lid state. When there is no such an event or the
> >> > event conveys wrong result, false suspending can be examined.
> >> >
> >> > The root cause of the issue is systemd itself, it could handle the ACPI
> >> > control method lid device by implementing a special option like
> >> > LidSwitchLevelTriggered=False when it detected the ACPI lid device.
> However
> >> > there is no explicit documentation clarified the ambiguity, we need to
> >> > work it around in the kernel before systemd changing its mind.
> >>
> >> The above doesn't explain how the issue is addressed here.
> > [Lv Zheng]
> > The story is a bit long.
> > We can see several issues that some platform suspends right after
> boot/resume.
> > We noticed that on that platforms, _LID is always implemented with cached
> lid state returned.
> > And it's initial returning value may be "closed" after boot/resume.
> >
> > It appears the acpi_lid_send_state() sent after boot/resume is the culprit to
> report the wrong lid state to the userspace.
> > But to our surprise, after delete the 2 lines, reporters still can see suspends
> after boot/resume.
> > That's because of systemd implementation.
> > It contains code logic that:
> > When the lid state is closed, a re-checking mechanism is installed.
> > So if we do not send any notification after boot/resume and the old lid state
> is "closed".
> > systemd determines to suspend in the re-checking mechanism.
>
> If that really is the case, it is plain silly and I don't think we can
> do anything in the kernel to help here.
[Lv Zheng]
The problem is:
If we just removed the 2 lines sending wrong lid state after boot/resume.
Problem couldn't be solved.
It could only be solved by changing both the systemd and the kernel (deleting the 2 lines).

>
> >>
> >> > Link 1: https://lkml.org/2016/3/7/460
> >> > Link 2: https://github.com/systemd/systemd/issues/2087
> >> > Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=89211
> >> > https://bugzilla.kernel.org/show_bug.cgi?id=106151
> >> > https://bugzilla.kernel.org/show_bug.cgi?id=106941
> >> > Signed-off-by: Lv Zheng <lv.zheng@xxxxxxxxx>
> >> > Cc: Bastien Nocera: <hadess@xxxxxxxxxx>
> >> > ---
> >> > drivers/acpi/button.c | 63
> >> +++++++++++++++++++++++++++++++++++++++++++++----
> >> > 1 file changed, 59 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> >> > index 5c3b091..bb14ca5 100644
> >> > --- a/drivers/acpi/button.c
> >> > +++ b/drivers/acpi/button.c
> >> > @@ -53,6 +53,10 @@
> >> > #define ACPI_BUTTON_DEVICE_NAME_LID "Lid Switch"
> >> > #define ACPI_BUTTON_TYPE_LID 0x05
> >> >
> >> > +#define ACPI_BUTTON_LID_INIT_IGNORE 0x00
> >> > +#define ACPI_BUTTON_LID_INIT_OPEN 0x01
> >> > +#define ACPI_BUTTON_LID_INIT_METHOD 0x02
> >> > +
> >> > #define _COMPONENT ACPI_BUTTON_COMPONENT
> >> > ACPI_MODULE_NAME("button");
> >> >
> >> > @@ -105,6 +109,7 @@ struct acpi_button {
> >> >
> >> > static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
> >> > static struct acpi_device *lid_device;
> >> > +static u8 lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> >> >
> >> > /* --------------------------------------------------------------------------
> >> > FS Interface (/proc)
> >> > @@ -246,7 +251,8 @@ int acpi_lid_open(void)
> >> > }
> >> > EXPORT_SYMBOL(acpi_lid_open);
> >> >
> >> > -static int acpi_lid_send_state(struct acpi_device *device)
> >> > +static int acpi_lid_send_state(struct acpi_device *device,
> >> > + bool notify_init_state)
> >> > {
> >> > struct acpi_button *button = acpi_driver_data(device);
> >> > unsigned long long state;
> >> > @@ -257,6 +263,10 @@ static int acpi_lid_send_state(struct acpi_device
> >> *device)
> >> > if (ACPI_FAILURE(status))
> >> > return -ENODEV;
> >> >
> >> > + if (notify_init_state &&
> >> > + lid_init_state == ACPI_BUTTON_LID_INIT_OPEN)
> >> > + state = 1;
> >> > +
> >>
> >> Why do we need to complicate this function?
> >>
> >> Can't we have a separate function for sending the fake "lid open" event?
> > [Lv Zheng]
> >
> > Yes, we can.
> > But I put the code here for reasons.
> >
> > I intentionally kept the _LID evaluation right after boot/resume.
> > Because I validated Windows behavior.
> > It seems Windows evaluates _LID right after boot.
> > So I kept _LID evaluated right after boot to prevent compliance issues.
>
> I don't quite see what compliance issues could result from skipping
> the _LID evaluation after boot.
[Lv Zheng]
I'm not sure if there is a platform putting named object initialization code in _LID.
If you don't like it, we can stop evaluating _LID in the next version.

>
> >>
> >> > /* input layer checks if event is redundant */
> >> > input_report_switch(button->input, SW_LID, !state);
> >> > input_sync(button->input);
> >> > @@ -278,6 +288,13 @@ static int acpi_lid_send_state(struct acpi_device
> >> *device)
> >> > return ret;
> >> > }
> >> >
> >> > +static int acpi_lid_send_init_state(struct acpi_device *device)
> >> > +{
> >> > + if (lid_init_state != ACPI_BUTTON_LID_INIT_IGNORE)
> >> > + return acpi_lid_send_state(device, true);
> >> > + return 0;
> >> > +}
> >> > +
> >> > static void acpi_button_notify(struct acpi_device *device, u32 event)
> >> > {
> >> > struct acpi_button *button = acpi_driver_data(device);
> >> > @@ -290,7 +307,7 @@ static void acpi_button_notify(struct acpi_device
> >> *device, u32 event)
> >> > case ACPI_BUTTON_NOTIFY_STATUS:
> >> > input = button->input;
> >> > if (button->type == ACPI_BUTTON_TYPE_LID) {
> >> > - acpi_lid_send_state(device);
> >> > + acpi_lid_send_state(device, false);
> >>
> >> I wouldn't change this code at all.
> >>
> >> > } else {
> >> > int keycode;
> >> >
> >> > @@ -335,7 +352,7 @@ static int acpi_button_resume(struct device *dev)
> >> >
> >> > button->suspended = false;
> >> > if (button->type == ACPI_BUTTON_TYPE_LID)
> >> > - return acpi_lid_send_state(device);
> >> > + return acpi_lid_send_init_state(device);
> >> > return 0;
> >> > }
> >> > #endif
> >> > @@ -416,7 +433,7 @@ static int acpi_button_add(struct acpi_device
> *device)
> >> > if (error)
> >> > goto err_remove_fs;
> >> > if (button->type == ACPI_BUTTON_TYPE_LID) {
> >> > - acpi_lid_send_state(device);
> >> > + acpi_lid_send_init_state(device);
> >> > /*
> >> > * This assumes there's only one lid device, or if there are
> >> > * more we only care about the last one...
> >> > @@ -446,4 +463,42 @@ static int acpi_button_remove(struct acpi_device
> >> *device)
> >> > return 0;
> >> > }
> >> >
> >> > +static int param_set_lid_init_state(const char *val, struct kernel_param
> *kp)
> >> > +{
> >> > + int result = 0;
> >> > +
> >> > + if (!strncmp(val, "open", sizeof("open") - 1)) {
> >> > + lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> >> > + pr_info("Notify initial lid state as open\n");
> >> > + } else if (!strncmp(val, "method", sizeof("method") - 1)) {
> >> > + lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> >> > + pr_info("Notify initial lid state with _LID return value\n");
> >> > + } else if (!strncmp(val, "ignore", sizeof("ignore") - 1)) {
> >> > + lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
> >> > + pr_info("Do not notify initial lid state\n");
> >> > + } else
> >> > + result = -EINVAL;
> >> > + return result;
> >> > +}
> >> > +
> >> > +static int param_get_lid_init_state(char *buffer, struct kernel_param *kp)
> >> > +{
> >> > + switch (lid_init_state) {
> >> > + case ACPI_BUTTON_LID_INIT_OPEN:
> >> > + return sprintf(buffer, "open");
> >> > + case ACPI_BUTTON_LID_INIT_METHOD:
> >> > + return sprintf(buffer, "method");
> >> > + case ACPI_BUTTON_LID_INIT_IGNORE:
> >> > + return sprintf(buffer, "ignore");
> >> > + default:
> >> > + return sprintf(buffer, "invalid");
> >> > + }
> >> > + return 0;
> >> > +}
> >> > +
> >> > +module_param_call(lid_init_state,
> >> > + param_set_lid_init_state, param_get_lid_init_state,
> >> > + NULL, 0644);
> >> > +MODULE_PARM_DESC(lid_init_state, "Behavior for reporting LID initial
> >> state");
> >> > +
> >>
> >> I'm not seeing a particular value in having this command line switch
> >> to be honest. Apparently, the issue can be worked around from user
> >> space in any case, so why do we need one more way to work around it?
> >>
> >> > module_acpi_driver(acpi_button_driver);
> >> > --
> >>
> >> The main concern is general, though. Evidently, we send fake lid
> >> input events to user space on init and resume. I don't think this is
> >> a good idea, because it may confuse systems like Chrome that want to
> >> implement "dark resume" scenarios and may rely on input events to
> >> decide whether to start a UI or suspend again.
> >>
> >> Thus it might be better to simply drop the sending of those fake input
> >> events from the code.
> > [Lv Zheng]
> > If we did this right now, many other userspace could be broken.
> > So we prepared the options to allow users to choose.
>
> Do we have any evidence that any other user space stacks are affected?
[Lv Zheng]
I didn't know any of such affections except the systemd.

Thanks and best regards
-Lv