RE: [PATCH 1/1] ishtp: Add support for Intel ishtp eclite driver

From: K Naduvalath, Sumesh
Date: Thu Jun 03 2021 - 12:48:23 EST


Thank you Hans for the scrutiny. Please find my comments inline.

> -----Original Message-----
> From: Hans de Goede <hdegoede@xxxxxxxxxx>
> Sent: Wednesday, June 2, 2021 4:49 PM
> To: K Naduvalath, Sumesh <sumesh.k.naduvalath@xxxxxxxxx>;
> mgross@xxxxxxxxxxxxxxx; srinivas.pandruvada@xxxxxxxxxxxxxxx
> Cc: Pandruvada, Srinivas <srinivas.pandruvada@xxxxxxxxx>; platform-driver-
> x86@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Chinnu, Ganapathi
> <ganapathi.chinnu@xxxxxxxxx>; Kumar, Nachiketa
> <nachiketa.kumar@xxxxxxxxx>
> Subject: Re: [PATCH 1/1] ishtp: Add support for Intel ishtp eclite driver
>
> Hi Sumesh,
>
> On 6/1/21 8:24 PM, K Naduvalath, Sumesh wrote:
> > Thank you Hans for the review comments. Please find the reply inline -
> >
> >> -----Original Message-----
> >> From: Hans de Goede <hdegoede@xxxxxxxxxx>
> >> Sent: Monday, May 31, 2021 8:25 PM
> >> To: K Naduvalath, Sumesh <sumesh.k.naduvalath@xxxxxxxxx>;
> >> mgross@xxxxxxxxxxxxxxx; srinivas.pandruvada@xxxxxxxxxxxxxxx
> >> Cc: Pandruvada, Srinivas <srinivas.pandruvada@xxxxxxxxx>;
> >> platform-driver- x86@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >> Chinnu, Ganapathi <ganapathi.chinnu@xxxxxxxxx>; Kumar, Nachiketa
> >> <nachiketa.kumar@xxxxxxxxx>
> >> Subject: Re: [PATCH 1/1] ishtp: Add support for Intel ishtp eclite
> >> driver
> >>
> >> Hi,
> >>
> >> Thank you, I've done a quick review, which has already spotted quite
> >> a few issues. Note I will probably do a more thorough review later
> >> which mind find more issues, but lets start with fixing the serious
> >> issues which this review has found.
>
> <snip>
>
> >>> diff --git a/drivers/platform/x86/Kconfig
> >>> b/drivers/platform/x86/Kconfig index 60592fb88e7a..cfa2cb150909
> >>> 100644
> >>> --- a/drivers/platform/x86/Kconfig
> >>> +++ b/drivers/platform/x86/Kconfig
> >>> @@ -1180,6 +1180,19 @@ config INTEL_CHTDC_TI_PWRBTN
> >>> To compile this driver as a module, choose M here: the module
> >>> will be called intel_chtdc_ti_pwrbtn.
> >>>
> >>> +config INTEL_ISHTP_ECLITE
> >>> + tristate "Intel ISHTP eclite controller"
> >>> + depends on INTEL_ISH_HID
> >>> + depends on ACPI
> >>> + help
> >>> + This driver is for accessing the PSE(Programmable Service Engine),
> >>> + an Embedded Controller like IP, using ISHTP(Integratd Sensor Hub
> >>> + Transport Protocol) to get battery, thermal and UCSI (USB Type-C
> >>> + Connector System Software Interface) related data from
> >>> +the
> >> platform.
> >>
> >> This text has all the same issues as the commit message. Also please
> >> explain on what sort of systems this functionality is typically
> >> found/used so that users will have a better idea if this is something
> >> which they should enable on their systems.
> >>
> >
> > This functionality is enabled for the first time for Intel's Elkhartlake
> platform.
> > Users who don't want to use discrete Embedded Controller on their
> > platform, they can leverage the integrated solution of ECLite which is
> > part of Elkhartlake's PSE subsystem. I'll add more text for the config item.
>
> Thank you for the info.
>
> <snip>
>
> >>> +static acpi_status
> >>> +ecl_opregion_cmd_handler(u32 function, acpi_physical_address
> address,
> >>> + u32 bits, u64 *value64,
> >>> + void *handler_context, void *region_context) {
> >>> + struct ishtp_opregion_dev *opr_dev;
> >>> + struct opregion_cmd *cmd;
> >>> +
> >>> + if (!region_context || !value64)
> >>> + return AE_BAD_PARAMETER;
> >>> +
> >>> + if (function == ACPI_READ)
> >>> + return AE_ERROR;
> >>> +
> >>> + opr_dev = (struct ishtp_opregion_dev *)region_context;
> >>> + cmd = &opr_dev->opr_context.cmd_area;
> >>
> >> This is shared between all threads, we have had issues with sharing
> >> opregion context between threads like this in the past.
> >>
> >> What is stopping ACPI code from trying to use the opr_context from
> >> multiple threads at the same time, messing things up?
> >>
> >
> > This will not happen. BIOS calls ACPI methods (cmd and data in this
> > driver) in a SERIALIZED manner. BIOS issues another call only after finishing
> the first one.
>
> You say that this will not happen, but the problem which I have with the
> OpRegion API is that this may actually very well happen. There are no
> guarantees of this not happening. We are relying on the AML code from the
> BIOS adding the "Serialized" attribute to all functions accessing the OpRegion,
> all that is necessary is one bug in the AML code and then this will happen.
>
> And it is typically very hard to get vendors to issue BIOS updates to fix things
> like this :|
>
> I must honestly say that I find the whole design of the OpRegion API lacking.
> There are other options to interface AML code, like e.g. modelling the ISHTP
> as a generic serial bus, then a single OpRegion access can do the
> following:
>
> 1. AML fills a buffer
> 2. OpRegion call processes buffer and writes back results (status code +
> data read if it is a read) to buffer
> 3. AML processes buffer
>
> This is e.g. used in some Microsoft Surface 3 devices, see:
> drivers/platform/surface/surface3_power.c
>
> This would make the calls more-or-less atomic, removing the need for the
> ACPI Methods to all be Serialized.
>
> This would also allow the Opregion to return an error status code when the
> EClite is not ready, instead of unregistering + reregistering the OpRegion,
> which in itself is another source of possible races (see below).
>
> I assume that it is too late to change the OpRegion API now, so that we are
> stuck with this ?
>
> I must say I'm disappointed about the quality of the OpRegion API design
> here. APIs should be designed so that it is easy / natural to use them in a
> correct way, where as this API seems to be dessgned in such a way that it is
> easy to use it the wrong way.
>
> I really expect Intel to do better the next time the introduce a new OpRegion
> for something. It would help greatly if Intel would send some rough sketches
> of how the API is going to look like to the mailinglist, so that we can actually
> correct issues like these, rather then the API already being set in stone and
> that we end up having to live with a not so good API.
>
I understand your concern and will certainly take this feedback for the next gen/integration
of the product. Similar approach is being discussed already for nextgen.
But timeline is a concern now here for this.

> >> I'm especially worried about the offset + data_len used in various
> >> places, even if we add checks for this, this could be changed
> >> underneath us by another thread.
> >
> > There are checks in BIOS for offset + data, but I'll add them in the driver as
> well.
> > There is no other thread accessing it. Flow below -
> >
> > ACPI method --> cmd -->
> > <--end ACPI method <--
> >
> > <here no other ACPI method will execute because of serialized method>
> >
> > ACPI method --> data-->
> > <--end ACPI method <--
> >
> >>
> >> You should add a mutex to the opr_dev and lock it in this function so
> >> that the cmd struct cannot be changed underneath us while we are
> processing it.
> >>
> >> Note this does not fully protect against races like this:
> >>
> >> 1. Thread a sets offset
> >> 2. Thread b sets a different offset
> >> 3. Thread a writes ECL_ISH_READ to command.
> >
> > These race conditions won't occur. No structure elements are used from
> > two paths simultaneously. Only element from the opregion structure
> > used outside ACPI path is
> >
> > opr_dev->dsm_event_id from event_work workqueque.
> >
> > But this element not accessed from anywhere else including serialized ACPI
> path.
>
> I understand that the intention is for all the ACPI calls to be serialized, but
> nothing is guaranteeing this. All it requires is one method not being marked
> as serialized...
>

I get your point here too. I'll fix in v2.

> >> This will result in thread a getting the data at the offset specified
> >> by thread b, rather then at the offset which it requested. But there
> >> is nothing we can do about that, that needs to be solved with
> synchronization at the AML level.
> >
> > There is ASL serialization defined, But are you suggesting to put
> > locks for the fairness of coding?
>
> I don't want the kernel to read / write beyond the end of kernel-owned
> memory because the following happens (this assumes one AML method is
> missing the Serialized
> attribute):
>
> 1. #define ECL_DATA_OPR_BUFLEN 384
> 2. Thread A sets offset to 0
> 3. Thread A sets len to 100
> 4. Thread A writes ECL_ISH_WRITE
> 5. Thread A kernel checks (offset + len) < ECL_DATA_OPR_BUFLEN, this is ok
> 6. Thread B sets offset to 300 7. Thread A kernel does
> memcpy(message.payload, opr_dev->opr_context.data_area.data +
> message.header.offset, message.header.data_len);
>
> 7. Will now end up reading 100 bytes from offset 300 of
> opr_context.data_area.data, which is only 384 bytes big, which needless to
> say is not good.
> I see now that the offset + len are copied into message.header and then the
> copied values are used. It is tempting to think that this thus can be fixed by
> doing a check on the copied values, but the message struct is a local variable,
> so the compiler will alias the 2 values and might decide to do the check on the
> originals; and the compiler is also free to re-order things. So this making of a
> local copy does not help.
>
> This is a classic time-of-check vs time-of-use problem. Since we cannot
> _guarantee_ that all users of the OpRegion are properly using the ACPI
> Serialized Method attribute, we need some other mechanism to ensure that
> len + offset do not change between checking and consuming them. The most
> straight forward solution here is to protect opr_context with a mutex so that
> it cannot be changed by another thread while one call is in progress. Since all
> accesses should be serialized at the AML level anyways this will not add extra
> serialization, while it will help avoid AML bugs turning into out-of-bounds
> memory accesses inside the kernel.
>

Sure. Mutex and length checks will be added in v2.

> <snip>
>
> >>> +static acpi_status
> >>> +ecl_opregion_data_handler(u32 function, acpi_physical_address
> address,
> >>> + u32 bits, u64 *value64,
> >>> + void *handler_context, void *region_context) {
> >>> + struct ishtp_opregion_dev *opr_dev;
> >>> + unsigned int bytes = BITS_TO_BYTES(bits);
> >>> + void *data_addr;
> >>> +
> >>> + if (!region_context || !value64)
> >>> + return AE_BAD_PARAMETER;
> >>> +
> >>> + if (address + bytes > ECL_DATA_OPR_BUFLEN)
> >>> + return AE_BAD_PARAMETER;
> >>> +
> >>> + opr_dev = (struct ishtp_opregion_dev *)region_context;
> >>> + data_addr = &opr_dev->opr_context.data_area.data[address];
> >>> +
> >>> + if (function == ACPI_READ)
> >>> + memcpy(value64, data_addr, bytes);
> >>> + else if (function == ACPI_WRITE)
> >>> + memcpy(data_addr, value64, bytes);
> >>
> >> What if bits is not a multiple of 8? Then we have just overwritten a
> >> bunch of bits which the caller did not request us to do.
> >>
> >> Since the caller specifies bits, this should really do a
> >> read-modify-write of the last byte when there are any "leftover"
> >> bits. ? What does the documentation say about this?
> >
> > The request will always be multiple of 8 bits as per ASL definition/docs.
>
> Ok.
>
>
> >>> + else
> >>> + return AE_BAD_PARAMETER;
> >>> +
> >>> + return AE_OK;
> >>> +}
> >>> +
> >>> +static int acpi_opregion_init(struct ishtp_opregion_dev *opr_dev) {
> >>> + acpi_status status;
> >>> + struct acpi_device *adev;
> >>> +
> >>> + /* Find ECLite device and install opregion handlers */
> >>> + adev = acpi_dev_get_first_match_dev("INTC1035", NULL, -1);
> >>> + if (!adev) {
> >>> + dev_err(cl_data_to_dev(opr_dev), "eclite ACPI device not
> >> found\n");
> >>> + return -ENODEV;
> >>> + }
> >>> +
> >>> + opr_dev->acpi_handle = adev->handle;
> >>
> >> acpi_opregion_init() seem to get called on every resume, doing the
> >> lookup is only necessary once, after that the cached value in
> >> opr_dev->acpi_handle can be reused.
> >>
> >> More importantly this whole dance of unregistering + re-registering
> >> the opregion seems unnecessary. You already have ish_link_ready in
> >> case the opregion gets called before things are ready; and if the
> >> opregion is called when the link is not ready, still having the
> >> opregion handler in place allows you to log a sensible error about what is
> going on which is what we want.
> >
> > Initial approach was same as you suggested ( No uninstall, just wait
> > in ACPI Method). But after every resume, the driver gets ACPI write
> > and read requests for FAN and thermal controls which fails because
> > link is not ready. Also we can't wait_event_interruptible_timeout() in
> > ACPI method until we get the link_ready( link can become ready much
> > later until PSE fully boots up after every Sx. Anything else gets executed
> after this wait_event timeout fail in ACPI method.
> > We can't afford to miss any critical thermal/FAN related setting from
> > standby/ hibernation resume. No requests are missed by registering
> > opregion only after link_ready.
>
> I see; and since the OpRegion API does not allow returning a status code to
> communicate the link-not-ready thing you need to do this unregister + re-
> register dance instead. See this is what I meant above when I wrote that the
> whole OpRegion API seems lacking. This could all have been solved by having
> the OpRegion calls communicate back a status code.
>
> Now you are relying on communicating the availability of the link by calling
> _REG instead. This means that we better hope that all users of the OpRegion
> will correctly check the flag set by _REG, which in my experience with a lot of
> AML code is something which is often forgotten, so this will be another
> source of DSDT/AML bugs for AML code using this OpRegion <sigh>.
>
> Also note that this is racy too, on suspend we unregister the OpRegion,
> calling _REG in the process. But _REG methods typically are not Serialized so
> now the following may happen:
>
> 1. Thread A is executing a function using the EClite OpRegion 2. Thread A
> checks the flag set by _REG, sees the region is available 3. Thread B is
> executing suspend for the EClite device, unregister the
> OpRegion, calling _REG to clear the flag 4. Thread A continues with actually
> accessing the no longer available
> OPRegion leading to an error because there is no Opregion handler
> registered.
>
> So this means that _REG should be marked Serialized for this device, I
> wonder if it actually is marked as such in the DSDT of your test hardware ?
>
> I think we might get away with this regardless because 3 probably takes a rw-
> lock on the ACPI namespace which it can only get if no other threads are
> executing any AML code (I'm speculating here I did not check), but this is yet
> another example about how this entire approach is less then ideal.
>

I'll check again and come back on this with a better approach to protect from race.
I'll incorporate the changes in v2 with better solution.
And sure, these points are taken as positive feedback. Thanks.

> <snip>
>
> >>> +static void ecl_ishtp_cl_deinit(struct ishtp_cl *ecl_ishtp_cl) {
> >>> + ishtp_cl_unlink(ecl_ishtp_cl);
> >>> + ishtp_cl_flush_queues(ecl_ishtp_cl);
> >>> + ishtp_cl_free(ecl_ishtp_cl);
> >>> +}
> >>> +
> >>> +static void ecl_ishtp_cl_reset_handler(struct work_struct *work) {
> >>> + struct ishtp_opregion_dev *opr_dev;
> >>> + struct ishtp_cl_device *cl_device;
> >>> + struct ishtp_cl *ecl_ishtp_cl;
> >>> + int rv;
> >>> + int retry;
> >>> +
> >>> + opr_dev = container_of(work, struct ishtp_opregion_dev,
> >> reset_work);
> >>> +
> >>> + opr_dev->ish_link_ready = false;
> >>> +
> >>> + cl_device = opr_dev->cl_device;
> >>> + ecl_ishtp_cl = opr_dev->ecl_ishtp_cl;
> >>> +
> >>> + ecl_ishtp_cl_deinit(ecl_ishtp_cl);
> >>> +
> >>> + ecl_ishtp_cl = ishtp_cl_allocate(cl_device);
> >>> + if (!ecl_ishtp_cl)
> >>> + return;
> >>
> >> Is this whole freeing + re-alloc of the ISHTP client here really
> >> necessary ? This seems like overkill.
> >
> > This is required. reset is an asynchronous notification from ISH (PSE
> > in this case) firmware and current connection becomes stale and needs
> > to be reinitialized. All ISHTP client drivers are implemented same way.
> > eg.
> > drivers/hid/intel-ish-hid/ishtp-hid-client.c
> > drivers/hid/intel-ish-hid/ishtp-fw-loader.c
>
> Ok.
>
> <snip>
>
> >>> +static int ecl_ishtp_cl_probe(struct ishtp_cl_device *cl_device) {
> >>> + struct ishtp_cl *ecl_ishtp_cl;
> >>> + struct ishtp_opregion_dev *opr_dev;
> >>> + int rv;
> >>> +
> >>> + opr_dev = devm_kzalloc(ishtp_device(cl_device), sizeof(*opr_dev),
> >>> + GFP_KERNEL);
> >>> + if (!opr_dev)
> >>> + return -ENOMEM;
> >>> +
> >>> + ecl_ishtp_cl = ishtp_cl_allocate(cl_device);
> >>> + if (!ecl_ishtp_cl)
> >>> + return -ENOMEM;
> >>> +
> >>> + ishtp_set_drvdata(cl_device, ecl_ishtp_cl);
> >>> + ishtp_set_client_data(ecl_ishtp_cl, opr_dev);
> >>> + opr_dev->ecl_ishtp_cl = ecl_ishtp_cl;
> >>> + opr_dev->cl_device = cl_device;
> >>> +
> >>> + init_waitqueue_head(&opr_dev->read_wait);
> >>> + INIT_WORK(&opr_dev->event_work, ecl_acpi_invoke_dsm);
> >>> + INIT_WORK(&opr_dev->reset_work, ecl_ishtp_cl_reset_handler);
> >>> +
> >>> + /* Initialize ish client device */
> >>> + rv = ecl_ishtp_cl_init(ecl_ishtp_cl);
> >>> + if (rv) {
> >>> + dev_err(cl_data_to_dev(opr_dev), "Client init failed\n");
> >>> + goto err_exit;
> >>> + }
> >>> +
> >>> + dev_dbg(cl_data_to_dev(opr_dev), "eclite-ishtp client
> >>> +initialised\n");
> >>> +
> >>> + /* Register a handler for eclite fw events */
> >>> + ishtp_register_event_cb(cl_device, ecl_ishtp_cl_event_cb);
> >>> +
> >>> + ishtp_get_device(cl_device);
> >>
> >> This seems weird, normally in the device-model a driver being bound
> >> to a device guarantees that that device cannot go away before the
> >> remove callback of the driver is called.
> >>
> >> So it seems to me that this call + the put in both the err_exit and
> >> ecl_ishtp_cl_remove() cases can be dropped.
> >>
> > This platform driver has two interfaces - a)ISH and b)ACPI.
> > ISH initializes first and if successful we ishtp_get_device().
> > Then ACPI initializes after that. If ACPI init fails, driver gets
> > cleaned with ISHTP as well thus ishtp_get_device() in probe's
> > err_exit. Both interface must be Required and initialized for the
> functionality.
> >
> > Do you still see a problem?
>
> I don't see a problem, but I don't think that the ishtp_get_device() here and
> the 2 ishtp_put_device() calls below are necessary.
>
> This is the probe function for a ishtp_cl_driver as long as that driver is bound
> to the cl_device (which gets passed as parameter to the probe()) the
> cl_device can never go away, not until the matching remove function from
> the ishtp_cl_driver has been called, so the get at probe + the put at remove
> are not necessary, since the cl_device already must stick around for as long
> as the driver is bound.
>
> Another way of looking at this is that the linux device-model/driver core
> already does a get when it binds a driver to the device (and a put on
> remove).
>
Ok. I'll cross check with ish gurus if there is any concern with this.
I'll remove them if there isn't any.
>
> >>> +
> >>> + opr_dev->ish_link_ready = true;
> >>> +
> >>> + /* Now find ACPI device and init opregion handlers */
> >>> + rv = acpi_opregion_init(opr_dev);
> >>> + if (rv) {
> >>> + dev_err(cl_data_to_dev(opr_dev), "ACPI opregion init
> >> failed\n");
> >>> +
> >>> + goto err_exit;
> >>> + }
> >>> +
> >>> + /* Reprobe devices depending on ECLite - battery, fan, etc. */
> >>> + acpi_walk_dep_device_list(opr_dev->acpi_handle);
> >>> +
> >>> + return 0;
> >>> +err_exit:
> >>> + ishtp_set_connection_state(ecl_ishtp_cl,
> >> ISHTP_CL_DISCONNECTING);
> >>> + ishtp_cl_disconnect(ecl_ishtp_cl);
> >>> + ecl_ishtp_cl_deinit(ecl_ishtp_cl);
> >>> +
> >>> + ishtp_put_device(cl_device);
> >>> +
> >>> + return rv;
> >>> +}
> >>> +
> >>> +static int ecl_ishtp_cl_remove(struct ishtp_cl_device *cl_device) {
> >>> + struct ishtp_cl *ecl_ishtp_cl = ishtp_get_drvdata(cl_device);
> >>> + struct ishtp_opregion_dev *opr_dev =
> >>> + ishtp_get_client_data(ecl_ishtp_cl);
> >>> +
> >>> + if (opr_dev->acpi_init_done)
> >>> + acpi_opregion_deinit(opr_dev);
> >>> +
> >>> + cancel_work_sync(&opr_dev->reset_work);
> >>> + cancel_work_sync(&opr_dev->event_work);
> >>> +
> >>> + ishtp_set_connection_state(ecl_ishtp_cl,
> >> ISHTP_CL_DISCONNECTING);
> >>> + ishtp_cl_disconnect(ecl_ishtp_cl);
> >>> + ecl_ishtp_cl_deinit(ecl_ishtp_cl);
> >>> +
> >>> + ishtp_put_device(cl_device);
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static int ecl_ishtp_cl_reset(struct ishtp_cl_device *cl_device) {
> >>> + struct ishtp_cl *ecl_ishtp_cl = ishtp_get_drvdata(cl_device);
> >>> + struct ishtp_opregion_dev *opr_dev =
> >>> + ishtp_get_client_data(ecl_ishtp_cl);
> >>> +
> >>> + schedule_work(&opr_dev->reset_work);
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static int ecl_ishtp_cl_suspend(struct device *device) {
> >>> + struct ishtp_cl_device *cl_device = ishtp_dev_to_cl_device(device);
> >>> + struct ishtp_cl *ecl_ishtp_cl = ishtp_get_drvdata(cl_device);
> >>> + struct ishtp_opregion_dev *opr_dev =
> >>> + ishtp_get_client_data(ecl_ishtp_cl);
> >>> +
> >>> + if (acpi_target_system_state() == ACPI_STATE_S0)
> >>> + return 0;
> >>> +
> >>> + acpi_opregion_deinit(opr_dev);
> >>> + ecl_ish_cl_enable_events(opr_dev, false);
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static int ecl_ishtp_cl_resume(struct device *device) {
> >>> + /* A reset is expected to call after an Sx. At this point
> >>> + * we are not sure if the link is up or not to restore anything,
> >>> + * so do nothing in resume path
> >>> + */
> >>> + return 0;
> >>
> >> This seems very wrong, this means that there are no resume ordering
> >> guarantees meaning that drivers which are ordered to resume after
> >> this driver, because they rely on the opregion may end up not being
> >> able to use the opregion leading to all kind of issues.
> >>
> >> IMHO what should happen here is that this driver waits for the EClite
> >> to become ready here. Which probably means that it itself should be
> >> only resumed after the ISH HID driver is, but I assume that the ISH
> >> device is a parent of this device, so that ordering should be correct
> automatically.
> >
> >>
> >> TBH the whole lets just not resume and do a reset instead and then
> >> just tearing down the entire ISHTP client struct and setting up a new
> >> one from scratch, just feels very wrong. At a minimum the teardown +
> >> bringup needs to happen before the resume callback completes, but
> >> ideally this would be replaced by a much lighter resume solution.
> >
> > ISH Firmware (the PSE subsystem) can boot up/reinitialize Every Sx
> > based on usecase or sometimes PSE is always-on. So the resume path is
> > asynchronous and unpredictable in this case. Re-initialization and
> > clean up required if PSE also boot up every Sx and might take good
> > amount of time (Host can come alive before PSE comes up). Thus using
> asynchronous reset notification.
>
> Ok, so I guess we need to live with the ugly deregister + re-register OpRegion
> dance for the devices where the PSE is shutdown during suspend.
>
> You also write: "sometimes PSE is always-on", what about that case, I assume
> in this case there will be no reset after resume? So then unregistering the
> OpRegion handler at suspend (and setting link_ready=false) will be wrong
> since without the reset the OpRegion handler will never get reinstalled ?

reset is called from bus after every resume. When PSE is always-on, the notification
comes quicky, where as there reset comes bit later in case of new boot.

>
> Regards,
>
> Hans

Thanks again for the comments. Let me know if you have further review comments. I'll
work on v2.