Re: [Intel-wired-lan] [PATCH] e1000e: bump up timeout to wait when ME un-configure ULP mode
From: Kai-Heng Feng
Date: Thu Mar 26 2020 - 07:30:09 EST
Hi Paul,
> On Mar 25, 2020, at 23:49, Paul Menzel <pmenzel@xxxxxxxxxxxxx> wrote:
>
> Dear Linux folks,
>
>
> Am 25.03.20 um 14:58 schrieb Neftin, Sasha:
>> On 3/25/2020 08:43, Aaron Ma wrote:
>
>>> On 3/25/20 2:36 PM, Neftin, Sasha wrote:
>>>> On 3/25/2020 06:17, Kai-Heng Feng wrote:
>
>>>>>> On Mar 24, 2020, at 03:16, Aaron Ma <aaron.ma@xxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> ME takes 2+ seconds to un-configure ULP mode done after resume
>>>>>> from s2idle on some ThinkPad laptops.
>>>>>> Without enough wait, reset and re-init will fail with error.
>>>>>
>>>>> Thanks, this patch solves the issue. We can drop the DMI quirk in
>>>>> favor of this patch.
>>>>>
>>>>>> Fixes: f15bb6dde738cc8fa0 ("e1000e: Add support for S0ix")
>>>>>> BugLink: https://bugs.launchpad.net/bugs/1865570
>>>>>> Signed-off-by: Aaron Ma <aaron.ma@xxxxxxxxxxxxx>
>>>>>> ---
>>>>>> drivers/net/ethernet/intel/e1000e/ich8lan.c | 4 ++--
>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>> b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>> index b4135c50e905..147b15a2f8b3 100644
>>>>>> --- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>> @@ -1240,9 +1240,9 @@ static s32 e1000_disable_ulp_lpt_lp(struct
>>>>>> e1000_hw *hw, bool force)
>>>>>> ew32(H2ME, mac_reg);
>>>>>> }
>>>>>>
>>>>>> - /* Poll up to 300msec for ME to clear ULP_CFG_DONE. */
>>>>>> + /* Poll up to 2.5sec for ME to clear ULP_CFG_DONE. */
>>>>>> while (er32(FWSM) & E1000_FWSM_ULP_CFG_DONE) {
>>>>>> - if (i++ == 30) {
>>>>>> + if (i++ == 250) {
>>>>>> ret_val = -E1000_ERR_PHY;
>>>>>> goto out;
>>>>>> }
>>>>>
>>>>> The return value was not caught by the caller, so the error ends up
>>>>> unnoticed.
>>>>> Maybe let the caller check the return value of
>>>>> e1000_disable_ulp_lpt_lp()?
>
>>>> I a bit confused. In our previous conversation you told ME not running.
>>>> let me shimming in. Increasing delay won't be solve the problem and just
>>>> mask it. We need to understand why ME take too much time. What is
>>>> problem with this specific system?
>>>> So, basically no ME system should works for you.
>>>
>>> Some laptops ME work that's why only reproduce issue on some laptops.
>>> In this issue i219 is controlled by ME.
>>
>> Who can explain - why ME required too much time on this system?
>> Probably need work with ME/BIOS vendor and understand it.
>> Delay will just mask the real problem - we need to find root cause.
>>> Quirk is only for 1 model type. But issue is reproduced by more models.
>>> So it won't work.
>
> (Where is Aaronâs reply? It wasnât delivered to me yet.)
>
> As this is clearly a regression, please revert the commit for now, and then find a way to correctly implement S0ix support. Linuxâ regression policy demands that as no fix has been found since v5.5-rc1. Changing Intel ME settings, even if it would work around the issue, is not an acceptable solution. Delaying the resume time is also not a solution.
The s0ix patch can reduce power consumption, finally makes S2idle an acceptable sleep method.
So I'd say it's a fix, albeit a regression was introduced.
>
> Regarding Intel Management Engine, only Intel knows what it does and what the error is, as the ME firmware is proprietary and closed.
>
> Lastly, there is no way to fully disable the Intel Management Engine. The HAP stuff claims to stop the Intel ME execution, but nobody knows, if itâs successful.
>
> Aaron, Kai-Hang, can you send the revert?
I consider that as an important fix for s2idle, I don't think reverting is appropriate.
Kai-Heng
>
>
> Kind regards,
>
> Paul
>
>