Re: [PATCH 5.15 00/23] 5.15.160-rc1 review

From: Chuck Lever III
Date: Tue May 28 2024 - 19:34:48 EST




> On May 28, 2024, at 6:01 PM, NeilBrown <neilb@xxxxxxx> wrote:
>
> On Wed, 29 May 2024, Chuck Lever III wrote:
>>
>>
>>> On May 28, 2024, at 10:18 AM, Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
>>>
>>>
>>> On 28/05/2024 14:14, Chuck Lever III wrote:
>>>>> On May 28, 2024, at 5:04 AM, Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
>>>>>
>>>>>
>>>>> On 25/05/2024 15:20, Greg Kroah-Hartman wrote:
>>>>>> On Sat, May 25, 2024 at 12:13:28AM +0100, Jon Hunter wrote:
>>>>>>> Hi Greg,
>>>>>>>
>>>>>>> On 23/05/2024 14:12, Greg Kroah-Hartman wrote:
>>>>>>>> This is the start of the stable review cycle for the 5.15.160 release.
>>>>>>>> There are 23 patches in this series, all will be posted as a response
>>>>>>>> to this one. If anyone has any issues with these being applied, please
>>>>>>>> let me know.
>>>>>>>>
>>>>>>>> Responses should be made by Sat, 25 May 2024 13:03:15 +0000.
>>>>>>>> Anything received after that time might be too late.
>>>>>>>>
>>>>>>>> The whole patch series can be found in one patch at:
>>>>>>>> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.160-rc1.gz
>>>>>>>> or in the git tree and branch at:
>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
>>>>>>>> and the diffstat can be found below.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> greg k-h
>>>>>>>>
>>>>>>>> -------------
>>>>>>>> Pseudo-Shortlog of commits:
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>> NeilBrown <neilb@xxxxxxx>
>>>>>>>> nfsd: don't allow nfsd threads to be signalled.
>>>>>>>
>>>>>>>
>>>>>>> I am seeing a suspend regression on a couple boards and bisect is pointing
>>>>>>> to the above commit. Reverting this commit does fix the issue.
>>>>>> Ugh, that fixes the report from others. Can you cc: everyone on that
>>>>>> and figure out what is going on, as this keeps going back and forth...
>>>>>
>>>>>
>>>>> Adding Chuck, Neil and Chris from the bug report here [0].
>>>>>
>>>>> With the above applied to v5.15.y, I am seeing suspend on 2 of our boards fail. These boards are using NFS and on entry to suspend I am now seeing ...
>>>>>
>>>>> Freezing of tasks failed after 20.002 seconds (1 tasks refusing to
>>>>> freeze, wq_busy=0):
>>>>>
>>>>> The boards appear to hang at that point. So maybe something else is missing?
>>>> Note that we don't have access to hardware like this, so
>>>> we haven't tested that patch (even the upstream version)
>>>> with suspend on that hardware.
>>>
>>>
>>> No problem, I would not expect you to have this particular hardware :-)
>>>
>>>> So, it could be something missing, or it could be that
>>>> the patch has a problem.
>>>> It would help us to know if you observe the same issue
>>>> with an upstream kernel, if that is possible.
>>>
>>>
>>> I don't observe this with mainline, -next, or any other stable branch. So that would suggest that something else is missing from linux-5.15.y.
>>
>> That helps. It would be very helpful to have a reproducer I can
>> use to confirm we have a fix. I'm sure this will be a process
>> that involves a non-trivial number of iterations.
>
> Missing upstream patch is
>
> Commit 9bd4161c5917 ("SUNRPC: change service idle list to be an llist")
>
> This contains some freezer-related changes which probably should
> have been a separate patch.

Thanks for tracking that down.


> We probably just need to add "| TASK_FREEZABLE" in one or two places.
> I'll post a patch for testing in a little while.

My understanding is that the stable maintainers prefer backports
of patches that are already applied to Linus' tree.


--
Chuck Lever