Re: [PATCH] scripts/gdb: fix list_for_each
From: Kieran Bingham
Date: Wed Sep 23 2020 - 17:18:00 EST

On 23/09/2020 14:13, George Prekas wrote:
> Hi Kieran,
>
> On 9/22/2020 2:11 PM, Kieran Bingham wrote:
>> Hi George,
>>
>> On 22/09/2020 18:17, Prekas, George wrote:
>>>
>>> On 9/22/2020 9:32 AM, Jan Kiszka wrote:
>>>>
>>>> On 22.09.20 16:28, George Prekas wrote:
>>>>> If the next pointer is NULL, list_for_each gets stuck in an infinite
>>>>> loop.
>>>>>
>>>>> Signed-off-by: George Prekas <prekageo@xxxxxxxxxx>
>>>>> ---
>>>>> scripts/gdb/linux/lists.py | 2 ++
>>>>> 1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/scripts/gdb/linux/lists.py b/scripts/gdb/linux/lists.py
>>>>> index c487ddf09d38..424a91c1aa8b 100644
>>>>> --- a/scripts/gdb/linux/lists.py
>>>>> +++ b/scripts/gdb/linux/lists.py
>>>>> @@ -27,6 +27,8 @@ def list_for_each(head):
>>>>>          raise TypeError("Must be struct list_head not {}"
>>>>>                             .format(head.type))
>>>>>
>>>>> +    if head['next'] == 0:
>>>>> +        return
>>>>>      node = head['next'].dereference()
>>>>>      while node.address != head.address:
>>>>>          yield node.address
>>>>
>>>> Obviously, infinite loops are bad and should be avoided. But NULL is
>>>> a bug, isn't it? Shouldn't we report such a corruption?
>>>>
>>>
>>> Hi Jan,
>>>
>>> Is it a bug? Or does it mean that the list is empty?
>>
>> A correctly initialised (empty) list_head has its next and prev
>> pointers pointing to itself.
>>
>
> You are right, actually.
>
>>
>>> Let me give some background. If you do the following:
>>>
>>> $ qemu-system-x86_64 -nographic -m 1024 -kernel
>>> build/arch/x86/boot/bzImage -s -S < /dev/null > /dev/null &
>>> $ gdb -q build/vmlinux -ex 'target remote localhost:1234' -iex 'set
>>> auto-load safe-path /' -ex 'lx-symbols'
>>
>> I suspect this is trying to load modules before the kernel is actually
>> fully loaded and running, so nothing is yet initialised.
>>
>>
>>> You will see:
>>>
>>> loading vmlinux
>>> scanning for modules in /home/ubuntu/linux-5.8.10
>>> no module object found for ''
>>>
>>> And the last line repeats forever. This happens because modules.next ==
>>> NULL. This is the Python stack trace:
>>>
>>>[...]
>>>
>>> This patch tries to fix the above problem.
>>
>> Does it fix it for you?
>>
>> I expect it allows the boot process to continue, but the lx-symbols
>> command will not have completed successfully (or rather I expect it will
>> not have found anything to load).
>>
>> I suspect adding defensive checks in here might be helpful but I think
>> the reality is the code is being called at the wrong time.
>>
>> The fact that it 'can' be called at the wrong time is where we might
>> need to be more defensive.
>>
>
> At that point in time, the kernel has not even started so it does not
> have any loaded modules. In fact, as you said, the modules linked list
> is uninitialized. So with this patch, lx-symbols does not get stuck in
> an infinite loop and loads only the vmlinux symbols.
>
> Maybe I should rephrase the commit message to say that list_for_each
> gets stuck in an infinite loop on uninitialized linked lists.
>
> Do you think that list_for_each should handle uninitialized lists? If
> yes, how do you propose to handle them?
>
> 1. Treat them as empty lists (this patch).
> 2. Print a warning and treat them as empty lists.
> 3. Raise exception and treat them as empty lists.
>
> I would go with option 1. For traversal purposes an uninitialized list
> is the same as an empty list; it has no elements. I am happy, though, to
> change the patch to another option if you believe it would be better.

I would choose option 2 personally.

While debugging, if anyone hits an uninitialised linked list, that's a
problem they want to know about, not ignore.
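
To make option 2 concrete, here is a rough sketch of the behaviour I have
in mind. It is a plain-Python model, not the real scripts/gdb code: the
ListHead class, init_list_head() and list_add_tail() are hypothetical
stand-ins for gdb.Value objects and the kernel helpers, and None stands
in for a NULL pointer. It only illustrates "warn, then treat as empty".

```python
import warnings


class ListHead:
    """Hypothetical stand-in for a struct list_head (not a gdb.Value)."""

    def __init__(self, name):
        self.name = name
        self.next = None  # None models a NULL pointer (uninitialised)
        self.prev = None


def init_list_head(head):
    # Models INIT_LIST_HEAD(): an empty list points back at itself.
    head.next = head
    head.prev = head


def list_add_tail(new, head):
    # Insert 'new' just before 'head', i.e. at the end of the list.
    last = head.prev
    last.next = new
    new.prev = last
    new.next = head
    head.prev = new


def list_for_each(head):
    # Option 2: report the uninitialised head, then behave as if empty.
    if head.next is None:
        warnings.warn("list_head '{}' is uninitialised (next == NULL)"
                      .format(head.name))
        return
    node = head.next
    while node is not head:
        yield node
        node = node.next
```

With this, iterating an uninitialised head yields nothing but emits one
warning, while a properly initialised empty list stays silent. Note that
the sketch only checks the head; a NULL next pointer in the middle of a
list is not handled here.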
--
Kieran
> --
> George
>