Re: [RFC v3] debug: prevent entering debug mode on errors

From: Kiran Raparthy
Date: Mon Dec 08 2014 - 00:18:00 EST


Hi Jason,

On 1 December 2014 at 11:32, Kiran Raparthy <kiran.kumar@xxxxxxxxxx> wrote:
> Hi Jason,
>
> On 27 November 2014 at 15:19, Daniel Thompson
> <daniel.thompson@xxxxxxxxxx> wrote:
>> On 26/11/14 17:45, Colin Cross wrote:
>>> On Wed, Nov 26, 2014 at 1:14 AM, Kiran Raparthy <kiran.kumar@xxxxxxxxxx> wrote:
>>>> From: Colin Cross <ccross@xxxxxxxxxxx>
>>>>
>>>> debug: prevent entering debug mode on errors
>>>>
>>>> On non-developer devices kgdb prevents CONFIG_PANIC_TIMEOUT from rebooting the
>>>> device after a panic.
>>>>
>>>> In case of panics and exceptions, to honor CONFIG_PANIC_TIMEOUT, prevent
>>>> entering debug mode to avoid getting stuck waiting for the user to interact
>>>> with debugger.
>>>>
>>>> Cc: Jason Wessel <jason.wessel@xxxxxxxxxxxxx>
>>>> Cc: kgdb-bugreport@xxxxxxxxxxxxxxxxxxxxx
>>>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>>>> Cc: Android Kernel Team <kernel-team@xxxxxxxxxxx>
>>>> Cc: John Stultz <john.stultz@xxxxxxxxxx>
>>>> Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx>
>>>> Signed-off-by: Colin Cross <ccross@xxxxxxxxxxx>
>>>> [Kiran: Added context to commit message.
>>>> panic_timeout is used instead of break_on_panic and
>>>> break_on_exception to honor CONFIG_PANIC_TIMEOUT]
>>>> Signed-off-by: Kiran Raparthy <kiran.kumar@xxxxxxxxxx>
>>>> ---
>>>> kernel/debug/debug_core.c | 17 +++++++++++++++++
>>>> 1 file changed, 17 insertions(+)
>>>>
>>>> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
>>>> index 1adf62b..0012a1f 100644
>>>> --- a/kernel/debug/debug_core.c
>>>> +++ b/kernel/debug/debug_core.c
>>>> @@ -689,6 +689,14 @@ kgdb_handle_exception(int evector, int signo, int ecode, struct pt_regs *regs)
>>>>
>>>>  	if (arch_kgdb_ops.enable_nmi)
>>>>  		arch_kgdb_ops.enable_nmi(0);
>>>> +	/*
>>>> +	 * Avoid entering the debugger if we were triggered due to an oops
>>>> +	 * but panic_timeout indicates the system should automatically
>>>> +	 * reboot on panic. We don't want to get stuck waiting for input
>>>> +	 * on such systems, especially if it's "just" an oops.
>>>> +	 */
>>>> +	if (signo != SIGTRAP && panic_timeout)
>>>> +		return 1;
>>>>
>>>>  	memset(ks, 0, sizeof(struct kgdb_state));
>>>>  	ks->cpu = raw_smp_processor_id();
>>>> @@ -821,6 +829,15 @@ static int kgdb_panic_event(struct notifier_block *self,
>>>>  			    unsigned long val,
>>>>  			    void *data)
>>>>  {
>>>> +	/*
>>>> +	 * Avoid entering the debugger if we were triggered due to a panic.
>>>> +	 * We don't want to get stuck waiting for input from the user in
>>>> +	 * that case. panic_timeout indicates the system should automatically
>>>> +	 * reboot on panic.
>>>> +	 */
>>>> +	if (panic_timeout)
>>>> +		return NOTIFY_DONE;
>>>> +
>>>>  	if (dbg_kdb_mode)
>>>>  		kdb_printf("PANIC: %s\n", (char *)data);
>>>>  	kgdb_breakpoint();
>>>
>>> The original patch was more useful as it allowed re-enabling break on
>>> panic on specific devices where you were trying to debug a
>>> reproducible issue. What about using a module_param similar to
>>> kgdbreboot, but setting the default based on CONFIG_PANIC_TIMEOUT to
>>> avoid extra configuration?
>>
>> This change was due to my review so perhaps I'd better answer this...
>>
>> panic_timeout is the value of the panic sysctl. In addition to the
>> normal sysctl tooling (which I don't think is available on most android
>> systems), its value can be set using panic=0 on the kernel command line
>> or via /proc/sys/kernel/panic at runtime.
>>
>> CONFIG_PANIC_TIMEOUT merely sets the default value of the sysctl. I
>> guess perhaps the patch description could be improved to make this clearer.
>>
>> Therefore, the only loss of function I expected versus the original is
>> that it would be hard to get as far as a reproducible panic if the
>> system also has a ton of reproducible oopses that we don't want to fix.
>> Is such a use-case important?
>
> Could you please let me know if this patch is good to move from RFC to PATCH?

Just a gentle reminder.

Regards,
Kiran

> Regards,
> Kiran
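
For readers following the thread, the two checks added by the quoted patch
can be modelled in user space roughly as below. This is only an illustrative
sketch, not kernel code: the function names skip_debugger_on_exception() and
skip_debugger_on_panic() are hypothetical, and panic_timeout is passed in as
a parameter here instead of being read from the kernel variable of the same
name.

```c
#include <signal.h>
#include <stdbool.h>

/*
 * Hypothetical model of the check added to kgdb_handle_exception():
 * skip the debugger when the exception is not a breakpoint trap
 * (SIGTRAP) and a non-zero panic timeout asks for an automatic reboot.
 */
bool skip_debugger_on_exception(int signo, int panic_timeout)
{
	return signo != SIGTRAP && panic_timeout != 0;
}

/*
 * Hypothetical model of the check added to kgdb_panic_event():
 * skip the debugger on panic whenever a non-zero panic timeout is set.
 */
bool skip_debugger_on_panic(int panic_timeout)
{
	return panic_timeout != 0;
}
```

With panic_timeout set to 0 (for example via panic=0 on the kernel command
line, as Daniel describes above), both helpers return false, so the debugger
would still be entered. With any non-zero value, only a SIGTRAP exception
still reaches the debugger.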