Re: [RFC V2] kthread: add object debug support
From: Thomas Gleixner
Date: Wed Aug 12 2020 - 06:11:02 EST
Qianli,
Qianli Zhao <zhaoqianligood@xxxxxxxxx> writes:
> Add debugobject support to track the life time of kthread_work
> which is used to detect reinitialization/free active object problems
> Add kthread_init_work_onstack/kthread_init_delayed_work_onstack for
> kthread onstack support
s/kthread/kthread_work/ ?
It would also be nice to have an example output in the changelog.
> +static struct debug_obj_descr kwork_debug_descr = {
> + .name = "kthread_work",
> + .debug_hint = kwork_debug_hint,
> + .is_static_object = kwork_is_static_object,
Nitpick. You nicely aligned all the initializers except of this
one. Just add another TAB to all of them and this becomes a perfect
table.
> + .fixup_init = kwork_fixup_init,
> + .fixup_free = kwork_fixup_free,
This lacks:
.fixup_assert_init = ....
which catches cases where a non static object is used uninitialized.
> @@ -698,6 +786,7 @@ int kthread_worker_fn(void *worker_ptr)
> work = list_first_entry(&worker->work_list,
> struct kthread_work, node);
> list_del_init(&work->node);
> + debug_kwork_deactivate(work);
Please move that before the list del. Deactivate debug cannot do much
about the wreckage as there is no fixup function, but at least you get
the message printed _before_ the list delete can cause havoc and crashes
something else on a different CPU which makes the whole point of being
able to debug this kind of problems moot.
> @@ -835,8 +924,11 @@ static void kthread_insert_work(struct kthread_worker *worker,
>
> list_add_tail(&work->node, pos);
> work->worker = worker;
> - if (!worker->current_work && likely(worker->task))
> +
> + if (!worker->current_work && likely(worker->task)) {
> + debug_kwork_activate(work);
That's misplaced as well. The work is activated with list_add_tail() and
you really want to call debug_kwork_activate() unconditionally before
doing the list operation:
1) If the object is active or not initialized then the list operation
will cause memory corruption and the fixup function will operate on
already wreckaged state. Debug objects is about preventing the
wreckaged state and keeping the system alive so the debug info goes
out.
2) If the worker is busy (worker->current_worker != NULL) then the
newly queued work is not activated in the debug object tracker and
the deactivation will complain or a consequent free of the object
will fail to detect that it's still active.
> /**
> @@ -1054,6 +1146,7 @@ static bool __kthread_cancel_work(struct kthread_work *work, bool is_dwork,
> */
> if (!list_empty(&work->node)) {
> list_del_init(&work->node);
> + debug_kwork_deactivate(work);
See above.
> return true;
> }
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 9ad9210..8355984 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -540,6 +540,16 @@ config DEBUG_OBJECTS_WORK
> work queue routines to track the life time of work objects and
> validate the work operations.
>
> +config DEBUG_OBJECTS_KTHREAD
> + bool "Debug kthread work objects"
This is about debugging kthread_work, so can you please name the
config option accordingly? It's not about debugging KTHREAD itself.
Thanks,
tglx