Re: [PATCH v1 1/1] binder: transaction latency tracking for user build
From: Joel Fernandes
Date: Thu Feb 06 2020 - 22:18:04 EST
On Fri, Feb 07, 2020 at 11:10:23AM +0800, Frankie Chang wrote:
> On Wed, 2020-02-05 at 10:49 -0500, Joel Fernandes wrote:
> > On Wed, Feb 05, 2020 at 09:36:12AM +0000, Greg Kroah-Hartman wrote:
> > > On Wed, Feb 05, 2020 at 02:52:52PM +0800, Frankie Chang wrote:
> > > > Record start/end timestamp to binder transaction.
> > > > When transaction is completed or transaction is free,
> > > > it would be checked if transaction latency over threshold (2 sec),
> > > > if yes, printing related information for tracing.
> > > >
> > > > Signed-off-by: Frankie Chang <Frankie.Chang@xxxxxxxxxxxx>
> > > > ---
> > > > drivers/android/Kconfig | 8 +++
> > > > drivers/android/binder.c | 107 +++++++++++++++++++++++++++++++++++++
> > > > drivers/android/binder_internal.h | 4 ++
> > > > 3 files changed, 119 insertions(+)
> > > >
> > > > diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
> > > > index 6fdf2ab..7ba80eb 100644
> > > > --- a/drivers/android/Kconfig
> > > > +++ b/drivers/android/Kconfig
> > > > @@ -54,6 +54,14 @@ config ANDROID_BINDER_IPC_SELFTEST
> > > > exhaustively with combinations of various buffer sizes and
> > > > alignments.
> > > >
> > > > +config BINDER_USER_TRACKING
> > > > + bool "Android Binder transaction tracking"
> > > > + help
> > > > + Used for track abnormal binder transaction which is over 2 seconds,
> > > > + when the transaction is done or be free, this transaction would be
> > > > + checked whether it executed overtime.
> > > > + If yes, printing out the detail info about it.
> > > > +
> > > > endif # if ANDROID
> > > >
> > > > endmenu
> > > > diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> > > > index e9bc9fc..5a352ee 100644
> > > > --- a/drivers/android/binder.c
> > > > +++ b/drivers/android/binder.c
> > > > @@ -76,6 +76,11 @@
> > > > #include "binder_internal.h"
> > > > #include "binder_trace.h"
> > > >
> > > > +#ifdef CONFIG_BINDER_USER_TRACKING
> > > > +#include <linux/rtc.h>
> > > > +#include <linux/time.h>
> > > > +#endif
> > > > +
> > > > static HLIST_HEAD(binder_deferred_list);
> > > > static DEFINE_MUTEX(binder_deferred_lock);
> > > >
> > > > @@ -591,8 +596,104 @@ struct binder_transaction {
> > > > * during thread teardown
> > > > */
> > > > spinlock_t lock;
> > > > +#ifdef CONFIG_BINDER_USER_TRACKING
> > > > + struct timespec timestamp;
> > > > + struct timeval tv;
> > > > +#endif
> > > > };
> > > >
> > > > +#ifdef CONFIG_BINDER_USER_TRACKING
> > > > +
> > > > +/*
> > > > + * binder_print_delay - Output info of a delay transaction
> > > > + * @t: pointer to the over-time transaction
> > > > + */
> > > > +static void binder_print_delay(struct binder_transaction *t)
> > > > +{
> > > > + struct rtc_time tm;
> > > > + struct timespec *startime;
> > > > + struct timespec cur, sub_t;
> > > > +
> > > > + ktime_get_ts(&cur);
> > > > + startime = &t->timestamp;
> > > > + sub_t = timespec_sub(cur, *startime);
> > > > +
> > > > + /* if transaction time is over than 2 sec,
> > > > + * show timeout warning log.
> > > > + */
> > > > + if (sub_t.tv_sec < 2)
> > > > + return;
> > > > +
> > > > + rtc_time_to_tm(t->tv.tv_sec, &tm);
> > > > +
> > > > + spin_lock(&t->lock);
> > > > + pr_info_ratelimited("%d: from %d:%d to %d:%d",
> > > > + t->debug_id,
> > > > + t->from ? t->from->proc->pid : 0,
> > > > + t->from ? t->from->pid : 0,
> > > > + t->to_proc ? t->to_proc->pid : 0,
> > > > + t->to_thread ? t->to_thread->pid : 0);
> > > > + spin_unlock(&t->lock);
> > > > +
> > > > + pr_info_ratelimited(" total %u.%03ld s code %u start %lu.%03ld android %d-%02d-%02d %02d:%02d:%02d.%03lu\n",
> > > > + (unsigned int)sub_t.tv_sec,
> > > > + (sub_t.tv_nsec / NSEC_PER_MSEC),
> > > > + t->code,
> > > > + (unsigned long)startime->tv_sec,
> > > > + (startime->tv_nsec / NSEC_PER_MSEC),
> > > > + (tm.tm_year + 1900), (tm.tm_mon + 1), tm.tm_mday,
> > > > + tm.tm_hour, tm.tm_min, tm.tm_sec,
> > > > + (unsigned long)(t->tv.tv_usec / USEC_PER_MSEC));
> > > > +}
> > >
> > > Ick, why not use a tracepoint for this instead?
> > >
> > > And what is userspace supposed to do with this if they see it?
> >
> > Or another option is to implement this separately outside of binder.c using
> > register_trace_* on the existing binder tracepoints, similar to what say the
> > block tracer or preempt-off tracers do. Call it, say, "binder-latency tracer".
> >
> > That way all of this tracing code is in-kernel but outside of binder.c.
> >
> > thanks,
> >
> > - Joel
> >
> Time limitation of recording is the reason why we don't use tracepoint.
> In some situations, the exception is caused by a series of transactions
> interaction.
> Some abnormal transactions may be pending for a long time ago, they
> could not be recorded due to buffer limited.
register_trace_* does not use the trace buffer so I am not sure what you
mean. I am asking you to use tracepoints, not ftrace events.
thanks,
- Joel