[RFC] Kernel access to Ftrace instances.

From: Divya Indi
Date: Fri Mar 15 2019 - 19:36:14 EST


[PATCH] tracing: Kernel access to ftrace instances.

Please review the patch that follows. Below are some details providing
the goal and justification for the patch.

=======================================================================

Goal:

Ftrace provides the "instances" feature, which makes it possible to
create multiple Ftrace ring buffers. However, these buffers can
currently be created and accessed only from userspace. The kernel APIs
behind these features are not exported and hence cannot be used by
other kernel components. We want to extend this infrastructure so that
these buffers can be created, logged to, and removed, and existing
trace events enabled or disabled on them, from within the kernel.


Justification:

1. We need module-specific/use-case specific ring buffers (apart
from the global trace buffer) to avoid overwrite by other components.
Hence, the need to use Ftrace "instances".

2. Flexibility to add additional logging to these module-specific
buffers via ksplice/live patch. This requires a trace_printk
counterpart for these additional ring buffers.

3. Certain issues and events are often best monitored from within
the kernel.

4. Time sensitivity - We need the capability to dynamically enable
and disable tracing from within the kernel to extract relevant
debugging info for the right time window.


Example:

When the kernel detects an unexpected event such as a connection drop
(e.g. RDS/NFS connection drops), we need the capability to enable
specific event tracing to capture relevant info during reconnect. This
event tracing will help us diagnose issues that occur during reconnect,
for example to root-cause (RCA) longer reconnect times. In such cases we
also want to disable the tracing at the right moment and capture a
snapshot from within the kernel to make sure we have the relevant
diagnostic data and nothing is overwritten or lost.


Note: The additional logging is not part of the kernel. We intend only
to provide the flexibility to add the logging as part of diagnostics
via ksplice/live-patch on an as-needed basis.


Below is the list of APIs to be introduced or exported as is.


We propose adding two new functions:

1. struct trace_array *trace_array_create(const char *name);
2. int trace_array_destroy(struct trace_array *tr);


In addition, we need to export the following existing functions:

3. int trace_array_printk(struct trace_array *tr, unsigned long ip,
                          const char *fmt, ...);
4. int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set);
5. void trace_printk_init_buffers(void);
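
For illustration only, the connection-drop scenario described earlier
could use these five functions roughly as sketched below. This is a
userspace mock-up, not kernel code: the function bodies are simple
stand-ins that only record what was requested, and the names conn_tr,
conn_trace_init(), on_connection_drop(), on_reconnect_done(), the
instance name "conn_debug" and the event string "my_subsys:my_event" are
all hypothetical. Calling trace_printk_init_buffers() once before the
first trace_array_printk() is an assumption about the intended ordering.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Mock trace_array: just enough state to observe the call sequence. */
struct trace_array {
	const char *name;
	int events_on;		/* 1 while the (mock) events are enabled */
	char log[128];		/* last message "logged" to this instance */
};

static struct trace_array mock_instance;

/* ---- Stand-ins for the proposed/exported APIs (not the real code) ---- */
struct trace_array *trace_array_create(const char *name)
{
	memset(&mock_instance, 0, sizeof(mock_instance));
	mock_instance.name = name;
	return &mock_instance;
}

int trace_array_destroy(struct trace_array *tr)
{
	tr->name = NULL;
	return 0;
}

void trace_printk_init_buffers(void)
{
	/* The real function prepares trace_printk buffers; no-op here. */
}

int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set)
{
	(void)buf;		/* the mock ignores which event is named */
	tr->events_on = set;
	return 0;
}

int trace_array_printk(struct trace_array *tr, unsigned long ip,
		       const char *fmt, ...)
{
	va_list ap;
	int len;

	(void)ip;		/* kernel callers would pass _THIS_IP_ */
	va_start(ap, fmt);
	len = vsnprintf(tr->log, sizeof(tr->log), fmt, ap);
	va_end(ap);
	return len;
}

/* ---- Hypothetical caller, e.g. added via live patch ---- */
static struct trace_array *conn_tr;

int conn_trace_init(void)
{
	conn_tr = trace_array_create("conn_debug");
	if (!conn_tr)
		return -1;
	trace_printk_init_buffers();
	return 0;
}

void on_connection_drop(int conn_id)
{
	char ev[] = "my_subsys:my_event";	/* hypothetical event */

	/* Start capturing as soon as the drop is detected. */
	ftrace_set_clr_event(conn_tr, ev, 1);
	trace_array_printk(conn_tr, 0, "conn %d dropped\n", conn_id);
}

void on_reconnect_done(int conn_id)
{
	char ev[] = "my_subsys:my_event";

	trace_array_printk(conn_tr, 0, "conn %d reconnected\n", conn_id);
	/* Stop tracing so the captured window is not overwritten. */
	ftrace_set_clr_event(conn_tr, ev, 0);
}
```

The point of the sketch is the ordering: the instance is created once,
events are switched on at the drop and off again after reconnect, so the
private buffer holds only the window of interest.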


To work around the redundancy introduced by the new APIs, we propose the
following restructuring:

1. Move the contents of instance_mkdir to the new API.
static int instance_mkdir(const char *name)
{
	return PTR_ERR_OR_ZERO(trace_array_create(name));
}

2. Introduce an internal static function: __remove_instance(struct trace_array *tr)
This will be nearly identical to the old instance_rmdir, which
identified the trace_array to be removed based on its name.

Modify the existing API to use the internal function:
static int instance_rmdir(const char *name)
{
	struct trace_array *tr;
	int err = -ENODEV;

	mutex_lock(&event_mutex);
	mutex_lock(&trace_types_lock);

	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
		if (tr->name && strcmp(tr->name, name) == 0) {
			err = __remove_instance(tr);
			break;
		}
	}

	mutex_unlock(&trace_types_lock);
	mutex_unlock(&event_mutex);

	return err;
}

New API to be exported:
int trace_array_destroy(struct trace_array *tr)
{
	int err;

	mutex_lock(&event_mutex);
	mutex_lock(&trace_types_lock);
	err = __remove_instance(tr);
	mutex_unlock(&trace_types_lock);
	mutex_unlock(&event_mutex);

	return err;
}
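
To make the intent of the restructuring above easier to check, here is a
small userspace model of it. It is a sketch under several assumptions,
not the patch itself: the ERR_PTR helpers are simplified re-creations of
the kernel ones, the instance list is a plain singly linked list, locking
is left out entirely, remove_instance() stands in for
__remove_instance(), and returning -EEXIST for a duplicate name is a
choice made for this model.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified userspace copies of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(intptr_t err) { return (void *)err; }
static inline intptr_t PTR_ERR(const void *p) { return (intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}
static inline int PTR_ERR_OR_ZERO(const void *p)
{
	return IS_ERR(p) ? (int)PTR_ERR(p) : 0;
}

struct trace_array {
	char name[32];
	struct trace_array *next;
};

/* Stand-in for ftrace_trace_arrays (locking omitted in this model). */
static struct trace_array *instances;

/* New API: returns the instance, or an error encoded in the pointer. */
struct trace_array *trace_array_create(const char *name)
{
	struct trace_array *tr;

	for (tr = instances; tr; tr = tr->next)
		if (strcmp(tr->name, name) == 0)
			return ERR_PTR(-EEXIST);

	tr = calloc(1, sizeof(*tr));
	if (!tr)
		return ERR_PTR(-ENOMEM);
	strncpy(tr->name, name, sizeof(tr->name) - 1);
	tr->next = instances;
	instances = tr;
	return tr;
}

/* Existing entry point becomes a thin wrapper around the new API. */
int instance_mkdir(const char *name)
{
	return PTR_ERR_OR_ZERO(trace_array_create(name));
}

/* Shared helper: unlink and free one instance identified by pointer. */
static int remove_instance(struct trace_array *tr)
{
	struct trace_array **pp;

	for (pp = &instances; *pp; pp = &(*pp)->next) {
		if (*pp == tr) {
			*pp = tr->next;
			free(tr);
			return 0;
		}
	}
	return -ENODEV;
}

/* Name-based removal: look the instance up, then reuse the helper. */
int instance_rmdir(const char *name)
{
	struct trace_array *tr;

	for (tr = instances; tr; tr = tr->next)
		if (strcmp(tr->name, name) == 0)
			return remove_instance(tr);
	return -ENODEV;
}

/* Pointer-based removal for in-kernel users: no lookup needed. */
int trace_array_destroy(struct trace_array *tr)
{
	return remove_instance(tr);
}
```

In this model both removal paths end in the same helper, so an instance
created through either entry point can be removed through either one,
which is the property the proposed split is meant to preserve.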

====================================================================================

Thanks,
Divya