Re: [PATCH v4] [RFC] trace: Add kprobe on tracepoint

From: Masami Hiramatsu
Date: Thu Aug 12 2021 - 05:44:34 EST


On Wed, 11 Aug 2021 23:46:48 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> On Thu, 12 Aug 2021 10:27:35 +0900
> Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>
> > Let me confirm this: so eprobes can be attached to a synthetic event?
> > IMHO, I would rather prevent attaching an eprobe_event to other
> > dynamic events. It makes it hard to check when removing the base
> > dynamic events...
> >
> > For the above example, we can rewrite it as below to trace the filename
> > without attaching eprobe_events to the synthetic event.
> >
> > echo 'my_open pid_t pid; char file[]' > synthetic_events
> >
> > echo 'e:myopen syscalls.sys_enter_open file=+0($filename):ustring' > dynamic_events
> > echo 'e:myopen_ret syscalls.sys_exit_open ret=$ret' > dynamic_events
> >
> > echo 'hist:keys=common_pid:fname=file' > events/eprobes/myopen/trigger
> > echo 'hist:keys=common_pid:fname=$fname:onmatch(eprobes.myopen).trace(my_open,common_pid,$fname)' > events/eprobes/myopen_ret/trigger
> >
>
> The problem is that the above won't work :-(
>
> For example, I can use this program:
>
> #include <stdio.h>
> #include <unistd.h>
> #include <fcntl.h>
> #include <sys/types.h>
>
> static const char *file = "/etc/passwd";
>
> int main (int argc, char **argv)
> {
> 	int fd;
>
> 	fd = open(file, O_RDONLY);
> 	if (fd < 0)
> 		perror(file);
> 	close(fd);
> 	return 0;
> }
>
> If you do the above, all you'll get from myopen is "(null)".
>
> That's because "/etc/passwd" is not paged in at the start of the
> system call, and tracepoints can not fault. The memory that "ustring"
> reads is not mapped yet, so it can not give you the content of the
> file pointer. This was the entire reason we are working on eprobes
> that attach to synthetic events in the first place.

I think that is another limitation. If you run this program,

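#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

#define BUFSIZE 256	/* not given in the original; any size that fits the path works */

static const char *file = "/etc/passwd";

int main (int argc, char **argv)
{
	char buf[BUFSIZE];
	int fd;

	/* strlcpy() needs a BSD libc or glibc 2.38+ */
	strlcpy(buf, file, BUFSIZE);
	fd = open(buf, O_RDONLY);
	if (fd < 0)
		perror(file);
	read(fd, buf, BUFSIZE);	/* overwrites the filename in buf */
	close(fd);
	return 0;
}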

you won't see any filename from "myopen_ret" or the synthetic event.
Thus, the user-space page fault must be handled in some other way (e.g.
by making a special worker thread and running it before the task returns
to user space).
Using an eprobe on a synthetic event does not solve the root cause (and
it can introduce another issue).

Thank you,

>
> The trick is to use the synthetic event to pass the filename pointer to
> the exit of the system call. The system call itself maps the page that
> the pointer refers to, so when the eprobe reads it with ":ustring" at
> the exit of the system call, it gets "/etc/passwd" instead of "(null)".
>
> Your above example doesn't fix this.
>
> -- Steve


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>