Re: [RFC PATCH 0/4] tracing/probes: uaccess: Add support user-space access

From: Joel Fernandes
Date: Wed Feb 27 2019 - 16:34:05 EST


On Wed, Feb 27, 2019 at 04:41:04PM +0900, Masami Hiramatsu wrote:
> On Tue, 26 Feb 2019 16:38:50 -0500
> Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> > On Mon, Feb 25, 2019 at 11:04:42PM +0900, Masami Hiramatsu wrote:
>
> > > Note that kprobe event provides these methods, but it doesn't
> > > change it from kernel to user automatically because we do not
> > > know whether the given address is in userspace or kernel on
> > > some arch.
> > > Moreover, from perf-probe, at this moment it is not able to
> > > switch. Since __user is not for compiler but checker, we have
> > > no clue which data structure is in user-space, in debuginfo.
> > >
> > > BTW, according to Linus's comment, I implemented probe_user_read()
> > > and strncpy_from_unsafe_user() APIs. And since those use
> > > "access_ok()" inside it, if CONFIG_DEBUG_ATOMIC_SLEEP=y on x86,
> > > it will get a warn message at once. It should be solved before
> > > merging this series.
> >
> > I was wondering why access_ok() can sleep. In the arm64 and x86
> > implementation, I don't see access_ok() itself causing a user pointer
> > dereference access that can cause a page fault. It seems to just be checking
> > the validity of the ranges.
> >
> > Any idea why the access_ok() code has these comments?
> > "Context: User context only. This function may sleep if pagefaults are
> > enabled."
>
> Because access_ok() is used only for preparing accessing user-space,
> and the user-space access may cause page-fault and sleep.

Pedantically speaking, the access_ok() function itself in x86 implementation
does not sleep so the comment on the function saying "This function may
sleep" is odd to the code reader (and it confused someone else too so I'm not
the only one :)), but it could be that for some architectures it does sleep...

> IMHO, checking in access_ok() inside it is reasonable, but as it
> commented, it is for "if pagefaults are enabled.". What we need another
> access_ok() for the case when pagefaults are disabled, that is what PeterZ
> suggested in below mail.
>
> https://lore.kernel.org/lkml/20190225150603.GE32494@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#u

Makes sense, thanks.

- Joel


>
>
> Thank you,
>
> >
> > My _guess_ is this is because whatever calls access_ok() may also call
> > something else that *does* fault next, if that's the case then that
> > WARN_ON_IN_IRQ() in access_ok() is fine, but at least I guess the comments
> > should be more clear that it is not access_ok() itself that sleeps.
> >
> > thanks for any help on understanding this,
> >
> > - Joel
> >
> >
> > >
> > > Thank you,
> > >
> > > ---
> > >
> > > Masami Hiramatsu (4):
> > > uaccess: Make sure kernel_uaccess_faults_ok is updated before pagefault
> > > uaccess: Add non-pagefault user-space read functions
> > > tracing/probe: Add ustring type for user-space string
> > > tracing/probe: Support user-space dereference
> > >
> > >
> > > Documentation/trace/kprobetrace.rst | 13 ++-
> > > Documentation/trace/uprobetracer.rst | 9 +-
> > > fs/namespace.c | 2
> > > include/linux/uaccess.h | 13 +++
> > > kernel/trace/trace.c | 7 +-
> > > kernel/trace/trace_kprobe.c | 65 ++++++++++++++++
> > > kernel/trace/trace_probe.c | 39 ++++++++--
> > > kernel/trace/trace_probe.h | 3 +
> > > kernel/trace/trace_probe_tmpl.h | 36 +++++++--
> > > kernel/trace/trace_uprobe.c | 19 +++++
> > > mm/maccess.c | 138 ++++++++++++++++++++++++++++++----
> > > 11 files changed, 302 insertions(+), 42 deletions(-)
> > >
> > > --
> > > Masami Hiramatsu (Linaro) <mhiramat@xxxxxxxxxx>
>
>
> --
> Masami Hiramatsu <mhiramat@xxxxxxxxxx>