Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault

From: Masami Hiramatsu
Date: Mon Feb 25 2019 - 03:09:53 EST


On Sun, 24 Feb 2019 20:49:45 -0800
Andy Lutomirski <luto@xxxxxxxxxx> wrote:

> On Sun, Feb 24, 2019 at 6:40 PM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> >
> > On Sun, 24 Feb 2019 09:26:45 -0800
> > Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Sun, Feb 24, 2019 at 7:18 AM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> > > >
> > > > On Sat, 23 Feb 2019 20:38:03 -0800
> > > > Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > > > >
> > > > > Can we just get rid of this might_sleep()? access_ok() doesn't sleep
> > > > > as far as I know.
> > > >
> > > > Hmm, which might_sleep() would you pointed? What I talked was a
> > > > WARN_ON_ONCE(!in_task()) in access_ok() on x86 (only!), and in_task() just
> > > > checks preempt_count.
> > >
> > > So the in_task() check does kind of make sense. Using "access_ok()"
> > > outside of task context is certainly an odd thing, for several
> > > reasons. The main one being simply that outside of task context, the
> > > whole "which task" question is open, and you don't know if the task is
> > > the active one, and so it's not clear if whatever task you interrupt
> > > might have done "set_fs()" or not.
> >
> > Ah I got it. Usual case access_ok() in IRQ handler is strange.
> >
> > >
> > > So PeterZ isn't wrong:
> > >
> > > > I guess PeterZ assumed that access_ok() is used only with user space access
> > > > APIs (e.g. copy_from_user) which can cause page-fault and locks mm (and might
> > > > sleep :)), but now we are trying to use access_ok() with new functions which
> > > > disables page-fault and just return -EFAULT.
> > >
> > > .. but in this case, if we do it all *within* code that saves and
> > > restores the user access flag with get_fs/set_fs, access_ok() would be
> > > ok and it doesn't have the above issue.
> > >
> > > So access_ok() in _general_ is absolutely not safe to do from
> > > interrupts, but within the context of probing user memory from a
> > > tracing event it just happens to be ok.
> >
> > Hmm, but user can specify user-memory access from the tracing event
> > which is located in interrupt handler. So I understand that it is safe
> > only if we correctly setup access flag with get_fs/set_fs, is that
> > correct?
> >
> > > It would be lovely to have a special macro for this, and keep the
> > > warning for the general case, but because this is a "every
> > > architecture needs to build their own" it's probably too painful.
> >
> > Agreed.
>
> This should probably go with whatever effort makes nmi_uaccess_ok()
> available on all architectures. That being said, how about just
> making copy_from_user_nmi() work on all architectures, even if it just
> fails unconditionally on some of them?

I think even if we have copy_from_user_nmi(), we need something like
nmi_uaccess_ok() because without it we can not correctly use
__copy_from_user_inatomic()...


Thank you,

--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>