Re: [RFC PATCH 1/6] x86/alternative: assert text_mutex is taken
From: Masami Hiramatsu
Date: Wed Aug 29 2018 - 22:27:22 EST
On Wed, 29 Aug 2018 14:00:06 -0700
Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
> On Wed, Aug 29, 2018 at 08:44:47PM +0000, Nadav Amit wrote:
> > at 1:13 PM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> > > On Wed, Aug 29, 2018 at 07:36:22PM +0000, Nadav Amit wrote:
> > >> at 10:11 AM, Nadav Amit <namit@xxxxxxxxxx> wrote:
> > >>
> > >>> at 1:59 AM, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> > >>>
> > >>>> On Wed, 29 Aug 2018 01:11:42 -0700
> > >>>> Nadav Amit <namit@xxxxxxxxxx> wrote:
> > >>>>
> > >>>>> Use lockdep to ensure that text_mutex is taken when text_poke() is
> > >>>>> called.
> > >>>>>
> > >>>>> Actually it is not always taken, specifically when it is called by kgdb,
> > >>>>> so take the lock in these cases.
> > >>>>
> > >>>> Can we really take a mutex in kgdb context?
> > >>>>
> > >>>> kgdb_arch_remove_breakpoint
> > >>>> <- dbg_deactivate_sw_breakpoints
> > >>>> <- kgdb_reenter_check
> > >>>> <- kgdb_handle_exception
> > >>>> <- __kgdb_notify
> > >>>> <- kgdb_ll_trap
> > >>>> <- do_int3
> > >>>> <- kgdb_notify
> > >>>> <- die notifier
> > >>>>
> > >>>> kgdb_arch_set_breakpoint
> > >>>> <- dbg_activate_sw_breakpoints
> > >>>> <- kgdb_reenter_check
> > >>>> <- kgdb_handle_exception
> > >>>> ...
> > >>>>
> > >>>> Both seems called in exception context, so we can not take a mutex lock.
> > >>>> I think kgdb needs a special path.
> > >>>
> > >>> You are correct, but I don’t want a special path. Presumably text_mutex is
> > >>> guaranteed not to be taken according to the code.
> > >>>
> > >>> So I guess the only concern is lockdep. Do you see any problem if I change
> > >>> mutex_lock() into mutex_trylock()? It should always succeed, and I can add a
> > >>> warning and a failure path if it fails for some reason.
> > >>
> > >> Err.. This will not work. I think I will drop this patch, since I cannot
> > >> find a proper yet simple assertion. Creating special path just for the
> > >> assertion seems wrong.
> > >
> > > It's probably worth expanding the comment for text_poke() to call out
> > > the kgdb case and reference kgdb_arch_{set,remove}_breakpoint(), whose
> > > code and comments make it explicitly clear why its safe for them to
> > > call text_poke() without acquiring the lock. Might prevent someone
> > > from going down this path again in the future.
> >
> > I thought that the whole point of the patch was to avoid comments, and
> > instead enforce the right behavior. I don’t understand well enough kgdb
> > code, so I cannot attest it does the right thing. What happens if
> > kgdb_do_roundup==0?
>
> As is, the comment is wrong because there are obviously cases where
> text_poke() is called without text_mutex being held. I can't attest
> to the kgdb code either. My thought was to document the exception so
> that if someone does want to try and enforce the right behavior they
> can dive right into the problem instead of having to learn of the kgdb
> gotcha the hard way. Maybe a FIXME is the right approach?

No, kgdb checks that text_mutex is not held right before calling
text_poke(), so it does take text_mutex into account. I guess
kgdb_arch_{set,remove}_breakpoint() are supposed to run under
special circumstances, such as with all other threads/cores stopped.
In that case, we can just check that text_mutex is not locked.

Anyway, kgdb is a very rare corner case. I think if CONFIG_KGDB is
enabled, lockdep and any assertion should be disabled, since kgdb
can tweak anything in the kernel in unexpected ways...
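
To illustrate the "just check that text_mutex is not locked" idea, here
is a minimal userspace sketch. It is not kernel code: a pthread mutex
stands in for text_mutex, and text_mutex_model, text_mutex_model_is_free()
and debugger_poke_model() are hypothetical names made up for this example.
In the kernel the equivalent probe would presumably be mutex_is_locked(),
which does not sleep and so is usable where mutex_lock() is not.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's text_mutex (hypothetical model). */
static pthread_mutex_t text_mutex_model = PTHREAD_MUTEX_INITIALIZER;

/*
 * Probe whether the mutex is currently free without ever blocking.
 * pthread_mutex_trylock() returns 0 on success and EBUSY if the mutex
 * is already locked; on success we release it again immediately.
 */
static bool text_mutex_model_is_free(void)
{
	if (pthread_mutex_trylock(&text_mutex_model) == 0) {
		pthread_mutex_unlock(&text_mutex_model);
		return true;
	}
	return false;
}

/*
 * Debugger-style patching path: it cannot sleep on the mutex, so it
 * only verifies that nobody else is in the middle of patching, and
 * backs off with -EBUSY otherwise. The caller is assumed to have
 * stopped all other threads, so the state cannot change after the check.
 */
static int debugger_poke_model(unsigned char *addr, unsigned char opcode)
{
	if (!text_mutex_model_is_free())
		return -EBUSY;

	*addr = opcode;
	return 0;
}
```

A caller would treat -EBUSY as "someone else is patching text, do not
touch it now". The check is only meaningful under the assumption that
everything else is stopped while the debugger runs; otherwise it is racy.
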
Thank you,
--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>