Re: [GIT PULL v6] hw-breakpoints: Rewrite on top of perf events v6

From: Frederic Weisbecker
Date: Tue Dec 01 2009 - 01:43:14 EST


On Sat, Nov 28, 2009 at 12:37:05AM +0530, K.Prasad wrote:
> I think the register_<> interfaces can become wrappers around functions
> that do the following:
>
> - arch_validate(): Validate request by invoking an arch-dependant
> routine. Proceed if returned valid.
> - arch-specific debugreg availability: Do something like
> if (arch_hw_breakpoint_availabile())
> bp = perf_event_create_kernel_counter();


The current state is settled for Bp Api clients
(perf_event_create_kernel_counter()) and perf clients (perf syscall)
to have the same endpoint, which is the arch_validate() + reg reservation.
This is already what is done. It's just done at the pmu level.

I don't understand your point.


> perf_event_create_kernel_counter()--->arch_install_hw_breakpoint();


But this is what is done, when perf_event_alloc() get the breakpoint
pmu.


> This way, all book-keeping related work (no. of pinned/flexible/per-cpu)
> will be moved to arch-specific files (will be helpful for PPC Book-E
> implementation having two types of debug registers). Every new
> architecture that intends to port to the new hw-breakpoint
> implementation must define their arch_validate(),
> arch_hw_breakpoint_available() and an arch_install_hw_breakpoint(),
> while the hw-breakpoint code will be flexible enough to extend itself to
> each of these archs.


We just need to move reserve_bp_slot() and release_bp_slot() in arch code
then.

But I would prefer to move more efforts in generalizing what can
generalized in the register reservation topic, and have the smallest
possible part in arch code.


> This implementation would be even superior (in terms of extensibility)
> to even the older hw-breakpoint layer implementation (despite it providing
> a working layer for x86 and PPC64).


The older breakpoint API was very tight to x86. It had a single linear
breakpoint refcounting that wasn't handling different natures of breakpoint
registers (different registers between instruction and data).
Neither was this linear refcounting handling the fact a single cpu could
share a single breakpoint register between hardware threads.

So, no I don't think it was providing a working layer for PPC64.

That said the current state of the constraints sucks :) as it is too much
tight to x86 too.

But I want to avoid the trap of moving all the constraint checks
to the arch. If possible, I would like to extract the arch specificity
only.
It's possible that in the case of PPC64, we want a totally different
constraint check, but what about other archs? Do they have very close
needs than x86 or another bunck of tricky constraints? In the former
case, I would prefer to have the current constraints defined as
__weak and let the tricky archs implement their own constraints.
That will only work if the _tricky_ arch are a minority of course.


> > > As pointed out in 20091111130207.GA5676@xxxxxxxxxx and
> > > 20091112042502.GA3145@xxxxxxxxxx, ptrace requests can a) lose register
> > > slots when modifying the breakpoint addresses and b) new implementation
> > > assumes that every DR7 write to be preceded by a write on DR0-DR3 which
> > > need not be true.
> >
> > The a) case is going to be fixed.
> > But the b) situation must be reported as a user mistake (which is what is
> > done currently): -EINVAL, -EIO or whatever. Enabling a breakpoint without
> > having given an address is a userland bug.
> >
>
> b) need not be a user mistake always (except perhaps the first time). As I
> mentioned here 20091112042502.GA3145@xxxxxxxxxx, DR7 enable/disable
> without a DR0-DR3 write can be done by the user through ptrace for
> optimising the number of write operations (and hence ptrace syscalls).


I really think this is a wrong workflow as the address register
is undefined.

I think this is a user bug. In the current upstream state, I guess
the addr debugreg are initialized to 0. So is this going to
set a breakpoint to 0?

I doubt there are much user app that rely on such buggy behaviour,
but I can probably support that by creating a temporary disabled
breakpoint in this case.



> Consider the following steps which is entirely valid (in mainline ptrace)
> but which would fail if assumed that a DR0-DR3 write precedes a DR7 write:
> i) Set address on DR0
> ii) Enable bits corresponding to DR0 in DR7
> iii) Disable DR0 bits in DR7
> iv) Re-enable DR0 bits in DR7


Agreed. I need to fix this. I thought you were talking about enabling
dr0 in dr7 without having ever set dr0.

Ok I'll fix this case.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/