Re: [RFC PATCH v3 for 4.15 08/24] Provide cpu_opv system call

From: Andy Lutomirski
Date: Sat Nov 18 2017 - 16:10:22 EST


On Fri, Nov 17, 2017 at 12:07 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Fri, 17 Nov 2017, Andi Kleen wrote:
>> > The most straightforward is to have a mechanism which forces everything
>> > into the slow path in case of debugging, lack of progress, etc. The slow
>>
>> That's the abort address, right?
>
> Yes.
>
>> For the generic case the fall back path would require disabling preemption
>> unfortunately, for which we don't have a mechanism in user space.
>>
>> I think that is what Mathieu tried to implement here with this call.
>
> Yes. Preempt-disabled execution of byte code, to make sure that the
> transaction succeeds.
>
> But, why is disabling preemption mandatory? If stuff fails due to hitting a
> breakpoint or because it retried a gazillion times without progress, then
> the abort code can detect that and act accordingly. Pseudo code:
>
> abort:
>       if (!slowpath_required() &&
>           !breakpoint_caused_abort() &&
>           !stall_detected()) {
>               do_the_normal_abort_postprocessing();
>               goto retry;
>       }
>
>       lock(slowpath_lock[cpu]);
>
>       if (!slowpath_required()) {
>               unlock(slowpath_lock[cpu]);
>               goto retry;
>       }
>
>       if (rseq_supported)
>               set_slow_path();
>
>       /* Same code as inside the actual rseq */
>       do_transaction();
>
>       if (rseq_supported)
>               unset_slow_path();
>
>       unlock(slowpath_lock[cpu]);

My objection to this approach is that people will get it wrong and not
notice until it's too late. TSX has two things going for it:

1. It's part of the ISA, so debuggers have very well-defined semantics
to deal with and debuggers will know about it. rseq is a made-up
Linux thing and debuggers may not know what to do with it.

2. TSX is slow and crappy, so it may not be that widely used. glibc,
OTOH, will probably start using rseq on all machines if the patches
are merged.
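
For concreteness, the abort-path fallback quoted above could look roughly
like the following in userspace C. This is only a sketch under loud
assumptions: every name here (percpu_state, abort_info, handle_abort,
NR_CPUS, STALL_LIMIT) is made up for illustration and is not part of the
rseq patches or of glibc; the per-CPU "lock" is a plain C11 atomic_flag
spinlock standing in for whatever lock a real library would use; the abort
cause is passed in by the caller rather than detected; and the quoted
re-check of slowpath_required() under the lock is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS     4   /* hypothetical fixed CPU count for this sketch */
#define STALL_LIMIT 8   /* hypothetical retry budget before declaring a stall */

/* Illustrative per-CPU state; not a kernel or glibc structure. */
struct percpu_state {
    atomic_flag slowpath_lock;  /* stand-in for slowpath_lock[cpu] */
    bool slow_path;             /* true while the slow path runs the transaction */
    long counter;               /* the data the transaction updates */
};

/* What the abort handler knows about why the fast path failed (assumed inputs). */
struct abort_info {
    bool breakpoint_hit;        /* corresponds to breakpoint_caused_abort() */
    unsigned retries;           /* used to approximate stall_detected() */
};

enum abort_action { RETRY_FAST_PATH, RAN_SLOW_PATH };

struct percpu_state state[NR_CPUS];
bool rseq_supported = true;

/* Put every per-CPU lock into a known clear state. */
void percpu_init(void)
{
    for (int i = 0; i < NR_CPUS; i++)
        atomic_flag_clear(&state[i].slowpath_lock);
}

/* Same work as the fast path would do inside the restartable sequence. */
void do_transaction(struct percpu_state *s)
{
    s->counter++;
}

/*
 * Decide between retrying the fast path and running the transaction
 * under the per-CPU lock. slowpath_forced models slowpath_required().
 */
enum abort_action handle_abort(int cpu, const struct abort_info *info,
                               bool slowpath_forced)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    struct percpu_state *s = &state[cpu];
    bool stalled = info->retries >= STALL_LIMIT;

    /* Ordinary abort: nothing suspicious, so just retry the fast path. */
    if (!slowpath_forced && !info->breakpoint_hit && !stalled)
        return RETRY_FAST_PATH;

    /* Spin until the per-CPU lock is acquired. */
    while (atomic_flag_test_and_set_explicit(&s->slowpath_lock,
                                             memory_order_acquire))
        ;

    if (rseq_supported)
        s->slow_path = true;    /* set_slow_path() */

    do_transaction(s);

    if (rseq_supported)
        s->slow_path = false;   /* unset_slow_path() */

    atomic_flag_clear_explicit(&s->slowpath_lock, memory_order_release);
    return RAN_SLOW_PATH;
}
```

A caller would invoke handle_abort() from its abort label and jump back to
the retry label on RETRY_FAST_PATH. Note how much of this every rseq user
would have to write and get right by hand, which is the concern above.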