Re: [PATCH v3 5/5] x86/entry/64: Bypass enter_from_user_mode on non-context-tracking boots

From: Andy Lutomirski
Date: Mon Nov 16 2015 - 18:57:31 EST


On Mon, Nov 16, 2015 at 2:50 PM, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> On Mon, Nov 16, 2015 at 11:10:55AM -0800, Andy Lutomirski wrote:
>> On Nov 13, 2015 7:26 AM, "Frederic Weisbecker" <fweisbec@xxxxxxxxx> wrote:
>> >
>> > On Thu, Nov 12, 2015 at 12:59:04PM -0800, Andy Lutomirski wrote:
>> > > On CONFIG_CONTEXT_TRACKING kernels that have context tracking
>> > > disabled at runtime (which includes most distro kernels), we still
>> > > have the overhead of a call to enter_from_user_mode in interrupt and
>> > > exception entries.
>> > >
>> > > If jump labels are available, this uses the jump label
>> > > infrastructure to skip the call.
>> >
>> > Looks good. But why are we still calling context tracking code on IRQs at all?
>>
>> Same reasons as before:
>>
>> 1. This way the IRQ exit path is almost completely shared with all the
>> other exit paths.
>
> I'm all for consolidation in general. Unless it brings bad middle states.

The middle state works fine, though. With these patches, the middle
state should have essentially no performance hit compared to the
previous state in default configurations.

>
> If I knew before that I would have to argue endlessly in order to protest against
> these context tracking changes, I would have NACK'ed the x86 consolidation rework in
> the state it was while it got merged.
>
>>
>> 2. It combines the checks for which context we were in with what CPL
>> we entered from.
>>
>> Part 2 should be complete across the whole x86 kernel soon once the
>> 64-bit syscall code gets fixed up.
>>
>> We should get rid of the duplication in the irq entry hooks. Want to
>> help with that?
>
> Which one? The duplication against irq_enter() and irq_exit()?

Yes.

>
> I think that irq_exit() should be moved to the IRQ very end and perform the
> final signal/schedule/preempt_schedule_irq() loop. But it requires a bit of
> rework on all archs in order to do that. This could be done iteratively though.
>
>> Presumably we should do the massive remote polling speedup to the nohz code,
>
> Hmm, I don't get what you mean here.
>

Currently (4.4-rc1), when an IRQ hits user mode, here's roughly what we do:

- Tell context tracking that we're in the kernel
  - Switch ct state
  - Wake up RCU
  - Adjust vtime
- irq_enter
  - Adjust preempt count
  - Wake up RCU
  - Tell vtime accounting that we're in an IRQ

All of the initial stuff should be, in the long term, just a write to
some variable and a possible barrier. Whatever CPU is doing
housekeeping can poll to keep track of user vs system time. The
irq_enter stuff, in turn, could either set some variable telling the
housekeeper that we're in an IRQ or it could continue to directly
adjust time accounting.

In any event, all of this should be extremely fast, which it currently isn't.
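
To make the "write to some variable, let the housekeeper poll" idea
concrete, here is a rough userspace sketch using C11 atomics. It is only
an illustration under my own assumptions: none of these names (cpu_slot,
ctx_set, housekeeper_sample) exist in the kernel, and real code would use
per-cpu variables and the kernel's own barrier primitives.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical context states; illustrative names, not kernel identifiers. */
enum cpu_ctx { CTX_USER = 0, CTX_KERNEL = 1, CTX_IRQ = 2, CTX_NR = 3 };

struct cpu_slot {
	_Atomic int ctx;              /* written only by the owning CPU */
	unsigned long ticks[CTX_NR];  /* written only by the housekeeper */
};

/*
 * Entry/exit hook on the fast path: a single store. The release
 * ordering stands in for the "possible barrier".
 */
static inline void ctx_set(struct cpu_slot *s, enum cpu_ctx c)
{
	assert(c < CTX_NR);
	atomic_store_explicit(&s->ctx, c, memory_order_release);
}

/*
 * Housekeeping CPU: sample the remote CPU's state once per tick and
 * charge that tick to whatever context it was in.
 */
static inline void housekeeper_sample(struct cpu_slot *s)
{
	int c = atomic_load_explicit(&s->ctx, memory_order_acquire);

	s->ticks[c]++;
}
```

The point is that the CPU taking the interrupt never touches the
accounting counters; it only publishes its state, and sampling error is
the price paid for the cheap entry path.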

>> and we should also teach enter_from_user_mode to transition directly to IRQ state as
>> appropriate. Then irq_enter can be much faster.
>
> I don't get what you mean here either. You mean calling irq_enter() from enter_from_user_mode()?
>

No, I mean teaching irq_enter that, on x86 at least, we promise that
irq_enter is only ever called from CONTEXT_KERNEL so it can do less
redundant work.

Or, even better, we could fold the irq_enter and user->kernel hooks
into a single context tracking call.
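
Roughly, and again only as a userspace sketch with made-up names
(trk_write, irq_entry_two_step and irq_entry_folded are not kernel
functions), the difference between the two-hook entry and a folded
entry looks like this:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical tracking states; illustrative only. */
enum trk_state { TRK_USER, TRK_KERNEL, TRK_IRQ };

static _Atomic int trk_cur = TRK_USER;
static unsigned long trk_writes;	/* counts state transitions */

static inline void trk_write(enum trk_state next)
{
	atomic_store_explicit(&trk_cur, next, memory_order_release);
	trk_writes++;
}

/* Two hooks: a user->kernel transition, then an irq_enter-style one. */
static inline void irq_entry_two_step(void)
{
	trk_write(TRK_KERNEL);
	trk_write(TRK_IRQ);
}

/*
 * Folded hook: the caller promises which state we came from, so we go
 * straight to the IRQ state with a single transition.
 */
static inline void irq_entry_folded(int from_user)
{
	int expected = from_user ? TRK_USER : TRK_KERNEL;

	assert(atomic_load_explicit(&trk_cur, memory_order_relaxed) == expected);
	trk_write(TRK_IRQ);
}
```

The folded variant relies on the entry code knowing the previous state,
which is the same kind of promise as "irq_enter is only ever called from
CONTEXT_KERNEL".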

--Andy