Re: [PATCH] sched: schedule_raw_spin_unlock() and schedule_spin_unlock()
From: Kirill Tkhai
Date: Mon Jun 17 2013 - 12:18:51 EST
17.06.2013, 18:29, "Steven Rostedt" <rostedt@xxxxxxxxxxx>:
> On Fri, 2013-06-14 at 18:40 +0400, Kirill Tkhai wrote:
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 58453b8..381e493 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -3125,6 +3125,30 @@ asmlinkage void __sched preempt_schedule_irq(void)
>> exception_exit(prev_state);
>> }
>>
>> +/*
>> + * schedule_raw_spin_unlock - unlock raw_spinlock and call schedule()
>> + *
>> + * Should be used instead of the constructions
>> + * 1) raw_spin_unlock_irq(lock);
>> + * schedule();
>> + * or
>> + * 2) raw_spin_unlock_irqrestore(lock, flags);
>> + * schedule();
>
> Is there a place that does #2? If interrupts were disabled, the flags
> would keep them disabled and that would not be good when calling
> schedule.
Some drivers use the _irqsave variants even where interrupts are certainly
enabled; I was thinking of those. That is a mistake, but people do it.
I agree with you now, and I'll remove advice #2 so as not to multiply such errors.
>> + * where they have to be.
>> + *
>> + * It helps to prevent excess preempt_schedule() during the unlocking,
>> + * which can be called on preemptible kernel.
>> + * Returns with irqs enabled.
>> + */
>> +void __sched schedule_raw_spin_unlock(raw_spinlock_t *lock)
>> +{
>
> I agree with the idea of adding this, but I don't like this
> implementation. Also, if this is to enable interrupts, the name must
> represent that:
>
> schedule_raw_spin_unlock_irq()
>
> You can't just enable them if they were not disabled. That will break
> things like lockdep.
OK. In addition, a plain schedule_raw_spin_unlock() may also be useful
for cases like the ones in fs/*
>> + preempt_disable();
>> + raw_spin_unlock_irq(lock);
>> + sched_preempt_enable_no_resched();
>
> This is the easy way of implementing this, but it does add a slight
> overhead here. Adding and subtracting preempt count just to prevent the
> disable does add a bit more computation. I've done this in the tracing
> code, but it's within the tracing and in an unlikely path. The overhead
> is not common. But I can see this being in a fast path and not something
> that we want to add overhead to.
>
> The ideal solution is not the easy one. It is to introduce a real
> schedule_raw_spin_unlock() that is a copy of raw_spin_unlock() that does
> not call preempt_enable, but calls preempt_enable_no_resched() instead,
> and then does the schedule.
Ok
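
For reference, a toy user-space model of the three variants discussed above.
This is only a sketch, not kernel code: every model_* name and the counters
are made up for illustration, interrupts and the lock word itself are left
out, and "need_resched" just stands in for a wakeup that arrived while the
lock was held. It counts how often the scheduler is entered and how many
times the preempt counter is touched.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of this is the kernel implementation. */
static int preempt_count;	/* models the preempt counter */
static bool need_resched;	/* models a pending reschedule request */
static int schedule_calls;	/* times the "scheduler" was entered */
static int counter_ops;		/* times preempt_count was modified */

static void reset_model(void)
{
	preempt_count = 0;
	need_resched = false;
	schedule_calls = 0;
	counter_ops = 0;
}

static void model_schedule(void)
{
	schedule_calls++;
	need_resched = false;
}

static void model_preempt_disable(void)
{
	preempt_count++;
	counter_ops++;
}

/* Like preempt_enable(): reschedules once the count drops to zero. */
static void model_preempt_enable(void)
{
	preempt_count--;
	counter_ops++;
	if (preempt_count == 0 && need_resched)
		model_schedule();	/* stands in for preempt_schedule() */
}

/* Like preempt_enable_no_resched(): only drops the count. */
static void model_preempt_enable_no_resched(void)
{
	preempt_count--;
	counter_ops++;
}

/* Taking the lock disables preemption; a wakeup may arrive while it is held. */
static void model_lock(bool wakeup_pending)
{
	model_preempt_disable();
	need_resched = wakeup_pending;
}

/* Construction 1 from the comment: ordinary unlock, then schedule(). */
static int unlock_then_schedule(bool wakeup_pending)
{
	reset_model();
	model_lock(wakeup_pending);
	model_preempt_enable();		/* tail of the ordinary unlock */
	model_schedule();
	return schedule_calls;
}

/* The patch as posted: extra disable/enable pair around the unlock. */
static int posted_patch_variant(bool wakeup_pending)
{
	reset_model();
	model_lock(wakeup_pending);
	model_preempt_disable();
	model_preempt_enable();		/* count is still 1, so no reschedule */
	model_preempt_enable_no_resched();
	model_schedule();
	return schedule_calls;
}

/* The suggested variant: unlock that never checks for a reschedule. */
static int no_resched_unlock_variant(bool wakeup_pending)
{
	reset_model();
	model_lock(wakeup_pending);
	model_preempt_enable_no_resched();
	model_schedule();
	return schedule_calls;
}
```

In this model the ordinary unlock enters the scheduler twice when a wakeup is
pending, while both other variants enter it once; the posted variant touches
the counter four times and the suggested one only twice.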
> -- Steve
>
>> + schedule();
>> +}
>> +EXPORT_SYMBOL(schedule_raw_spin_unlock);
>> +
>> #endif /* CONFIG_PREEMPT */
>>
>> int default_wake_function(wait_queue_t *curr, unsigned mode, int wake_flags,