I'm also not sure if the bug ever happens with preemption disabled.
Sasha, was that you who reported that you cannot reproduce it without
preemption? It strikes me that there's a race condition in
__cond_resched() wrt preemption, for example: we do
__preempt_count_add(PREEMPT_ACTIVE);
__schedule();
__preempt_count_sub(PREEMPT_ACTIVE);
and in between the __schedule() and __preempt_count_sub(), if an
interrupt comes in and wakes up some important process, it won't
reschedule (because preemption is active), but then we enable
preemption again and don't check whether we should reschedule (again),
and we just go on our merry ways.
Now, I don't see how that could really matter for a long time -
returning to user space will check need_resched, and sleeping will
obviously force a reschedule anyway, so these kinds of races should at
most delay things by just a tiny amount,