But you are right, there are two different scenarios:
1) thread is already in another wake_q, and the wakeup happens immediately after the cmpxchg_relaxed().
This scenario is safe, due to the smp_mb__before_atomic() in wake_q_add()
2) thread is woken up by e.g. a timeout, sees ->state=STATE_READY, returns to user space, calls sys_exit.
This must not happen before get_task_struct acquired a reference.
And this appears to be unsafe: get_task_struct() is refcount_inc(), which is refcount_inc_checked(), which, according to lib/refcount.c, is fully unordered.
Thus: ->state=STATE_READY can execute before the refcount increase.
Thus: ->state=STATE_READY needs a smp_store_release(), correct?
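
To make the ordering requirement in scenario 2 concrete, here is a minimal userspace sketch in C11 atomics. It is not the kernel code: all model_* names are hypothetical, the relaxed fetch_add only stands in for the unordered refcount_inc(), and the release store stands in for smp_store_release(). It also assumes the waiter reads ->state with acquire semantics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { STATE_NONE = 0, STATE_READY = 1 };

/* Hypothetical stand-in for the part of task_struct that matters here. */
struct model_task {
	atomic_int usage;		/* models the task reference count */
};

/* Hypothetical stand-in for the sleeping waiter. */
struct model_waiter {
	struct model_task *task;
	atomic_int state;		/* models ->state */
};

/* Waker side: take the reference, then publish STATE_READY. */
static void model_wake(struct model_waiter *w)
{
	assert(w->task != NULL);

	/* Models get_task_struct(): an increment with no ordering of its own. */
	atomic_fetch_add_explicit(&w->task->usage, 1, memory_order_relaxed);

	/*
	 * Models smp_store_release(&w->state, STATE_READY): the release
	 * keeps the increment above from being reordered after this store.
	 */
	atomic_store_explicit(&w->state, STATE_READY, memory_order_release);
}

/*
 * Waiter side (assumption: the load of ->state is an acquire).
 * Returns the reference count it observes once it has seen STATE_READY,
 * or -1 if it has not been marked ready yet.
 */
static int model_wait_check(struct model_waiter *w)
{
	if (atomic_load_explicit(&w->state, memory_order_acquire) != STATE_READY)
		return -1;

	return atomic_load_explicit(&w->task->usage, memory_order_relaxed);
}
```

With a plain or relaxed store in model_wake(), nothing would forbid the waiter from seeing STATE_READY while the count is still at its old value, which is the window in which the task could exit. A single-threaded run cannot demonstrate the race itself; the sketch only shows where the release/acquire pair would sit.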