Currently, fair share from hctx_may_queue() requires two
atomic_read(active_queues and active_requests), I think this smoothing
method can be placed into get_tag fail path, for example, the more times
a disk failed to get tag in a period of time, the more tag this disk can
get, and all the information can be updated here(perhaps directly
record how many tags a disk can get, then hctx_may_queue() still only
require 2 atomic_read()).