Rather than adjusting the bitmap index, what about simply bumping the bitmap size?
IIRC, current CPUs have 512 ASIDs, counting ASID 0, i.e. bumping the size won't
consume any additional memory. And if it does, the cost is 8 bytes...
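
To sanity check the memory math, here's a rough userspace sketch. It assumes the bitmap is currently sized for max_asid bits and that the bump makes it max_asid + 1 bits; bitmap_bytes() and bump_cost_bytes() are made-up helpers that mimic the kernel's BITS_TO_LONGS() rounding, not actual kernel code.

```c
#include <limits.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's BITS_PER_LONG. */
#define US_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap of nbits, rounded up to whole longs. */
static size_t bitmap_bytes(size_t nbits)
{
	size_t longs = (nbits + US_BITS_PER_LONG - 1) / US_BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}

/*
 * Extra bytes consumed by growing the bitmap from max_asid bits to
 * max_asid + 1 bits (assumed sizing, see above).
 */
static size_t bump_cost_bytes(size_t max_asid)
{
	return bitmap_bytes(max_asid + 1) - bitmap_bytes(max_asid);
}
```

With max_asid = 511 (i.e. 512 ASIDs counting ASID 0, if I'm remembering the number correctly), both sizes round up to the same number of longs, so the bump is free. The worst case is when max_asid is already a multiple of the long size, in which case the bump costs exactly one extra long.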
It'd be a bigger refactoring, but it should completely eliminate the mod-by-1
shenanigans, e.g. a partial patch could look like