Yes, this will require some refactoring and one potential way is to make
mpol ref counting unconditional. The conditional ref. counting has
already caused issues in the past and the code is rather hard to follow
anyway. I am not really sure this optimization is worth it.
Another option would be to block the pidfd side of things on completion
which would wake it up from the task_work context but I would rather
explore the ref counting approach first and only if this is proven to be
too expensive to go with hacks like this.