I don't feel strongly either way, but current->memcg_nr_pages_over_high can
be very large for large allocations.
That said, maybe we should just reclaim `max(SWAP_CLUSTER_MAX, current -
high)` on each loop iteration? I agree that with this design it looks like
we don't need it any more.
Johannes, what do you think?
How about this:
Reclaim memcg_nr_pages_over_high in the first iteration, then switch
to SWAP_CLUSTER_MAX in the retries.
This acknowledges that while the page allocator and memory.max reclaim
every time an allocation is made, memory.high is currently batched and
can have larger targets. We want the allocating thread to reclaim at
least the batch size, but beyond that only what's necessary to prevent
premature OOM or failing containment.
Add a comment stating as much.
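
For illustration, a rough user-space sketch of that loop, not the actual
patch. struct toy_memcg, toy_reclaim() and reclaim_over_high() are made-up
names, and the reclaim model is a stand-in that always succeeds. Only the
value of SWAP_CLUSTER_MAX (32 pages) is taken from the kernel.

```c
#include <assert.h>

#define SWAP_CLUSTER_MAX 32UL	/* same value as include/linux/swap.h */

/* Hypothetical stand-in for a memcg: page counts only. */
struct toy_memcg {
	unsigned long usage;	/* pages currently charged */
	unsigned long high;	/* memory.high, in pages */
};

/* Toy reclaim: frees up to nr_pages and reports how many were freed. */
static unsigned long toy_reclaim(struct toy_memcg *memcg,
				 unsigned long nr_pages)
{
	unsigned long freed;

	assert(memcg);
	freed = nr_pages < memcg->usage ? nr_pages : memcg->usage;
	memcg->usage -= freed;
	return freed;
}

static unsigned long reclaim_over_high(struct toy_memcg *memcg,
				       unsigned long nr_pages_over_high,
				       unsigned int max_retries)
{
	unsigned long target = nr_pages_over_high;
	unsigned long total = 0;
	unsigned int attempt;

	for (attempt = 0; attempt <= max_retries; attempt++) {
		/*
		 * memory.high is batched, so the allocating thread
		 * reclaims at least the batch size on the first pass.
		 * Retries only do what is necessary to prevent premature
		 * OOM or failing containment, so they fall back to
		 * SWAP_CLUSTER_MAX.
		 */
		total += toy_reclaim(memcg, target);
		if (memcg->usage <= memcg->high)
			break;
		target = SWAP_CLUSTER_MAX;
	}
	return total;
}
```

With usage at 1000 pages, high at 900 and a 64-page batch, this reclaims
64 pages on the first pass and then 32 on each of two retries before usage
drops below high.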
Once we reclaim memory.high synchronously instead of batched, this
exceptional handling is no longer needed and can be deleted again.
Does that sound reasonable?