On Mon 30-11-15 18:02:33, Vlastimil Babka wrote:
[...]
> So the issue I see with simply renaming __GFP_REPEAT to __GFP_BEST_AFFORD
> and making it possible to fail for low orders, is that it will conflate the
> new failure possibility with the existing "try harder to reclaim before
> oom". As I mentioned before, "trying harder" could be also extended to mean
> something for compaction, but that would further muddle the meaning of the
> flag. Maybe the cleanest solution would be to have separate flags for
> "possible to fail" (let's say __GFP_MAYFAIL for now) and "try harder" (e.g.
> __GFP_TRY_HARDER)? And introduce two new higher-level "flags" of a GFP_*
> kind, that callers would use instead of GFP_KERNEL, where one would mean
> GFP_KERNEL|__GFP_MAYFAIL and the other
> GFP_KERNEL|__GFP_TRY_HARDER|__GFP_MAYFAIL.
I will think about that, but this sounds quite confusing to me. Basically
all allocations on behalf of a user process are MAYFAIL (e.g. the
OOM victim failure case) unless they are explicitly __GFP_NOFAIL. It
also sounds like ~__GFP_NOFAIL should imply MAYFAIL automatically.

__GFP_BEST_EFFORT, on the other hand, clearly states that the allocator
should try its best but can still fail. How it achieves that is
an implementation detail that users do not have to care about. In your
above hierarchy of QoS we have:
	- no reclaim ~__GFP_DIRECT_RECLAIM - optimistic allocation with a
	  fallback (e.g. smaller allocation request)
	- no destructive reclaim __GFP_NORETRY - allocation with a more
	  expensive fallback (e.g. vmalloc)
	- all reclaim types, but fail only if there is no good hope for
	  success __GFP_BEST_EFFORT (fail rather than invoke the OOM killer
	  a second time) - user allocations
	- no failure allowed __GFP_NOFAIL - failure mode is not acceptable
We can keep the current implicit "low order implies __GFP_NOFAIL" behavior
of GFP_KERNEL and still let users override it with __GFP_BEST_EFFORT.
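
To make the hierarchy above concrete, here is a rough userspace model of
how a request would map to one of the four levels. This is only a sketch,
not kernel code: the M_* bit values, classify() and may_fail() are made up
for illustration, __GFP_BEST_EFFORT is only a proposal at this point, and
treating a plain high-order request as best effort is my assumption for
the model, not a statement about what the allocator does today.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; these are NOT the kernel's gfp definitions. */
typedef unsigned int gfp_model_t;

#define M_DIRECT_RECLAIM 0x1u	/* models __GFP_DIRECT_RECLAIM */
#define M_NORETRY	 0x2u	/* models __GFP_NORETRY */
#define M_BEST_EFFORT	 0x4u	/* models the proposed __GFP_BEST_EFFORT */
#define M_NOFAIL	 0x8u	/* models __GFP_NOFAIL */

/* A GFP_KERNEL-like request: direct reclaim allowed, no modifiers. */
#define M_KERNEL	 M_DIRECT_RECLAIM

/* "Low order" threshold, mirroring PAGE_ALLOC_COSTLY_ORDER (3). */
#define M_COSTLY_ORDER	 3u

/* Sanity check: the four model bits must be distinct. */
static_assert((M_DIRECT_RECLAIM | M_NORETRY | M_BEST_EFFORT | M_NOFAIL) == 0xFu,
	      "model flag bits overlap");

enum qos_level {
	QOS_NO_RECLAIM,		/* ~__GFP_DIRECT_RECLAIM */
	QOS_NO_DESTRUCTIVE,	/* __GFP_NORETRY */
	QOS_BEST_EFFORT,	/* __GFP_BEST_EFFORT */
	QOS_NOFAIL		/* __GFP_NOFAIL, explicit or implied by low order */
};

/* Map a request to its level in the hierarchy. */
enum qos_level classify(gfp_model_t flags, unsigned int order)
{
	if (flags & M_NOFAIL)
		return QOS_NOFAIL;
	if (!(flags & M_DIRECT_RECLAIM))
		return QOS_NO_RECLAIM;
	if (flags & M_NORETRY)
		return QOS_NO_DESTRUCTIVE;
	if (flags & M_BEST_EFFORT)
		return QOS_BEST_EFFORT;	/* overrides the implicit nofail below */
	if (order <= M_COSTLY_ORDER)
		return QOS_NOFAIL;	/* implicit "low order implies nofail" */
	return QOS_BEST_EFFORT;		/* assumption for plain high-order requests */
}

/* Only the nofail level never returns failure to the caller. */
bool may_fail(gfp_model_t flags, unsigned int order)
{
	return classify(flags, order) != QOS_NOFAIL;
}
```

In this model may_fail(M_KERNEL, 0) is false while
may_fail(M_KERNEL | M_BEST_EFFORT, 0) is true, which is the override
described above.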
> The second thing to consider, is __GFP_NORETRY useful? The latency savings
> are quite vague. Maybe we could just remove this flag to make space for
> __GFP_MAYFAIL?
There are users who would like to see some reclaim but would rather fail
than see the OOM killer. I assume there are also users who can handle the
failure but for whom the OOM killer is not a big deal. I think that
GFP_USER is an example of the latter.