Hi Austin,
Overcommit got introduced because of these, not the other way around. It's not forcing them to change, but it's also a core concept in any modern virtual-memory-based OS, and that's not ever going to change either.
On 05/13/2016 04:14 PM, Austin S. Hemmelgarn wrote:
On 2016-05-13 09:34, Sebastian Frias wrote:
Hi Austin,
In theory, that's a great idea. In practice though, it only works if:
On 05/13/2016 03:11 PM, Austin S. Hemmelgarn wrote:
On 2016-05-13 08:39, Sebastian Frias wrote:
There's an option for the OOM-killer to just kill the allocating task instead of using the scoring heuristic. This is about as deterministic as things can get though.
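For reference, a minimal sketch of how one might check that setting from a program. It assumes the option meant here is the `vm.oom_kill_allocating_task` sysctl, exposed on Linux as `/proc/sys/vm/oom_kill_allocating_task`; the helper name `read_vm_knob` is made up for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a single integer from a /proc/sys/vm file.
 * Returns 0 and stores the value in *out on success, or -1 if the file
 * cannot be opened or does not start with an integer.
 */
int read_vm_knob(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int parsed = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return parsed ? 0 : -1;
}
```

Calling `read_vm_knob("/proc/sys/vm/oom_kill_allocating_task", &value)` and getting a non-zero `value` would mean the OOM-killer targets the allocating task rather than using the scoring heuristic.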
My point is that it seems to be possible to deal with such conditions in a more controlled way, i.e., a way that is less random and less abrupt.
By the way, why does it have to "kill" anything in that case?
I mean, shouldn't it just tell the allocating task that there's not enough memory by letting malloc return NULL?
1. The allocating task correctly handles malloc() (or whatever other function it uses) returning NULL, which a number of programs don't.
2. The task actually has fallback options for memory limits. Many programs that do handle getting a NULL pointer from malloc() handle it by exiting anyway, so there's not as much value in this case.
3. There isn't a memory leak somewhere on the system. Killing the allocating task doesn't help much if this is the case of course.
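To make points 1 and 2 concrete, here is a rough sketch of what "handling NULL with a fallback" could look like. The function `alloc_with_fallback` and its halving strategy are purely illustrative assumptions, not something taken from an existing program. Keep in mind that with overcommit enabled, the NULL branch may rarely be reached for ordinary request sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: try to allocate `preferred` bytes, halving the
 * request after each failure and stopping once the next attempt would
 * drop below `minimum`.
 * On success, returns the buffer and stores its size in *granted.
 * On failure, returns NULL and sets *granted to 0.
 */
void *alloc_with_fallback(size_t preferred, size_t minimum, size_t *granted)
{
    assert(granted != NULL);
    assert(minimum > 0 && minimum <= preferred);

    for (size_t size = preferred; size >= minimum; size /= 2) {
        void *buf = malloc(size);
        if (buf != NULL) {
            *granted = size;   /* caller adapts to what it actually got */
            return buf;
        }
    }

    *granted = 0;
    return NULL;               /* caller must still cope with this case */
}
```

A program that simply exits when this returns NULL is in the situation described in point 2: it handles the error, but gains little over being killed.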
Well, the thing is that the current behaviour, i.e. overcommitting, does not improve the quality of those programs.
I mean, what incentive do they have to properly handle situations 1, 2?
If the memory leak is in the kernel, then yes, the OOM killer won't help, period. But if the memory leak is in userspace, and the OOM killer kills the task with the leak (which it usually will if you don't have it set to kill the allocating task), then it may have just saved the system from crashing completely. Yes, some users may lose some unsaved work, but they would lose that data anyway if the system crashes, and they can probably still use the rest of the system.
Also, if there's a memory leak, the termination of any task, whether it is the allocating task or something random, does not help either; the system will eventually go down, right?
Because of overcommit, it's possible for the allocation to succeed, but the subsequent access to fail. At that point, you're way past malloc() returning, and you have to do something.
You have to keep in mind though, that on a properly provisioned system, the only situations where the OOM killer should be invoked are when there's a memory leak, or when someone is intentionally trying to DoS the system through memory exhaustion.
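A small sketch of the "succeeds at malloc(), fails at access" gap, under the assumption that pages are only backed by physical memory when first written. The helper `touch_pages` is a made-up name; it writes one byte at page-sized strides, which is the step where an overcommitted system can run out of memory even though malloc() already returned non-NULL.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one byte at every page-sized stride of buf,
 * forcing the kernel to actually back the memory. Returns the number of
 * writes performed. Under overcommit, it is during these writes (not at
 * malloc() time) that memory exhaustion can show up.
 */
size_t touch_pages(char *buf, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t step = (page > 0) ? (size_t)page : 4096; /* assumed fallback size */
    size_t touched = 0;

    for (size_t off = 0; off < len; off += step) {
        ((volatile char *)buf)[off] = 1;
        touched++;
    }
    return touched;
}
```

There is no return value to check inside the loop: if the kernel cannot supply a page at that point, the program has no NULL to test, which is why something has to be killed.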
Exactly, the DoS attack is another reason why the OOM-killer does not seem like a good idea, at least compared to just letting malloc return NULL and letting the program fail.