Re: 2.2.16 vm fixes

From: Andrea Arcangeli (andrea@suse.de)
Date: Thu Jun 15 2000 - 16:54:22 EST


On Thu, 15 Jun 2000, Marcelo Tosatti wrote:

>On Thu, 15 Jun 2000, Andrea Arcangeli wrote:
>
>> On Wed, 14 Jun 2000, Marcelo Tosatti wrote:
>>
>> >mmap002 on 2.2.15 gets killed. mmap002 on 2.2.15 + 2.2.16's
>> >thrashing heuristic runs fine. Try it.
>>
>> If you look at vmstat while mmap002 gets killed you'll notice the
>> machine is really out of memory, and the only complaint you can make
>> is that there's still a significant amount of _dirty_ cache in the
>> machine.
>>
>> The current 2.2.x vm (2.2.16 included) is not able to wait in any
>> way for the dirty buffers to get flushed to disk. _All_ the changes in
>> 2.2.16 are unrelated to that problem, and if they happen not to kill
>> mmap002 it's just by luck, or because the vm has become more aggressive
>> than it should be (and being more aggressive helps the case where we are
>> not able to write-throttle correctly).
>>
>> The free_before_allocate is necessary but it's not actually implemented
>> correctly. I implemented it as suggested in my email of yesterday to l-k
>> and everything works just fine here as far as I can tell. mtest -m 70 (on a
>> 128mbyte machine) from SCT works fine, as do my other swap testcases. Rik,
>> I guess your machine imploded because you were by mistake also increasing
>> free_before_allocate inside the atomic_read(&free_before_allocate) branch.
>>
>> I found that the changes in do_try_to_free_pages are buggy because one
>> task could make the cache freeable, and another task could then free all the
>> cache that has just been unmapped by the first task. The first task, the one
>> that made the cache freeable, will be killed because when it tries to free the
>> cache it won't succeed (even though the other task just freed all the cache
>> and thereby made enough memory free for both processes). That happened here,
>> and we have to count swap_out as progress too if we don't want to kill
>> innocent tasks as can instead happen now. I have an idea on how to fix
>> this properly while also dropping the free_before_allocate stuff, but it's too
>> intrusive to do in 2.2.x, and I believe we can live without problems with
>> free_before_allocate and by counting swap_out as progress.
>
>Quote from the email in which you suggested the free_before_allocate changes:
>
>        if (!(current->flags & PF_MEMALLOC)) {
>                int freed;
>                extern struct wait_queue * kswapd_wait;
>
>                /* Somebody needs to free pages so we free some of our own. */
>                if (atomic_read(&free_before_allocate)) {
>                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ do this check first
>                        current->flags |= PF_MEMALLOC;
>                        freed = try_to_free_pages(gfp_mask);
>                        current->flags &= ~PF_MEMALLOC;
>                        if (freed)
>                                goto ok_to_allocate;
>                }
>
>                if (nr_free_pages > freepages.high)
>                        goto ok_to_allocate;
>
>IMO, this change will not stop the problem of a process freeing
>pages while another one is stealing these freed pages (as you described
>above).

With the above change to the free_before_allocate logic, _all_ allocations
will first have to free memory before they're allowed to allocate whenever
somebody is blocked freeing memory. This way it can't happen anymore that
there are 20 tasks all blocked freeing memory while in the meantime somebody
eats all the free pages they are generating. It can't happen because that
"somebody" will first have to free ram for itself before it's allowed to
eat ram.
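
To make the whole path concrete, here's a rough sketch of the freeing side
that the quoted allocator check relies on. It's untested and simplified (in
the real 2.2 tree shrink_mmap() and swap_out() also take a priority argument
and the loop is more involved); the point is only where the counter gets
bumped and what has to count as progress:

        /* mm/vmscan.c -- untested sketch, not the actual 2.2.16 code */

        atomic_t free_before_allocate = ATOMIC_INIT(0);

        static int do_try_to_free_pages(unsigned int gfp_mask)
        {
                int progress = 0;

                /*
                 * Tell every allocator that somebody is blocked freeing
                 * memory: they must free pages themselves before they are
                 * allowed to eat the free pages we are about to generate.
                 */
                atomic_inc(&free_before_allocate);

                if (shrink_mmap(gfp_mask))
                        progress = 1;

                /*
                 * swap_out() unmaps pages and makes cache freeable; that
                 * must count as progress too, otherwise the task that did
                 * the work gets killed while another task reaps the pages
                 * it produced.
                 */
                if (swap_out(gfp_mask))
                        progress = 1;

                atomic_dec(&free_before_allocate);
                return progress;
        }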

>Could you please describe your idea which is too intrusive to go into 2.2?
>please?

Each process unmaps and frees pages, putting them in a per-task private
list, not in the freelist or in the global lru. Then it allocates from that
private list (assuming there's no fragmentation issue at least) and puts the
rest in the freelist. I'm not sure if it's worth it, since I guess the current
algorithm should do the trick just fine. OTOH the per-task private list of
freed pages looks like a robust design (and it has zero races).
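
Just to show the shape of that idea, a minimal sketch (the struct and helper
names are invented for illustration, it ignores zones and fragmentation, and
it assumes the page's list link can be reused for the private list):

        /* Untested sketch of the per-task private freed-page list. */

        struct local_pages {
                struct page * head;     /* pages this task freed for itself */
        };

        /*
         * swap_out()/shrink_mmap() would link the pages they free onto the
         * caller's private list instead of the global freelist/lru, so no
         * other task can steal them in the meantime.
         */
        static void local_free(struct local_pages * lp, struct page * page)
        {
                page->next = lp->head;  /* reuse the page's list link */
                lp->head = page;
        }

        /*
         * The allocator first satisfies its own request from the private
         * list, then returns whatever is left over to the global freelist.
         */
        static struct page * local_alloc(struct local_pages * lp)
        {
                struct page * page = lp->head;

                if (page)
                        lp->head = page->next;
                return page;
        }

The per-task ownership is what kills the race: the pages a task frees can't
be stolen before that task has allocated what it needs.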

Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/


