Re: [RFC] Per file OOM badness

From: Christian König
Date: Fri Jan 19 2018 - 03:18:05 EST


On 19.01.2018 at 06:39, He, Roger wrote:
Basically, the idea looks right to me.

1. But we need finer granularity to control the contribution to OOM badness.
When a TTM buffer resides in VRAM rather than being evicted to system memory, we should not count it towards badness.
But I think that is not easy to implement.

I was considering that as well when I wrote the original patch set, but then decided against it at least for now.

Basically all VRAM buffers can be swapped to system memory, so they potentially need system memory as well. That is especially important during suspend/resume.


2. If the TTM buffer (GTT here) is mapped to user space for CPU access, I am not quite sure whether the kernel already takes the buffer size into account.
If it does, the size will end up being counted a second time by your patches.

No, that isn't accounted for as far as I know.


So, I am wondering whether we can count the TTM buffer size in:
struct mm_rss_stat {
atomic_long_t count[NR_MM_COUNTERS];
};
This is maintained by the kernel based on the CPU VM (page tables).

Something like this:
When a GTT allocation succeeds:
add_mm_counter(vma->vm_mm, MM_ANONPAGES, buffer_size);

When GTT is swapped out:
dec_mm_counter from MM_ANONPAGES first, then
add_mm_counter(vma->vm_mm, MM_SWAPENTS, buffer_size); // or MM_SHMEMPAGES or add new item.

Always update the corresponding item in mm_rss_stat.
If we do that, we can control the status updates accurately.
What do you think about that?
And are there any side effects to this approach?
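
The counter movement proposed above can be modelled in user space. This is only a sketch under stated assumptions: struct toy_mm and the toy_* helpers are hypothetical stand-ins, not kernel code, and sizes are expressed in pages. It shows the charge moving from the anonymous counter to the swap counter when a GTT buffer is swapped out.

```c
#include <assert.h>

/* Toy stand-ins for the kernel's per-mm counters; NOT kernel code. */
enum {
    MM_FILEPAGES,
    MM_ANONPAGES,
    MM_SWAPENTS,
    MM_SHMEMPAGES,
    NR_MM_COUNTERS
};

/* Hypothetical user-space model of the counters in struct mm_rss_stat. */
struct toy_mm {
    long count[NR_MM_COUNTERS];
};

/* Adjust one counter by a signed number of pages. */
static void toy_add_mm_counter(struct toy_mm *mm, int member, long value)
{
    assert(member >= 0 && member < NR_MM_COUNTERS);
    mm->count[member] += value;
}

/* GTT allocation succeeded: charge the pages as anonymous memory. */
static void toy_gtt_allocated(struct toy_mm *mm, long nr_pages)
{
    toy_add_mm_counter(mm, MM_ANONPAGES, nr_pages);
}

/* GTT buffer swapped out: move the charge from anon pages to swap entries. */
static void toy_gtt_swapped_out(struct toy_mm *mm, long nr_pages)
{
    toy_add_mm_counter(mm, MM_ANONPAGES, -nr_pages);
    toy_add_mm_counter(mm, MM_SWAPENTS, nr_pages);
}
```

Note that the model, like the proposal, needs an mm to charge, which is exactly the point the reply below takes issue with.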

I already tried this when I originally worked on the issue, and that approach didn't work because allocated buffers are not associated with the process in which they are created.

E.g. most display surfaces are created by the X server, but used by other processes. So if we account the BO to the process that created it, we would start to kill X again, and that is exactly what we are trying to avoid.
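
To make the attribution problem concrete, here is a small user-space sketch. Everything in it (struct toy_proc, struct toy_bo, the two badness helpers, the page numbers) is hypothetical and only illustrates the argument; the "charge the user" variant is an illustrative alternative, not a description of what the series implements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical process with some ordinary resident memory, in pages. */
struct toy_proc {
    const char *name;
    long rss_pages;
};

/* Hypothetical buffer object: created by one process, used by another. */
struct toy_bo {
    long nr_pages;
    const struct toy_proc *creator;
    const struct toy_proc *user;
};

/* Badness if every buffer is charged to the process that created it. */
static long badness_by_creator(const struct toy_proc *p,
                               const struct toy_bo *bos, size_t n)
{
    long pages;

    assert(p != NULL);
    pages = p->rss_pages;
    for (size_t i = 0; i < n; i++)
        if (bos[i].creator == p)
            pages += bos[i].nr_pages;
    return pages;
}

/* Badness if every buffer is charged to the process that uses it. */
static long badness_by_user(const struct toy_proc *p,
                            const struct toy_bo *bos, size_t n)
{
    long pages;

    assert(p != NULL);
    pages = p->rss_pages;
    for (size_t i = 0; i < n; i++)
        if (bos[i].user == p)
            pages += bos[i].nr_pages;
    return pages;
}
```

With one large surface created by the X server on behalf of a client, the creator-based score makes X the biggest target, while the user-based score points at the client instead.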

Regards,
Christian.



Thanks
Roger(Hongbo.He)

-----Original Message-----
From: dri-devel [mailto:dri-devel-bounces@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of Andrey Grodzovsky
Sent: Friday, January 19, 2018 12:48 AM
To: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Koenig, Christian <Christian.Koenig@xxxxxxx>
Subject: [RFC] Per file OOM badness

Hi, this series is a revised version of an RFC sent by Christian König a few years ago. The original RFC can be found at https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html

This is the same idea; I've just addressed his concern from the original RFC and switched to a callback in file_ops instead of a new member in struct file.
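
As a rough illustration of the callback approach, the following user-space sketch walks a set of open files and sums whatever each file's optional callback reports. All names here (struct toy_file_ops, oom_badness, toy_files_badness, struct toy_drm_priv) are assumptions made for the example and are not taken from the patches; the real hook name and signature may differ.

```c
#include <assert.h>
#include <stddef.h>

struct toy_file;

/* Hypothetical callback table, loosely modelled on file_operations. */
struct toy_file_ops {
    /* Extra pages this file pins; NULL if the file has nothing to add. */
    long (*oom_badness)(const struct toy_file *file);
};

struct toy_file {
    const struct toy_file_ops *f_op;
    void *private_data;
};

/* Sum the per-file contributions over a process's open files. */
static long toy_files_badness(struct toy_file *const *files, size_t n)
{
    long pages = 0;

    assert(files != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct toy_file *f = files[i];

        /* Skip empty slots and files without the optional callback. */
        if (f && f->f_op && f->f_op->oom_badness)
            pages += f->f_op->oom_badness(f);
    }
    return pages;
}

/* Hypothetical driver-private state: pages held in buffer objects. */
struct toy_drm_priv {
    long bo_pages;
};

/* Example driver callback reporting the pages held via this file. */
static long toy_drm_oom_badness(const struct toy_file *file)
{
    const struct toy_drm_priv *priv = file->private_data;

    return priv->bo_pages;
}

static const struct toy_file_ops toy_drm_fops = {
    .oom_badness = toy_drm_oom_badness,
};

/* An ordinary file type that does not implement the callback. */
static const struct toy_file_ops toy_plain_fops = {
    .oom_badness = NULL,
};
```

Keeping the hook in the ops table means files that do not care pay nothing beyond a NULL check, and struct file itself does not grow.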

Thanks,
Andrey
