Re: [PATCH v4 0/6] mm, drm/ttm, drm/xe: Avoid reclaim/eviction loops under fragmentation

From: Matthew Brost

Date: Fri May 01 2026 - 18:33:38 EST


On Fri, May 01, 2026 at 02:10:07PM -0700, Matthew Brost wrote:
> On Fri, May 01, 2026 at 01:05:57PM -0700, Kenneth Crudup wrote:
> >
> > On 5/1/26 13:00, Matthew Brost wrote:
> >
> > > So is this 7.1-rc1? It looks like a new feature for 7.1 added by Dave [1],
> > > and something looks off here. Thanks for pointing this out.
> >
> > Yeah. I grab his master branch daily (as of 6fe0be6dc7fa RN).
> >
> > Is this a "shoot the messenger" thing? IOW, is the reporting off, or is the
>
> I don't think I'm firing any shots.
>
> > memory usage really that high?
>
> I've been able to recreate this. It looks like the accounting is correct
> until the Xe shrinker runs. Every time it kicks in, GPUActive grows and
> will not drop below some new floor value. It looks like an accounting
> bug in TTM or Xe (?).
>
> Here is my output on an 8G PTL where I have intentionally triggered the
> shrinker to evict at least 23875 BOs (most likely quite a few more, but
> this is what I can easily see in dmesg) after closing everything on the
> desktop.
>
> cat /proc/meminfo | grep GPU; cat /proc/buddyinfo;
> GPUActive: 13100036 kB
> GPUReclaim: 152 kB
> Node 0, zone DMA 0 1 0 0 0 0 0 0 1 1 3
> Node 0, zone DMA32 2320 1882 1523 1238 980 740 482 275 114 88 205
> Node 0, zone Normal 9751 9343 6466 4237 2703 1162 805 420 191 145 289
>
> Let me spend a bit of time here to see if I can figure out where the
> accounting goes wrong.
>

Looks like a simple accounting error in the shrinking path. Here is a fix
[1] that seems to work for me.

If you want to give it a try, that would be helpful.
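
For anyone testing, here is a rough sanity check; it is only a sketch and
not part of the fix. It assumes that GPUActive counts system memory pages,
so a value larger than MemTotal (as in the numbers quoted below) suggests
the counter has drifted. The helper names parse_meminfo and
gpu_accounting_suspect are made up for illustration.

```python
import re


def parse_meminfo(text):
    """Return a dict mapping field name to its value in kB.

    Only simple "Name: <number> kB" lines are parsed; anything else
    (for example fields with parentheses in the name) is skipped.
    """
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^\s*(\w+):\s+(\d+)\s*kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def gpu_accounting_suspect(fields):
    """Heuristic (an assumption, not a kernel guarantee): report True
    when GPUActive is larger than MemTotal."""
    active = fields.get("GPUActive")
    total = fields.get("MemTotal")
    if active is None or total is None:
        return False
    return active > total


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print("GPUActive:", info.get("GPUActive"), "kB")
    print("suspect accounting:", gpu_accounting_suspect(info))
```

Running it before and after forcing the shrinker should show whether
GPUActive still settles at a higher floor with the fix applied.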

Matt

[1] https://patchwork.freedesktop.org/series/165862/

> Matt
>
> >
> > (BTW, those are in 30-second intervals)
> >
> > > > ----
> > > > SwapTotal: 33554428 kB
> > > > MemTotal: 32345672 kB
> > > > GPUActive: 652640 kB
> > > > GPUReclaim: 403988 kB
> > > >
> > > > SwapTotal: 33554428 kB
> > > > MemTotal: 32345672 kB
> > > > GPUActive: 651180 kB
> > > > GPUReclaim: 406812 kB
> > > >
> > > > SwapTotal: 33554428 kB
> > > > MemTotal: 32345672 kB
> > > > GPUActive: 659004 kB
> > > > GPUReclaim: 399396 kB
> > > >
> > > > SwapTotal: 33554428 kB
> > > > MemTotal: 32345672 kB
> > > > GPUActive: 666996 kB
> > > > GPUReclaim: 392764 kB
> > > >
> > > > <some hours later>
> > > > GPUActive: 91832468 kB
> > > > SwapTotal: 33554428 kB
> > > > MemTotal: 32345672 kB
> > > > GPUReclaim: 488000 kB
> > > >
> > > > GPUActive: 91832332 kB
> > > > SwapTotal: 33554428 kB
> > > > MemTotal: 32345672 kB
> > > > GPUReclaim: 487988 kB
> > > >
> > > > GPUActive: 91869376 kB
> > > > SwapTotal: 33554428 kB
> > > > MemTotal: 32345672 kB
> > > > GPUReclaim: 486504 kB
> > > > ----
> >
> > -K
> >
> > --
> > Kenneth R. Crudup / Sr. SW Engineer, Scott County Consulting, Orange County
> > CA
> >