Re: Time from regression report to a merge of a fix (was Re: [git pull] drm fixes for 7.0-rc1)

From: Dave Airlie

Date: Tue Feb 24 2026 - 02:06:06 EST


On Tue, 24 Feb 2026 at 16:50, Dave Airlie <airlied@xxxxxxxxx> wrote:
>
> On Mon, 23 Feb 2026 at 22:52, Thorsten Leemhuis
> <regressions@xxxxxxxxxxxxx> wrote:
> >
> > Lo!
> >
> > On 2/20/26 21:53, Dave Airlie wrote:
> > >
> > > This is the fixes and cleanups for the end of the merge window, it's
> > > nearly all amdgpu, with some amdkfd, then a pagemap core fix, i915/xe
> > > display fixes, and some xe driver fixes. Nothing seems out of the
> > > ordinary, except amdgpu is a little more volume than usual.
> > >
> > > Let me know if there are any issues,
> >
> > Well, there were two fixes in here that made me wonder if our processes
> > need some optimization to get regressions fixed at least somewhat as
> > fast as Linus wants them to be fixed[1]:
> >
> > * One fix in here was for a amdgpu regression introduced in v6.19-rc6
> > (and also affecting many stable series due to backports). The fix was
> > ready within ~2 days and could even have made v6.19 -- but it only
> > reached mainline through this PR on Friday. IOW: After two weeks. Which
> > got me wondering, "Should we do something to merge fixes like that
> > faster"? And yes, it's the merge window – but that's also when Arch
> > Linux and openSUSE Tumbleweed usually jump to the latest mainline series
> > and thus expose regressions like this to many users, so I guess it would
> > be good to get them fixed at least as fast as outside of merge windows.

I think that, given the patch pipeline depth and volume that amdgpu and
i915/xe are dealing with, we may need to consider some better
regression revert pipelines.

The problem is most patches get fed into the start of their -next
pipelines, where CI etc picks up on them, but there isn't enough
urgency to create separate trees or pulls outside the regular fixes
ones.

The amdgpu one definitely should have been fixed in 6.19. Alex, any
idea how we can alleviate that sort of problem, esp. if a bug has
multiple regression reports?

> >
> > * One fix in here was for a i915/xe regression introduced in v6.18-rc1.
> > Once reported, it took about six weeks to get fixed – and then nearly 10
> > days for the fix to reach mainline. Looking at this, I once more
> > wondered if this could have been merged faster. But even more I wondered
> > why the culprit wasn't reverted, as that's what Linus afaics wants when
> > it takes this long.

I think for the i915 one the problem patch should have been reverted
asap, but I just don't think there was a responsible person to do it.
Maintainers need to be in the loop for these sorts of problems, but if
they aren't and the regression sits in the bug tracker without them
being looped in on it, we rely on the reporter or developer to find it
and do the right thing. This is especially so when developers are head
down on fixing it but the fix doesn't get flagged as urgent once it
goes into the -next pipeline and so on.

We don't usually break the weekly fixes cycle for drm, because with CI
backlogs and other things it just fits nicely. If we do have
regressions like this, it might be that we need to start having urgent
PRs out of cycle, which I don't object to. It's just a matter of
whether maintainers can have this sort of insight into patches in the
pipeline when there is quite a long backlog.

Thanks for the detailed analysis. I've cc'ed some more Intel people.

Dave.