Re: Observation of a memory leak with commit 314001f0bf92 ("af_unix: Add OOB support")
From: Jakub Kicinski
Date: Mon Jan 10 2022 - 11:29:55 EST
On Mon, 10 Jan 2022 17:19:56 +0100 Lukas Bulwahn wrote:
> It's a regression if some application or practical use case running fine on one
> Linux kernel works worse or not at all with a newer version compiled using a
> similar configuration.
>
> The af_unix functionality without oob support works before
> 314001f0bf92 ("af_unix: Add OOB support").
> The af_unix functionality without oob support works after 314001f0bf92
> ("af_unix: Add OOB support").
> With oob support in use, af_unix after 314001f0bf92 ("af_unix: Add OOB
> support") shows a memory leak; we do not know whether this feature
> actually triggers the leak or just makes it visible.
>
> Now, if we disable oob support we get a kernel without an observable
> memory leak. However, oob support is enabled by default, and this makes
> this memory leak visible. So, if oob support is turned into a
> non-default option or nobody ever made use of the oob support before,
> it really does not count as a regression at all. The oob support did not
> work before and now it works but just leaks a bit of memory---it is
> potentially a bug, but not a regression. Of course, maybe oob support
> is also just intended to make this memory leak observable, who knows?
> Then, it is not even a bug, but a feature.
I see, thanks for the explanation. It wasn't clear from the "does not
repro on 5.15, does repro on net-next" type of wording that the repro
actually uses the new functionality.
> Thorsten's database is still quite empty, so let us keep tracking the
> progress with regzbot. I guess we cannot mark "issues" in regzbot as a
> true regression or as a bug (an issue that appears with a new
> feature).
>
> Also, this reproducer is automatically generated, so it barely
> qualifies as "some application or practical use case", but at best as
> some derived "stress test program" or "micro benchmark".
>
> The syzbot CI and kernel CI databases are also planning to track such
> things (once all the databases and interfaces work smoothly), so in
> the long term, issues such as this one would not qualify for regzbot.
> For now, many things in these pipelines are still manual: triggering
> and investigation are manual effort, as is informing the involved
> developers. That means tracking remains manual effort too, for which
> regzbot is probably the right new tool for now.
Right, that was my thinking.
> We will learn what should go into regzbot's tracker and what should
> not---as we move on in the community: various information from other
> systems (syzbot, kernel CI, kernel test robot etc.) and their reports
> are also still difficult to add, find, track, bisect etc.