Re: [PATCH] mm/damon: fix stale TLB young-state handling on arm64

From: Kunwu Chan

Date: Sun May 31 2026 - 08:17:12 EST

Hi SJ,

[...]

> Thank you for clarifying!
>
> wss_estimation increases its working set size up to 160 MiB for this issue.
> It seems your test machine has a large TLB. I think we should decide the
> limit based on the real running system configuration and apply a similar
> approach to other tests, including the apply_interval one.
>
> For out-of-tree tests, we had better provide a guideline, too. E.g., run
> this sort of test program with this DAMON config to find a reliable test
> working set size on your setup.
>
Thanks for the detailed analysis.

We agree that improving the documentation and aligning the tests is a
better direction for now. Adding an optional feature mainly for testing
could confuse users and create additional maintenance burden.

We've already posted some selftest changes:
https://lore.kernel.org/damon/20260531085633.48626-1-kunwu.chan@xxxxxxxxx/

[...]
> >
> I was thinking about this again. I still want DAMON to be easy to test. But
> is this making tests that difficult? Users could increase the test working
> set size. I'm not sure that is difficult enough to justify adding a new
> optional feature. Meanwhile, adding an optional feature only for tests might
> confuse users. DAMON usage might also diverge and add maintenance burden.
>
> So, now I think another option is improving the documentation. It should
> clearly explain how and why DAMON does not flush the TLB, and what the
> expected problems (in tests) and recommendations are. With this option, we
> should also update existing DAMON tests to be reliable and aligned with the
> documented recommendations. If we find it is still a problem in testing even
> after applying the recommendations, or in production, we can revisit.
>
> Regardless of the decision about the optional feature in DAMON, I think such
> documentation and test improvements should be made.
>
> Maybe I'm biased, so any input would be appreciated. What do you think, Kunwu
> and Lian?
>
We think that makes sense. Explaining the rationale for not flushing TLB,
the limitations this can introduce for tests, and recommendations for
choosing reliable test working set sizes would be helpful.

If we later find cases where the documented recommendations are still
insufficient, we can revisit more intrusive approaches.

I'd be happy to help with the documentation work if needed. Please let
me know if you'd like me to prepare a draft patch.

Thanks,
Kunwu

> Thanks,
> SJ
>
> [...]
>