Re: [PATCH] mm/damon/core: avoid use of half-online-committed context

From: SeongJae Park

Date: Fri Mar 20 2026 - 22:16:49 EST

On Thu, 19 Mar 2026 19:48:49 -0700 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Thu, 19 Mar 2026 07:52:17 -0700 SeongJae Park <sj@xxxxxxxxxx> wrote:
>
> > One major usage of damon_call() is online DAMON parameters update. It
> > is done by calling damon_commit_ctx() inside the damon_call() callback
> > function. damon_commit_ctx() can fail for two reasons: 1) invalid
> > parameters and 2) internal memory allocation failures. In case of
> > failures, the damon_ctx being updated (the commit destination) can be
> > left partially updated (or, from one perspective, corrupted), and
> > therefore shouldn't be used anymore. The function only ensures the
> > damon_ctx object can be safely deallocated using damon_destroy_ctx().
> >
> > The API callers are, however, calling damon_commit_ctx() only after
> > asserting the parameters are valid, to keep damon_commit_ctx() from
> > failing due to invalid input parameters. But it can still theoretically
> > fail if the internal memory allocation fails. In that case, DAMON may
> > run with the partially updated damon_ctx. This can result in unexpected
> > behaviors, including even a NULL pointer dereference in case of
> > damos_commit_dests() failure [1]. Such an allocation is arguably too
> > small to fail, so the real world impact would be rare. But, given the
> > bad consequence, this needs to be fixed.
> >
> > Avoid such partially-committed (maybe-corrupted) damon_ctx use by saving
> > the damon_commit_ctx() failure on the damon_ctx object. For this,
> > introduce the damon_ctx->maybe_corrupted field. damon_commit_ctx() sets
> > it when it fails. kdamond_call() checks if the field is set after each
> > damon_call_control->fn() is executed. If it is set, ignore remaining
> > callback requests and return. All kdamond_call() callers including
> > kdamond_fn() also check the maybe_corrupted field right after
> > kdamond_call() invocations. If the field is set, break the
> > kdamond_fn() main loop so that DAMON still doesn't use the context that
> > might be corrupted.
>
> I guess you saw the AI review?
> https://sashiko.dev/#/patchset/20260319145218.86197-1-sj%40kernel.org

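For readers skimming the thread, the mechanism described in the quoted commit
message can be sketched roughly as below. This is only a simplified userspace
model under stated assumptions, not the actual kernel code: struct ctx_model,
commit_model(), run_calls(), main_loop() and the callbacks are hypothetical
stand-ins for damon_ctx, damon_commit_ctx(), kdamond_call(), the kdamond_fn()
main loop and damon_call_control->fn(), respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct damon_ctx; only the new flag is modeled. */
struct ctx_model {
	int param;
	bool maybe_corrupted;
};

/* Hypothetical callback type, modeled after damon_call_control->fn(). */
typedef int (*call_fn)(struct ctx_model *ctx);

/*
 * Models damon_commit_ctx(): the destination may already be partially
 * updated when a (simulated) allocation failure happens, so the failure
 * is recorded on the context itself.
 */
static int commit_model(struct ctx_model *dst, int new_param, bool alloc_fails)
{
	assert(dst != NULL);
	dst->param = new_param;		/* partial update before the failure */
	if (alloc_fails) {
		dst->maybe_corrupted = true;
		return -1;
	}
	return 0;
}

/* Example callbacks: one commit that succeeds, one that fails. */
static int cb_ok(struct ctx_model *ctx)
{
	return commit_model(ctx, 1, false);
}

static int cb_fail(struct ctx_model *ctx)
{
	return commit_model(ctx, 2, true);
}

/*
 * Models kdamond_call(): check the flag after each callback and ignore
 * the remaining requests once it is set. Returns how many callbacks ran.
 */
static size_t run_calls(struct ctx_model *ctx, call_fn *fns, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		fns[i](ctx);
		if (ctx->maybe_corrupted)
			return i + 1;
	}
	return n;
}

/*
 * Models the kdamond_fn() main loop: the caller re-checks the flag right
 * after run_calls() and breaks out. Returns the number of iterations run.
 */
static int main_loop(struct ctx_model *ctx, call_fn *fns, size_t n,
		     int max_iters)
{
	int iters = 0;

	while (iters < max_iters) {
		iters++;
		run_calls(ctx, fns, n);
		if (ctx->maybe_corrupted)
			break;
	}
	return iters;
}
```

In this model, passing { cb_ok, cb_fail, cb_ok } to run_calls() runs only the
first two callbacks, and main_loop() stops after its first iteration.
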
By the way, I am also monitoring sashiko.dev for all DAMON patches. It will be
much easier once sashiko.dev's email feature is ready, since I have already
onboarded DAMON for that.

Meanwhile, monitoring with a web browser is somewhat tedious for me, so I
just implemented an hkml feature, namely
'hkml patch sashiko_dev --thread_status'. It takes the message ID of a mail
and prints the review status/result of all patches in the thread.

E.g.,

$ hkml patch sashiko_dev --thread_status 20260319-memory-failure-mf-delayed-fix-rfc-v2-v2-0-92c596402a7a@xxxxxxxxxx
- [PATCH RFC v2 1/7] mm: memory_failure: Clarify the MF_DELAYED definition
- Reviewed (Review completed successfully.)
- [PATCH RFC v2 2/7] mm: memory_failure: Allow truncate_error_folio to return MF_DELAYED
- Reviewed (Review completed successfully.)
- [PATCH RFC v2 3/7] mm: shmem: Update shmem handler to the MF_DELAYED definition
- Reviewed (Review completed successfully.)
- [PATCH RFC v2 4/7] mm: memory_failure: Generalize extra_pins handling to all MF_DELAYED cases
- Pending (None)
- [PATCH RFC v2 4/7] mm: memory_failure: Generalize extra_pins handling to all MF_DELAYED cases
- Reviewed (Review completed successfully.)
- [PATCH RFC v2 5/7] mm: selftests: Add shmem memory failure test
- Reviewed (Review completed successfully.)
- [PATCH RFC v2 6/7] KVM: selftests: Add memory failure tests in guest_memfd_test
- Reviewed (Review completed successfully.)
- [PATCH RFC v2 7/7] KVM: selftests: Test guest_memfd behavior with respect to stage 2 page tables
- Reviewed (Review completed successfully.)

I'm planning to implement another feature for formatting and sending the review
results and inline comments as emails, probably this weekend.

Thanks,
SJ