Re: [f2fs-dev] [PATCH] f2fs: fix to freeze GC and discard threads quickly

From: Daeho Jeong

Date: Thu Mar 12 2026 - 11:28:12 EST

On Wed, Mar 11, 2026 at 6:27 PM Chao Yu <chao@xxxxxxxxxx> wrote:
>
> On 2026/3/12 00:00, Daeho Jeong wrote:
> > On Wed, Mar 11, 2026 at 7:59 AM Chao Yu <chao@xxxxxxxxxx> wrote:
> >>
> >> On 2026/3/11 04:49, Daeho Jeong wrote:
> >>> From: Daeho Jeong <daehojeong@xxxxxxxxxx>
> >>>
> >>> Suspend can fail if kernel threads do not freeze for a while.
> >>> f2fs_gc and f2fs_discard threads can perform long-running operations
> >>> that prevent them from reaching a freeze point in a timely manner.
> >>>
> >>> This patch adds explicit freezing checks in the following locations:
> >>> 1. f2fs_gc: Added a check at the 'retry' label to exit the loop quickly
> >>> if freezing is requested, especially during heavy GC rounds.
> >>> 2. __issue_discard_cmd: Added a 'suspended' flag to break both inner and
> >>> outer loops during discard command issuance if freezing is detected
> >>> after at least one command has been issued.
> >>> 3. __issue_discard_cmd_orderly: Added a similar check for orderly discard
> >>> to ensure responsiveness.
> >>>
> >>> These checks ensure that the threads release locks safely and enter the
> >>> frozen state.
> >>>
> >>> Signed-off-by: Daeho Jeong <daehojeong@xxxxxxxxxx>
> >>> ---
> >>> fs/f2fs/gc.c | 4 ++++
> >>> fs/f2fs/segment.c | 14 ++++++++++++--
> >>> 2 files changed, 16 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>> index 981eac629fe9..fdc3366c4db3 100644
> >>> --- a/fs/f2fs/gc.c
> >>> +++ b/fs/f2fs/gc.c
> >>> @@ -1962,6 +1962,10 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
> >>> goto stop;
> >>> }
> >>> retry:
> >>> + if (unlikely(freezing(current))) {
> >>> + ret = 0;
> >>> + goto stop;
> >>> + }
> >>
> >> Do we need to check freezing() during multiple segments migration?
> >> especially in large section, e.g. zufs case.
> >
> > Otherwise, we can't meet the 1 second suspend requirement for Android.
> > This logic mainly targets zufs proactive GC cases.
> > Plus, aren't the remaining segments in the section the next victims of
> > GC for the next round?
>
> Sorry, I didn't get the point, could you please explain more about your concern?
>
> Actually, what I mean is if we missed freezeing() check condition in f2fs_gc(),
> in do_garbage_collection(), after we migrated one segment of section, and before
> migrate next segment in section, we can check freezing() condition at this time?
>
> I meant maybe we can add more check spots in do_garbage_collection().

Oh, I misunderstood your point earlier.

I agree that adding more check points inside do_garbage_collection()
would be beneficial, especially for cases with large sections like
zufs.

Thanks,

>
> Thanks,
>
> >
> >>
> >>> ret = __get_victim(sbi, &segno, gc_type, gc_control->one_time);
> >>> if (ret) {
> >>> /* allow to search victim from sections has pinned data */
> >>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >>> index e9b6d774b985..a6c82ab28288 100644
> >>> --- a/fs/f2fs/segment.c
> >>> +++ b/fs/f2fs/segment.c
> >>> @@ -1606,6 +1606,9 @@ static void __issue_discard_cmd_orderly(struct f2fs_sb_info *sbi,
> >>> if (dc->state != D_PREP)
> >>> goto next;
> >>>
> >>> + if (*issued > 0 && unlikely(freezing(current)))
> >>> + break;
> >>> +
> >>> if (dpolicy->io_aware && !is_idle(sbi, DISCARD_TIME)) {
> >>> io_interrupted = true;
> >>> break;
> >>> @@ -1645,6 +1648,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> >>> struct blk_plug plug;
> >>> int i, issued;
> >>> bool io_interrupted = false;
> >>> + bool suspended = false;
> >>>
> >>> if (dpolicy->timeout)
> >>> f2fs_update_time(sbi, UMOUNT_DISCARD_TIMEOUT);
> >>> @@ -1675,6 +1679,11 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> >>> list_for_each_entry_safe(dc, tmp, pend_list, list) {
> >>> f2fs_bug_on(sbi, dc->state != D_PREP);
> >>>
> >>> + if (issued > 0 && unlikely(freezing(current))) {
> >>> + suspended = true;
> >>> + break;
> >>> + }
> >>> +
> >>> if (dpolicy->timeout &&
> >>> f2fs_time_over(sbi, UMOUNT_DISCARD_TIMEOUT))
> >>> break;
> >>> @@ -1694,11 +1703,12 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> >>> next:
> >>> mutex_unlock(&dcc->cmd_lock);
> >>>
> >>> - if (issued >= dpolicy->max_requests || io_interrupted)
> >>> + if (issued >= dpolicy->max_requests || io_interrupted ||
> >>> + suspended)
> >>> break;
> >>> }
> >>>
> >>> - if (dpolicy->type == DPOLICY_UMOUNT && issued) {
> >>> + if (dpolicy->type == DPOLICY_UMOUNT && issued && !suspended) {
> >>
> >> If we're umounting data partition, it doesn't need to consider suspend?
> >
> > Makes sense.
> >
> >>
> >> Thanks,
> >>
> >>> __wait_all_discard_cmd(sbi, dpolicy);
> >>> goto retry;
> >>> }
> >>
>