Re: [PATCH] ntp: Make sure RTC is synchronized when time goes backwards

From: Benjamin ROBIN
Date: Sun Sep 08 2024 - 07:45:49 EST


On Sat, Sep 07, 2024 at 11:42:55PM GMT, Thomas Gleixner wrote:
> s/The "sync_hw_clock"/sync_hw_clock()/
>
> See: https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#function-references-in-changelogs
>
> s/the next .../the timer expires late./
>
> And then please explain what the consequence is when it expires
> late. See the changelog section of the above link.

> s/This patch cancels/Cancel/
>
> For explanation:
> # git grep 'This patch' Documentation/process

Thank you for your remarks and for the time spent reviewing this commit!

> Did you test this with lockdep enabled?

I did not... Indeed, this is a huge mistake. Sorry!

> The caller holds timekeeping_lock and has the time keeper sequence write
> held. There is an existing dependency chain which is inverse. Assume the
> sync_hrtimer is queued on a different CPU, CPU1 in this example:
>
>       CPU 0                                   CPU1
>
>       lock(&timekeeper_lock);                 lock_hrtimer_base(CPU1);
>
>       write_seqcount_begin(&tk_core.seq); <- Makes tk_core.seq odd
>
>       __do_adjtimex()
>         process_adjtimex_modes()              base->get_time()
>           hrtimer_cancel()                      do {
>             hrtimer_try_to_cancel()                 seq = read_seqcount_begin(&tk_core.seq);
>               lock_hrtimer_base(CPU1);              ^^^
>               ^^^ deadlock                          Spin waits for tk_core.seq
>                                                     to become even
>
> You can do that cancel only outside of timekeeper_lock:

Again, thank you for the time spent explaining this in so much detail.
You did not have to. This is really appreciated.

> Now you can fix that up in ntp_notify_cmos_timer() which is outside of
> the timekeeper_lock held region for the very same reason and it's the
> proper place to do that.

I will cancel the timer even when the time jumps into the future; everything
will be explained in the commit message. We will see if you are OK with that.
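
For illustration only, here is a minimal userspace sketch of the ordering
discussed above: the decision to cancel is recorded while the timekeeper lock
is held, and the cancel itself runs only after that lock has been dropped, so
the timer base lock is never taken under it. The pthread mutexes and the names
tk_lock, timer_base_lock, timer_armed, adjust_time_locked(),
cancel_sync_timer() and do_adjtime() are hypothetical stand-ins. This is not
kernel code and not the actual patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
static pthread_mutex_t tk_lock = PTHREAD_MUTEX_INITIALIZER;         /* plays timekeeper_lock */
static pthread_mutex_t timer_base_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the hrtimer base lock */
static bool timer_armed = true;                                     /* plays the queued sync timer */

/*
 * Cancel the timer. This takes timer_base_lock, so it must never be
 * called with tk_lock held, or the two locks could be taken in
 * opposite orders on different threads.
 */
static void cancel_sync_timer(void)
{
	pthread_mutex_lock(&timer_base_lock);
	timer_armed = false;
	pthread_mutex_unlock(&timer_base_lock);
}

/* Runs with tk_lock held: only reports whether the clock was set. */
static bool adjust_time_locked(bool clock_set)
{
	return clock_set;
}

static void do_adjtime(bool clock_set)
{
	bool need_cancel;

	pthread_mutex_lock(&tk_lock);
	need_cancel = adjust_time_locked(clock_set);
	pthread_mutex_unlock(&tk_lock);

	/* tk_lock is released here, so taking timer_base_lock is safe. */
	if (need_cancel)
		cancel_sync_timer();
}
```

In this sketch, do_adjtime(true) leaves timer_armed false, while
do_adjtime(false) does not touch the timer.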

>
> Thanks,
>
> tglx

Thanks, Benjamin