Re: [PATCH] mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again

From: Jan Kara
Date: Thu Apr 18 2024 - 07:04:54 EST


On Wed 17-04-24 12:33:39, Zach O'Keefe wrote:
> On Wed, Apr 17, 2024 at 4:10 AM Jan Kara <jack@xxxxxxx> wrote:
> > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > > index cd4e4ae77c40a..02147b61712bc 100644
> > > --- a/mm/page-writeback.c
> > > +++ b/mm/page-writeback.c
> > > @@ -1638,7 +1638,7 @@ static inline void wb_dirty_limits(struct dirty_throttle_control *dtc)
> > > */
> > > dtc->wb_thresh = __wb_calc_thresh(dtc);
> > > dtc->wb_bg_thresh = dtc->thresh ?
> > > - div_u64((u64)dtc->wb_thresh * dtc->bg_thresh, dtc->thresh) : 0;
> > > + div64_u64(dtc->wb_thresh * dtc->bg_thresh, dtc->thresh) : 0;
..
> > Thirdly, if thresholds are larger than 1<<32 pages, then dirty balancing is
> > going to blow up in many other spectacular ways - consider only the
> > multiplication on this line - it will not necessarily fit into u64 anymore.
> > The whole dirty limiting code is interspersed with assumptions that limits
> > are actually within u32 and we do our calculations in unsigned longs to
> > avoid worrying about overflows (with occasional typing to u64 to make it
> > more interesting because people expected those entities to overflow 32 bits
> > even on 32-bit archs). Which is lame I agree but so far people don't seem
> > to be setting limits to 16TB or more. And I'm not really worried about
> > security here since this is global-root-only tunable and that has much
> > better ways to DoS the system.
> >
> > So overall I'm all for cleaning up this code but in a sensible way please.
> > E.g. for these overflow issues at least do it one function at a time so
> > that we can sensibly review it.
> >
> > Andrew, can you please revert this patch until we have a better fix? So far
> > it does more harm than good... Thanks!
>
> Shall we just roll-forward with a suitable fix? I think all the
> original code actually "needed" was to cast the ternary predicate,
> like:
>
> ---8<---
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index fba324e1a010..ca1bfc0c9bdd 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1637,8 +1637,8 @@ static inline void wb_dirty_limits(struct dirty_throttle_control *dtc)
> * at some rate <= (write_bw / 2) for bringing down wb_dirty.
> */
> dtc->wb_thresh = __wb_calc_thresh(dtc);
> - dtc->wb_bg_thresh = dtc->thresh ?
> - div64_u64(dtc->wb_thresh * dtc->bg_thresh, dtc->thresh) : 0;
> + dtc->wb_bg_thresh = (u32)dtc->thresh ?
> + div_u64((u64)dtc->wb_thresh * dtc->bg_thresh, dtc->thresh) : 0;

Well, this would fix the division by 0, but when you read the code you
really start wondering what's going on :) And as I wrote above, when
thresholds pass UINT_MAX, the dirty limiting code breaks down anyway, so I
don't think the machine will be more usable after your fix. Would you be up
for a challenge to modify mm/page-writeback.c so that such huge limits
cannot be set instead? That would actually be a useful fix...
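
To make the truncation concrete, here is a rough userspace sketch (not
kernel code). It assumes a 64-bit arch where dtc->thresh is 64 bits wide
and relies only on the fact that div_u64() takes a u32 divisor. The
helper names are made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original guard: "dtc->thresh ? ... : 0" tests the full-width value. */
static bool old_guard_allows_division(uint64_t thresh)
{
	return thresh != 0;
}

/* Proposed guard: "(u32)dtc->thresh ? ... : 0" tests only the low 32 bits. */
static bool new_guard_allows_division(uint64_t thresh)
{
	return (uint32_t)thresh != 0;
}

/* The divisor div_u64() really sees: thresh truncated to 32 bits. */
static bool div_u64_divisor_is_zero(uint64_t thresh)
{
	return (uint32_t)thresh == 0;
}

/*
 * Hypothetical model of the wb_bg_thresh computation with the proposed
 * guard. The multiplication is done in 64 bits, as with the (u64) cast.
 */
static uint64_t model_wb_bg_thresh(uint64_t wb_thresh, uint64_t bg_thresh,
				   uint64_t thresh)
{
	uint32_t divisor = (uint32_t)thresh;	/* implicit in div_u64() */

	if (divisor == 0)
		return 0;
	assert(divisor != 0);	/* the guard above guarantees this */
	return (wb_thresh * bg_thresh) / divisor;
}
```

With thresh = 1ULL << 32 the old guard lets the division through even
though the truncated divisor is 0, while the new guard returns 0. With
thresh = (1ULL << 32) + 200 nothing traps, but the model divides by 200
instead of by the real threshold, so the result is still wrong.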

Honza

>
> /*
> * In order to avoid the stacked BDI deadlock we need
> ---8<---
>
> Thanks, and apologize for the inconvenience
>
> Zach
>
> > Honza
> > --
> > Jan Kara <jack@xxxxxxxx>
> > SUSE Labs, CR
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR