Re: [PATCH] lightnvm: pblk: extend line wp balance check

From: Hans Holmberg
Date: Tue Jan 29 2019 - 07:49:16 EST


On Tue, Jan 29, 2019 at 12:19 PM Javier GonzÃlez <javier@xxxxxxxxxxx> wrote:
>
>
> > On 29 Jan 2019, at 09.47, hans@xxxxxxxxxxxxx wrote:
> >
> > From: Hans Holmberg <hans.holmberg@xxxxxxxxxxxx>
> >
> > pblk stripes writes of minimal write size across all non-offline chunks
> > in a line, which means that the maximum write pointer delta should not
> > exceed the minimal write size. Extend the line write pointer balance check
> > to cover this case.
> >
> > Signed-off-by: Hans Holmberg <hans.holmberg@xxxxxxxxxxxx>
> > ---
> >
> > This patch applies on top of Zhoujie's V3 of
> > "lightnvm: pblk: ignore bad block wp for pblk_line_wp_is_unbalanced
> >
> > drivers/lightnvm/pblk-recovery.c | 60 ++++++++++++++++++++------------
> > 1 file changed, 37 insertions(+), 23 deletions(-)
> >
> > diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> > index 02d466e6925e..d86f580036d3 100644
> > --- a/drivers/lightnvm/pblk-recovery.c
> > +++ b/drivers/lightnvm/pblk-recovery.c
> > @@ -302,41 +302,55 @@ static int pblk_pad_distance(struct pblk *pblk, struct pblk_line *line)
> > return (distance > line->left_msecs) ? line->left_msecs : distance;
> > }
> >
> > -static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
> > - struct pblk_line *line)
> > +/* Return a chunk belonging to a line by stripe(write order) index */
> > +static struct nvm_chk_meta *pblk_get_stripe_chunk(struct pblk *pblk,
> > + struct pblk_line *line,
> > + int index)
> > {
> > struct nvm_tgt_dev *dev = pblk->dev;
> > struct nvm_geo *geo = &dev->geo;
> > - struct pblk_line_meta *lm = &pblk->lm;
> > struct pblk_lun *rlun;
> > - struct nvm_chk_meta *chunk;
> > struct ppa_addr ppa;
> > - u64 line_wp;
> > - int pos, i, bit;
> > + int pos;
> >
> > - bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
> > - if (bit >= lm->blk_per_line)
> > - return 0;
> > - rlun = &pblk->luns[bit];
> > + rlun = &pblk->luns[index];
> > ppa = rlun->bppa;
> > pos = pblk_ppa_to_pos(geo, ppa);
> > - chunk = &line->chks[pos];
> >
> > - line_wp = chunk->wp;
> > + return &line->chks[pos];
> > +}
> >
> > - for (i = bit + 1; i < lm->blk_per_line; i++) {
> > - rlun = &pblk->luns[i];
> > - ppa = rlun->bppa;
> > - pos = pblk_ppa_to_pos(geo, ppa);
> > - chunk = &line->chks[pos];
> > +static int pblk_line_wps_are_unbalanced(struct pblk *pblk,
> > + struct pblk_line *line)
> > +{
> > + struct pblk_line_meta *lm = &pblk->lm;
> > + int blk_in_line = lm->blk_per_line;
> > + struct nvm_chk_meta *chunk;
> > + u64 max_wp, min_wp;
> > + int i;
> >
> > - if (chunk->state & NVM_CHK_ST_OFFLINE)
> > - continue;
> > + i = find_first_zero_bit(line->blk_bitmap, blk_in_line);
> > +
> > + /* If there is one or zero good chunks in the line,
> > + * the write pointers can't be unbalanced.
> > + */
> > + if (i >= (blk_in_line - 1))
> > + return 0;
> >
> > - if (chunk->wp > line_wp)
> > + chunk = pblk_get_stripe_chunk(pblk, line, i);
> > + max_wp = chunk->wp;
> > + if (max_wp > pblk->max_write_pgs)
> > + min_wp = max_wp - pblk->max_write_pgs;
> > + else
> > + min_wp = 0;
> > +
> > + i = find_next_zero_bit(line->blk_bitmap, blk_in_line, i + 1);
> > + while (i < blk_in_line) {
> > + chunk = pblk_get_stripe_chunk(pblk, line, i);
> > + if (chunk->wp > max_wp || chunk->wp < min_wp)
> > return 1;
> > - else if (chunk->wp < line_wp)
> > - line_wp = chunk->wp;
> > +
> > + i = find_next_zero_bit(line->blk_bitmap, blk_in_line, i + 1);
> > }
> >
> > return 0;
> > @@ -362,7 +376,7 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
> > int ret;
> > u64 left_ppas = pblk_sec_in_open_line(pblk, line) - lm->smeta_sec;
> >
> > - if (pblk_line_wp_is_unbalanced(pblk, line))
> > + if (pblk_line_wps_are_unbalanced(pblk, line))
> > pblk_warn(pblk, "recovering unbalanced line (%d)\n", line->id);
> >
> > ppa_list = p.ppa_list;
> > --
> > 2.17.1
>
> If I am understanding correctly, you want to protect against the case
> where a pfail has broken the ws_min partition of a chunk, right? I say
> this because there is a guarantee that the minimal write size and pblk's
> write size align with ws_min and ws_opt. Even if we have grown bad
> blocks on pfail for the current line (which is a bigger problem because
> we have potentially lost data), this guarantee would remain.
>
> If this is the case, my first reaction would be to say that the
> controller is responsible for guaranteeing atomicity for both scalar and
> vector I/Os. If this is not guaranteed, we have bigger problems than
> this (e.g., for the write error recovery path).
>
> Are you thinking of a different case?

The write pointer check triggers a warning if something unexpected has
happened to the chunks.
i.e. if something else than pblk messed with the disk, or if the user
tries to recover a pblk instance with an invalid lun configuration.

This patch adds a warning if a chunk wp is too small(i.e. if a chunk
was unexpectedly reset)

>
> Javier
>