Re: [PATCH] lightnvm: pblk: extend line wp balance check

From: Javier González
Date: Tue Jan 29 2019 - 06:19:41 EST



> On 29 Jan 2019, at 09.47, hans@xxxxxxxxxxxxx wrote:
>
> From: Hans Holmberg <hans.holmberg@xxxxxxxxxxxx>
>
> pblk stripes writes of the minimal write size across all non-offline chunks
> in a line, which means that the maximum write pointer delta should not
> exceed the minimal write size. Extend the line write pointer balance check
> to cover this case.
>
> Signed-off-by: Hans Holmberg <hans.holmberg@xxxxxxxxxxxx>
> ---
>
> This patch applies on top of Zhoujie's V3 of
> "lightnvm: pblk: ignore bad block wp for pblk_line_wp_is_unbalanced"
>
> drivers/lightnvm/pblk-recovery.c | 60 ++++++++++++++++++++------------
> 1 file changed, 37 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> index 02d466e6925e..d86f580036d3 100644
> --- a/drivers/lightnvm/pblk-recovery.c
> +++ b/drivers/lightnvm/pblk-recovery.c
> @@ -302,41 +302,55 @@ static int pblk_pad_distance(struct pblk *pblk, struct pblk_line *line)
>          return (distance > line->left_msecs) ? line->left_msecs : distance;
>  }
>
> -static int pblk_line_wp_is_unbalanced(struct pblk *pblk,
> -                                      struct pblk_line *line)
> +/* Return a chunk belonging to a line by stripe(write order) index */
> +static struct nvm_chk_meta *pblk_get_stripe_chunk(struct pblk *pblk,
> +                                                  struct pblk_line *line,
> +                                                  int index)
>  {
>          struct nvm_tgt_dev *dev = pblk->dev;
>          struct nvm_geo *geo = &dev->geo;
> -        struct pblk_line_meta *lm = &pblk->lm;
>          struct pblk_lun *rlun;
> -        struct nvm_chk_meta *chunk;
>          struct ppa_addr ppa;
> -        u64 line_wp;
> -        int pos, i, bit;
> +        int pos;
>
> -        bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
> -        if (bit >= lm->blk_per_line)
> -                return 0;
> -        rlun = &pblk->luns[bit];
> +        rlun = &pblk->luns[index];
>          ppa = rlun->bppa;
>          pos = pblk_ppa_to_pos(geo, ppa);
> -        chunk = &line->chks[pos];
>
> -        line_wp = chunk->wp;
> +        return &line->chks[pos];
> +}
>
> -        for (i = bit + 1; i < lm->blk_per_line; i++) {
> -                rlun = &pblk->luns[i];
> -                ppa = rlun->bppa;
> -                pos = pblk_ppa_to_pos(geo, ppa);
> -                chunk = &line->chks[pos];
> +static int pblk_line_wps_are_unbalanced(struct pblk *pblk,
> +                                        struct pblk_line *line)
> +{
> +        struct pblk_line_meta *lm = &pblk->lm;
> +        int blk_in_line = lm->blk_per_line;
> +        struct nvm_chk_meta *chunk;
> +        u64 max_wp, min_wp;
> +        int i;
>
> -                if (chunk->state & NVM_CHK_ST_OFFLINE)
> -                        continue;
> +        i = find_first_zero_bit(line->blk_bitmap, blk_in_line);
> +
> +        /* If there is one or zero good chunks in the line,
> +         * the write pointers can't be unbalanced.
> +         */
> +        if (i >= (blk_in_line - 1))
> +                return 0;
>
> -                if (chunk->wp > line_wp)
> +        chunk = pblk_get_stripe_chunk(pblk, line, i);
> +        max_wp = chunk->wp;
> +        if (max_wp > pblk->max_write_pgs)
> +                min_wp = max_wp - pblk->max_write_pgs;
> +        else
> +                min_wp = 0;
> +
> +        i = find_next_zero_bit(line->blk_bitmap, blk_in_line, i + 1);
> +        while (i < blk_in_line) {
> +                chunk = pblk_get_stripe_chunk(pblk, line, i);
> +                if (chunk->wp > max_wp || chunk->wp < min_wp)
>                          return 1;
> -                else if (chunk->wp < line_wp)
> -                        line_wp = chunk->wp;
> +
> +                i = find_next_zero_bit(line->blk_bitmap, blk_in_line, i + 1);
>          }
>
>          return 0;
> @@ -362,7 +376,7 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
>          int ret;
>          u64 left_ppas = pblk_sec_in_open_line(pblk, line) - lm->smeta_sec;
>
> -        if (pblk_line_wp_is_unbalanced(pblk, line))
> +        if (pblk_line_wps_are_unbalanced(pblk, line))
>                  pblk_warn(pblk, "recovering unbalanced line (%d)\n", line->id);
>
>          ppa_list = p.ppa_list;
> --
> 2.17.1

If I am understanding correctly, you want to protect against the case
where a power failure (pfail) has broken the ws_min partition of a
chunk, right? I say this because the minimal write size and pblk's write
size are guaranteed to align with ws_min and ws_opt. Even if we have
grown bad blocks on pfail for the current line (which is a bigger
problem, because we have potentially lost data), this guarantee would
remain.
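
Just to spell out the invariant, the check boils down to something like
the sketch below (a standalone illustration, not the pblk code itself;
chunk_wps and nr_chunks are made-up names, and max_write_pgs mirrors
pblk's field of the same name): the spread between the highest and
lowest write pointer over the good chunks of a line must never exceed
the minimal write size.

#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch only: writes are striped in units of the minimal write size
 * across all good (non-offline) chunks of a line, so
 * max(wp) - min(wp) over those chunks must not exceed that size
 * (max_write_pgs in pblk terms).
 */
static bool line_wps_are_unbalanced(const uint64_t *chunk_wps,
                                    int nr_chunks, uint64_t max_write_pgs)
{
        uint64_t max_wp, min_wp;
        int i;

        /* One or zero good chunks cannot be unbalanced. */
        if (nr_chunks <= 1)
                return false;

        max_wp = min_wp = chunk_wps[0];
        for (i = 1; i < nr_chunks; i++) {
                if (chunk_wps[i] > max_wp)
                        max_wp = chunk_wps[i];
                if (chunk_wps[i] < min_wp)
                        min_wp = chunk_wps[i];
        }

        return (max_wp - min_wp) > max_write_pgs;
}

The patch anchors on the first good chunk's wp, which in stripe write
order should also be the highest, so if I read it right it enforces the
same bound without computing both extremes.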

If this is the case, my first reaction would be to say that the
controller is responsible for guaranteeing atomicity for both scalar and
vector I/Os. If that is not guaranteed, we have bigger problems than
this (e.g., in the write error recovery path).

Are you thinking of a different case?

Javier
