Re: [PATCHv2 2/2] super1: check and output faulty dev role
From: Jinpu Wang
Date: Wed Mar 22 2017 - 06:24:21 EST
On Tue, Mar 21, 2017 at 8:55 PM, NeilBrown <neilb@xxxxxxxx> wrote:
> On Mon, Mar 20 2017, Gioh Kim wrote:
>
>> From: Jack Wang <jinpu.wang@xxxxxxxxxxxxxxxx>
>>
>> Output the real dev role in examine_super1, it will help to
>> find problem.
>>
>> Signed-off-by: Jack Wang <jinpu.wang@xxxxxxxxxxxxxxxx>
>> Reviewed-by: Gioh Kim <gi-oh.kim@xxxxxxxxxxxxxxxx>
>> ---
>> super1.c | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/super1.c b/super1.c
>> index f3520ac..c903371 100644
>> --- a/super1.c
>> +++ b/super1.c
>> @@ -501,8 +501,10 @@ static void examine_super1(struct supertype *st, char *homehost)
>> #endif
>> printf(" Device Role : ");
>> role = role_from_sb(sb);
>> - if (role >= MD_DISK_ROLE_FAULTY)
>> - printf("spare\n");
>> + if (role == MD_DISK_ROLE_SPARE)
>> + printf("Spare\n");
>> + else if (role == MD_DISK_ROLE_FAULTY)
>> + printf("Faulty\n");
>> else if (role == MD_DISK_ROLE_JOURNAL)
>> printf("Journal\n");
>> else if (sb->feature_map & __cpu_to_le32(MD_FEATURE_REPLACEMENT))
>> --
>> 2.5.0
>
> I don't think the distinction between "faulty" and "spare" is really
> useful here. I used to report the difference and it turned out to be
> confusing, so we stopped.
>
> This is information stored on some other disk, not the one that is
> spare-or-faulty. All it needs to know is what other devices are
> working. It doesn't need to know about which devices aren't working and
> why.
> The distinction between 'faulty' and 'spare' is only relevant to the
> device itself, and to the array as a whole.
>
> We should probably get rid of the distinction between
> MD_DISK_ROLE_FAULTY and MD_DISK_ROLE_SPARE.
> Most places that test for it just test >= MD_DISK_ROLE_FAULTY.
>
> NeilBrown
The reason I made this change is that while debugging the problem we
noticed the dev_role was wrong, but when I printed it with mdadm it
said 'spare'. That led me to check other kernel code paths, and I spent
more time before I found that examine_super1 treats
>= MD_DISK_ROLE_FAULTY as 'spare'.
I thought that if the output had been accurate, it could have saved me,
and maybe other developers, some time.
But if this caused confusion in the past, we can drop it; the first
patch is the real bugfix.
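
To make the difference concrete, here is a minimal standalone sketch of
the two checks. It is not the mdadm code itself: the enum and the
function names old_classify/new_classify are made up for illustration,
and the "active" branch is simplified (the real code also distinguishes
replacement devices and prints the device number). The role values are
the ones I believe md_p.h defines.

```c
#include <assert.h>
#include <stdint.h>

/* Role values as (to my knowledge) defined in the kernel's md_p.h. */
#define MD_DISK_ROLE_SPARE   0xffff
#define MD_DISK_ROLE_FAULTY  0xfffe
#define MD_DISK_ROLE_JOURNAL 0xfffd

/* FAULTY sorts just below SPARE, so ">= FAULTY" matches both values. */
static_assert(MD_DISK_ROLE_FAULTY < MD_DISK_ROLE_SPARE,
              "a >= FAULTY test is expected to cover SPARE as well");

/* Hypothetical labels, only for this sketch. */
enum role_kind { ROLE_ACTIVE, ROLE_JOURNAL, ROLE_FAULTY, ROLE_SPARE };

/* Check before the patch: faulty and spare are both reported as spare. */
static enum role_kind old_classify(uint16_t role)
{
	if (role >= MD_DISK_ROLE_FAULTY)
		return ROLE_SPARE;
	else if (role == MD_DISK_ROLE_JOURNAL)
		return ROLE_JOURNAL;
	return ROLE_ACTIVE;
}

/* Check after the patch: faulty and spare are reported separately. */
static enum role_kind new_classify(uint16_t role)
{
	if (role == MD_DISK_ROLE_SPARE)
		return ROLE_SPARE;
	else if (role == MD_DISK_ROLE_FAULTY)
		return ROLE_FAULTY;
	else if (role == MD_DISK_ROLE_JOURNAL)
		return ROLE_JOURNAL;
	return ROLE_ACTIVE;
}
```

The two functions only disagree for a role of 0xfffe, which is exactly
the case that sent me looking in the wrong place.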
Thanks!
--
Jack Wang
Linux Kernel Developer