Re: Exporting which partitions to md-configure

From: Neil Brown
Date: Sun Feb 05 2006 - 20:45:33 EST

Next message: Shantanu Goel: "Re: [VM PATCH] rotate_reclaimable_page fails frequently"
Previous message: Shantanu Goel: "Re: [VM PATCH] rotate_reclaimable_page fails frequently"
Next in thread: Kyle Moffett: "Re: Exporting which partitions to md-configure"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Monday January 30, hpa@xxxxxxxxx wrote:
> Neil Brown wrote:
> >
> > Well, grepping through fs/partitions/*.c, the 'flags' thing is set by
> > efi.c, msdos.c sgi.c sun.c
> >
> > Of these, efi compares something against PARTITION_LINUX_RAID_GUID,
> > and msdos.c, sgi.c and sun. compare something against
> > LINUX_RAID_PARTITION.
> >
> > The former would look like
> > e6d6d379-f507-44c2-a23c-238f2a3df928
> > in sysfs (I think);
> > The latter would look like
> > fd
> > (I suspect).
> >
> > These are both easily recognisable with no real room for confusion.
>
> Well, if we're going to have a generic facility it should make sense
> across the board. If all we're doing is supporting legacy usage we
> might as well export a flag.
>
> I guess we could have a single entry with a string of the form
> "efi:e6d6d379-f507-44c2-a23c-238f2a3df928" or "msdos:fd" etc -- it
> really doesn't make any difference to me, but it seems cleaner to have
> two pieces of data in two different sysfs entries.

What constitutes 'a piece of data'? A bit? a byte?

I would say that
msdos:fd
is one piece of data. The 'fd' is useless without the 'msdos'.
The 'msdos' is, I guess, not completely useless with the fd.

I would lean towards the composite, but I wouldn't fight a separation.

>
> >
> > And if other partition styles wanted to add support for raid auto
> > detect, tell them "no". It is perfectly possible and even preferable
> > to live without autodetect. We should support legacy usage (those
> > above) but should discourage any new usage.
> >
>
> Why is that, keeping in mind this will all be done in userspace?
>

partition-type based autodetect is easily fooled.
If you take a pair of drives from a failed computer, plug them into a
similar computer for data recovery, and boot: you might have two
different pairs of drives that both want to be 'md0'. Which wins?

I believe there needs to be a clear, non ambigous, causality path from
the kernel paramters, initramfs, or machine hardware that leads to the
arrays to be assembled and hence the filesystem to be mounted. These
should identify the array by some reasonably unique identifier. The
'minor number' stored in the raid superblock and used by
partition-based autodetect is not 'reasonably unique'.

There are many many possibilites. Some are:

kernel parameter md=0,/dev/hda,/dev/hdc

'hda' and 'hdc' on 'this' machine are (I think) still unique
identifiers of hardware, and so this can identify drives to assemble
into an array. Note that this would *not* be reliable with
md=0,/dev/sda,/dev/sdb

kernel parameter md_root_uuid=xxyy:zzyy:aabb:ccdd...
This could be interpreted by an initramfs script to run mdadm
to find and assemble the array with that uuid. The uuid of
each array is reasonably unique.

initramfs containing preconfigured /etc/mdadm.conf
This mdadm.conf would contain the uuid's of the arrays to
assemble.

kernel hardware MAC address
This could be mapped through DHCP to a host name. The host name
is then given to mdadm such that it finds and assemble the array
with 'name' (only supported in version-1 superblocks) of
$HOST-root
or whatever.

Just as there is a direct unambiguous causal path from something
present at early boot to the root filesystem that is mounted (and the
root filesystem specifies all other filesystems through fstab)
similarly there should be an unambiguous causal path from something
present at early boot to the array which holds the root filesystem -
and the root filesystem should describe all other arrays via
mdadm.conf

Does that make sense?

NeilBrown

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Shantanu Goel: "Re: [VM PATCH] rotate_reclaimable_page fails frequently"
Previous message: Shantanu Goel: "Re: [VM PATCH] rotate_reclaimable_page fails frequently"
Next in thread: Kyle Moffett: "Re: Exporting which partitions to md-configure"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]