Re: linux-2.4.0 breaks grub install into partition

From: Andries Brouwer (aeb@veritas.com)
Date: Mon Jul 10 2000 - 21:03:10 EST


On Mon, Jul 10, 2000 at 06:25:26PM -0700, H. Peter Anvin wrote:
> Andries Brouwer wrote:
> > On Mon, Jul 10, 2000 at 03:31:12PM -0700, H. Peter Anvin wrote:
> >
> > > if GRUB is writing to /dev/hda
> > > then I would agree 100% that GRUB is buggy.
> >
> > What a strange idea. The kernel is broken.
> >
> > > you're playing on the kernel's turf by messing with a mounted filesystem
> >
> > There are also alias problems without mounted filesystems.
> >
> > It is just that the kernel fails to translate blocks on hda1 into
> > blocks on hda or vice versa. At first sight this seems quite doable.
> >
> > (Given block N on hda, walk the partition list to find the partition
> > it is in. When not found the block is hda's responsibility, otherwise
> > it now becomes block N-N0 of hda3.)

> Doesn't that seem awfully expensive to you? The reverse translation
> would be cheaper (instead of thinking of a block as block N on hda3
> think of it as N+N0 on hda, at all times throughout the kernel), but it
> would still impose an overhead -- perhaps small enough not to matter.

Yes, the reverse translation is the obvious thing to think about,
but the way I describe has no performance problems.

A quite common setup is to have a partitioned disk, with all filesystem
accesses going to a partition, so that it doesn't matter if whole-disk
access is slightly more expensive: only lilo, fdisk and the like access
the whole disk. Perhaps dd, or some backup program.
So: no cost to the common case.

A much rarer setup, also used, is to have a big database on an
unpartitioned disk. Now all accesses are to the whole disk, but the list
of partitions is empty, so walking it does not take very long.
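
A minimal user-space sketch of the forward translation described above:
given block N on the whole disk, walk the partition list and return
block N-N0 of the partition that contains it. The struct part layout and
the name disk_to_part are made up for illustration; this is not kernel
code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partition record, for illustration only. */
struct part {
    unsigned long start;   /* N0: first block of the partition on the disk */
    unsigned long nblocks; /* number of blocks in the partition */
};

/*
 * Translate block n of the whole disk. Returns the index of the partition
 * containing it and stores the partition-relative block (n - N0) in *rel.
 * Returns -1 and stores n unchanged when no partition covers the block.
 */
int disk_to_part(const struct part *tbl, size_t nparts,
                 unsigned long n, unsigned long *rel)
{
    assert(rel != NULL);
    for (size_t i = 0; i < nparts; i++) {
        if (n >= tbl[i].start && n - tbl[i].start < tbl[i].nblocks) {
            *rel = n - tbl[i].start;
            return (int)i;
        }
    }
    *rel = n;   /* the block is the whole disk's responsibility */
    return -1;
}
```

With an empty table (nparts == 0) the loop body never runs, which is the
unpartitioned-database case: the walk costs nothing.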

> I noticed that the BLKPG ioctl is documented as having a
> read_all_partitions() call, which isn't implemented;

You mean: which is not part of the current kernel.
No, I only submit code that is actually used somewhere.
I have more stuff that seems obviously necessary, but as
long as nobody uses it there is no need to put it into the kernel.

> nor does /proc/partitions contain the base offset of each partition
> -- which I must say makes me wonder what use /proc/partitions was
> intended to serve at all.

It serves to give a list of partition names.
If you mount something by label or uuid, then the mount program
has to access a number of devices, examine the label there, and
when the label is found mount the device.
So, /proc/partitions provides this list of devices.

[That was the original motivation. In the meantime, there are many
uses. E.g., I often see distribution installation programs use it.]
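
As a rough sketch of how a mount-by-label program might enumerate
candidate devices, the function below reads a /proc/partitions-style
stream and hands each device name to a callback. It assumes every data
line begins with "major minor #blocks name"; the function name
for_each_partition is invented here.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Call fn(name) for every device listed in a /proc/partitions-style stream.
 * The header line and blank lines do not match the expected pattern and
 * are skipped. Returns the number of devices seen.
 */
int for_each_partition(FILE *fp, void (*fn)(const char *name))
{
    char line[256], name[64];
    unsigned int major, minor;
    unsigned long long blocks;
    int count = 0;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp)) {
        if (sscanf(line, "%u %u %llu %63s",
                   &major, &minor, &blocks, name) == 4) {
            if (fn)
                fn(name);   /* caller would open /dev/<name> and check the label */
            count++;
        }
    }
    return count;
}
```

A caller would typically pass fopen("/proc/partitions", "r") and, in the
callback, open the corresponding node under /dev to examine its label.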

If a program needs the partition starting point, it can use the
HDIO_GETGEO ioctl which returns a

   struct hd_geometry {
      unsigned char heads;
      unsigned char sectors;
      unsigned short cylinders;
      unsigned long start;
   };

If one needs the partition starting point from the command line,
there are various utilities (e.g., hdparm -g) that will print it.
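
For illustration, a small sketch of calling HDIO_GETGEO from C and
returning the start field. The helper name partition_start is made up;
error handling is deliberately minimal.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/hdreg.h>   /* struct hd_geometry, HDIO_GETGEO */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Store the starting sector of the block device at path in *start.
 * Returns 0 on success, -1 if the device cannot be opened or the
 * ioctl is not supported on it.
 */
int partition_start(const char *path, unsigned long *start)
{
    struct hd_geometry geo;
    int fd;

    assert(path != NULL && start != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (ioctl(fd, HDIO_GETGEO, &geo) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    *start = geo.start;
    return 0;
}
```

Called on a partition node such as /dev/hda3 (with sufficient
permissions), this should yield the same starting point that hdparm -g
reports.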

> Nor does it seem that there exists an ioctl() to find the base
> device and offset for a partition subdevice, which I have to admit to
> being rather shocked to find out.

No. And that is an operation used very often, mostly inside
the kernel, and the kernel has no mechanism for it either.
One sees MKDEV(MAJOR(dev), MINOR(dev)&mask) or similar stuff
everywhere. In my big "large device number" rewrite, the minorstruct
of a subdevice has a pointer to the minorstruct of the whole device.
(Basically because MKDEV() becomes an expensive operation involving
a search and possibly a memory allocation.)
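
The masking idiom above can be sketched in user space with the glibc
makedev()/major()/minor() macros. The name whole_disk and the
minors_per_disk parameter are assumptions for this example; the caller
has to know how many minors the driver reserves per disk.

```c
#include <assert.h>
#include <sys/sysmacros.h>  /* major(), minor(), makedev() */
#include <sys/types.h>

/*
 * Map a partition's device number to that of the disk it lives on by
 * clearing the low bits of the minor number. minors_per_disk must be a
 * power of two (64 for the classic IDE driver, 16 for SCSI disks).
 */
dev_t whole_disk(dev_t dev, unsigned int minors_per_disk)
{
    assert(minors_per_disk != 0 &&
           (minors_per_disk & (minors_per_disk - 1)) == 0);
    return makedev(major(dev), minor(dev) & ~(minors_per_disk - 1));
}
```

For example, hdb3 is (3, 67); with 64 minors per disk the result is
(3, 64), which is hdb.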

Andries
