On Mon, Jul 10, 2000 at 06:25:26PM -0700, H. Peter Anvin wrote:
> Andries Brouwer wrote:
> > On Mon, Jul 10, 2000 at 03:31:12PM -0700, H. Peter Anvin wrote:
> >
> > > if GRUB is writing to /dev/hda
> > > then I would agree 100% that GRUB is buggy.
> >
> > What a strange idea. The kernel is broken.
> >
> > > you're playing on the kernel's turf by messing with a mounted filesystem
> >
> > There are also alias problems without mounted filesystems.
> >
> > It is just that the kernel fails to translate blocks on hda1 into
> > blocks on hda or vice versa. At first sight this seems quite doable.
> >
> > (Given block N on hda, walk the partition list to find the partition
> > it is in. When not found the block is hda's responsibility, otherwise
> > it now becomes block N-N0 of hda3.)
> Doesn't that seem awfully expensive to you? The reverse translation
> would be cheaper (instead of thinking of a block as block N on hda3
> think of it as N+N0 on hda, at all times throughout the kernel), but it
> would still impose an overhead -- perhaps small enough not to matter.
Yes, the reverse translation is the obvious thing to think about,
but the way I describe has no performance problems.
A quite common setup is to have a partitioned disk, with all filesystem
accesses going to a partition, so that it doesn't matter if whole-disk
access is slightly more expensive: only lilo, fdisk and the like access
the whole disk. Perhaps dd, or some backup program.
So: no cost to the common case.
A much rarer setup, also used, is to have a big database on an
unpartitioned disk. Now all accesses are to the whole disk, but the list
of partitions is empty, so walking it does not take very long.
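
[A minimal sketch of the two translations discussed above, for illustration only. The names struct part, disk_to_part() and part_to_disk() are hypothetical and are not kernel structures or functions; the table is assumed to be a flat array with no nested (extended) partitions.]

```c
#include <stddef.h>

/* Hypothetical partition table entry (not a kernel structure). */
struct part {
    unsigned long start;      /* N0: first block of the partition on the whole disk */
    unsigned long nr_blocks;  /* length of the partition in blocks */
};

/*
 * Forward translation: given block n on the whole disk, walk the
 * partition list. If a partition contains n, return its index and
 * store n - N0 in *rel. Otherwise return -1 and store n unchanged:
 * the block is the whole disk's responsibility.
 */
static int disk_to_part(const struct part *tab, size_t nparts,
                        unsigned long n, unsigned long *rel)
{
    for (size_t i = 0; i < nparts; i++) {
        if (n >= tab[i].start && n - tab[i].start < tab[i].nr_blocks) {
            *rel = n - tab[i].start;
            return (int)i;
        }
    }
    *rel = n;
    return -1;
}

/* Reverse translation: block rel of a partition is block N0 + rel of the disk. */
static unsigned long part_to_disk(const struct part *p, unsigned long rel)
{
    return p->start + rel;
}
```

[With an empty table (nparts == 0) the loop body never runs, which is the unpartitioned-database case: the walk costs essentially nothing.]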
> I noticed that the BLKPG ioctl is documented as having a
> read_all_partitions() call, which isn't implemented;
You mean: which is not part of the current kernel.
No, I only submit code that is actually used somewhere.
I have more stuff that seems obviously necessary, but as
long as nobody uses it there is no need to put it into the kernel.
> nor does /proc/partitions contain the base offset of each partition
> -- which I must say makes me wonder what use /proc/partitions was
> intended to serve at all.
It serves to give a list of partition names.
If you mount something by label or uuid, then the mount program
has to access a number of devices, examine the label there, and
when the label is found mount the device.
So, /proc/partitions provides this list of devices.
[That was the original motivation. In the meantime, there are many
uses. E.g., I often see distribution installation programs use it.]
If a program needs the partition starting point, it can use the
HDIO_GETGEO ioctl which returns a
	struct hd_geometry {
		unsigned char heads;
		unsigned char sectors;
		unsigned short cylinders;
		unsigned long start;
	};
If one needs the partition starting point from the command line,
there are various utilities (e.g., hdparm -g) that will print it.
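
[A minimal sketch of reading the starting point from a program, assuming a Linux system with <linux/hdreg.h> available. The helper name partition_start() is made up for this example; the start field is taken to be the offset of the partition from the beginning of the disk, in sectors.]

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hdreg.h>   /* struct hd_geometry, HDIO_GETGEO */

/*
 * Hypothetical helper: open the block device at path, ask the driver
 * for its geometry, and store the starting sector in *start.
 * Returns 0 on success, -1 if the device cannot be opened or the
 * driver does not support HDIO_GETGEO.
 */
static int partition_start(const char *path, unsigned long *start)
{
    struct hd_geometry geo;
    int fd, rc;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    rc = ioctl(fd, HDIO_GETGEO, &geo);
    close(fd);
    if (rc < 0)
        return -1;

    *start = geo.start;
    return 0;
}
```

[Calling partition_start("/dev/hda3", &s) would be the typical use; opening a block device usually requires suitable permissions, and a whole-disk device is expected to report a start of 0.]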
> Nor does it seem that there exists an ioctl() to find the base
> device and offset for a partition subdevice, which I have to admit to
> being rather shocked to find out.
No. And that operation is needed very often, mostly inside
the kernel, and the kernel has no mechanism for it either.
One sees MKDEV(MAJOR(dev), MINOR(dev)&mask) or similar stuff
everywhere. In my big "large device number" rewrite, the minorstruct
of a subdevice has a pointer to the minorstruct of the whole device.
(Basically because MKDEV() becomes an expensive operation involving
a search and possibly a memory allocation.)
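
[A sketch of the masking idiom mentioned above, assuming the classic 16-bit device number layout (8 bits major, 8 bits minor) and, as an example, a driver that reserves 6 minor bits per disk as the IDE driver does. The macros are redefined locally for illustration; PARTN_BITS and whole_disk() are names chosen for this example, and other drivers use a different number of partition bits.]

```c
/* Local re-creation of the 8+8 bit device number macros (illustrative). */
#define MINORBITS     8
#define MINORMASK     ((1U << MINORBITS) - 1)
#define MAJOR(dev)    ((unsigned int)((dev) >> MINORBITS))
#define MINOR(dev)    ((unsigned int)((dev) & MINORMASK))
#define MKDEV(ma, mi) ((unsigned int)(((ma) << MINORBITS) | (mi)))

/* Assumption: 6 minor bits per disk, i.e. 64 minors per drive. */
#define PARTN_BITS    6

/*
 * Map a partition's device number to that of the whole disk by
 * clearing the partition bits of the minor number.
 */
static unsigned int whole_disk(unsigned int dev)
{
    unsigned int mask = ~((1U << PARTN_BITS) - 1);

    return MKDEV(MAJOR(dev), MINOR(dev) & mask);
}
```

[Under these assumptions, a partition with major 3, minor 66 maps to major 3, minor 64. The idiom only works while the whole device can be computed from the bits of the number, which is why a pointer from subdevice to whole device is needed once device numbers stop being plain bit fields.]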
Andries