Re: Lots of SCSI-disks, how?!

Richard Gooch (Richard.Gooch@atnf.CSIRO.AU)
Thu, 23 Apr 1998 19:56:08 +1000


Rogier Wolff writes:
> Richard Gooch wrote:
> >
> > Rogier Wolff writes:
> > > Richard Gooch wrote:
> > > > >
> > > > > Don't you? I didn't think that devfs removed the need for more space
> > > > > in device numbers internally - I simply thought it made the user
> > > > > interface far easier.
> > > >
> > > > When you call <devfs_register>, you pass an arbitrary pointer. When
> > > > your driver f_op->open() method is called, filp->private_data is
> > > > initialised with this arbitrary pointer. That effectively gives you a
> > > > 32 bit minor number on 32 bit systems. You don't need to worry about
> > > > the major number, because that is implicit in the f_op table you
> > > > passed to <devfs_register>.
> > >
> > > So, find is given the -xdev option. It stats all the files, and checks
> > > to see whether the files are still on the same mountpoint as the
> > > directories on the commandline.
> > >
> > > So, tar is packing an archive, and it wants to detect hardlinks to
> > > files that already are in the archive. To do this, it makes a list of
> > > <devno, inum, filename> triplets for the files that it's already put
> > > into the archive.
> > >
> > > Richard, please explain to me why this is going to work on a libc5
> > > based system?
> >
> > OK, please note that I'm not saying that we definitely have to stick
> > to 16 bit device numbers. My main point is that I think we can avoid
> > increasing them if we so choose.
> >
> > Now to answer your question. Given that device numbers have no
> > particular meaning for devfs, you could have devfs automatically
> > assign a unique device number for each entry (that's a full 16 bit
> > number). You could even restrict it so that only disc partitions get
> > these automagic device numbers, if you wanted. That gives 64k
> > partitions, which seems like a generous amount :-)
>
> Besides that you propose to turnover lots of device drivers (which now
> themselves do something to get at the major/minor number.), this
> might work.
>
> It now turns out that "the work is already done" is not true as you
> claimed before.

No, I *did not* claim that the work (for > 16 SCSI discs) was already
done. I'll append the entire message in which I used the words "So why
not use devfs? The work is done". This was a response to the previous
paragraph written by David Woodhouse, where he wanted to bring "in a
more sensible SCSI naming/numbering scheme". I did not mean to imply
in any way that support for > 16 SCSI discs is already done (actually
Jakub *has* done something, but I don't think it uses devfs: it
requires new syscalls and modified userspace tools, something you
could avoid with devfs).

Please be careful when implying someone has lied/stretched the truth,
especially in a public forum.

There is *no way* I could make the claim that devfs already supports
over 16 SCSI discs. It would be clearly false.

What I do claim is that with a minor tweak to devfs and some slightly
less minor tweaks to certain drivers, we can have support for gobs of
SCSI discs without breaking userspace.
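
For illustration, the automatic assignment of unique 16 bit device
numbers discussed above could look something like the following. This
is only a minimal userland sketch, not devfs code: the names
devnum_alloc() and devnum_free() and the bitmap layout are
hypothetical, and 0 is reserved here to mean "no number available",
which leaves 65535 usable numbers.

```c
#include <stdint.h>

/* Hypothetical allocator for unique 16 bit device numbers.
 * Not part of any real devfs API; 0 is reserved as "none available". */
#define DEVNUM_MAX 0xFFFFu

/* One bit per possible device number. */
static uint8_t devnum_used[(DEVNUM_MAX + 1u) / 8u];

static int devnum_test(uint16_t n)
{
    return (devnum_used[n >> 3] >> (n & 7)) & 1;
}

static void devnum_set(uint16_t n)
{
    devnum_used[n >> 3] |= (uint8_t)(1u << (n & 7));
}

static void devnum_clear(uint16_t n)
{
    devnum_used[n >> 3] &= (uint8_t)~(1u << (n & 7));
}

/* Return the lowest free device number, or 0 if all are taken. */
uint16_t devnum_alloc(void)
{
    for (uint32_t n = 1; n <= DEVNUM_MAX; n++) {
        if (!devnum_test((uint16_t)n)) {
            devnum_set((uint16_t)n);
            return (uint16_t)n;
        }
    }
    return 0;
}

/* Release a number so that a later entry can reuse it. */
void devnum_free(uint16_t n)
{
    if (n != 0)
        devnum_clear(n);
}
```

A linear scan is the simplest choice and is cheap at this size; whether
freed numbers should be reused immediately is a policy question the
sketch does not try to settle.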

> You'd have to add dynamic device number allocation (ask Peter for say
> 4 consecutive majors). Then you can allocate every partition a dynamic
> major/minor (1000 partitions should be enough for a while. Either you
> have lots of partitions or you have lots of disks, most likely not
> both).

I don't see why even this is necessary. Provided all disc drivers use
devfs only if CONFIG_DEVFS is enabled, then I don't think having
duplicate device numbers (i.e. one used in devfs may also be used by a
conventional device node) would be a problem anyway.

> So now we have
> mount /devfs/xxyyzz /usr
> which uses a dynamic (16 bit) major/minor number.

Actually, the mount doesn't really need it: it uses <devfs_fill_file>
to get the f_op methods. Unless you're referring to some other use of
the device number?

> Next we have a find /usr -xdev ... which gets that dynamic major/minor
> number in every stat that it does...
>
> But we still have the
>
> unsigned int minor = MINOR(rq->rq_dev), unit = minor >> PARTN_BITS;
>
> in do_request in ide.c .

Erm, I'm not entirely sure what you're driving at here. If you mean
that drivers will need to be modified, yes, that's true. If we, say,
keep 6 bits for the partition number, that leaves 10 bits of the 16 bit
device number for the disc identifier. A whopping 1024 discs.
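
The arithmetic of that split can be sketched as below. The macro and
function names are made up for illustration and do not correspond to
anything in ide.c or the SCSI drivers; only the 6/10 bit division comes
from the text above.

```c
#include <stdint.h>

/* Illustrative split of a 16 bit device number:
 * low 6 bits = partition, high 10 bits = disc identifier. */
#define PART_BITS 6u
#define DISC_BITS 10u
#define PART_MASK ((1u << PART_BITS) - 1u)   /* 0x3F  */
#define DISC_MASK ((1u << DISC_BITS) - 1u)   /* 0x3FF */

/* Combine a disc index and a partition index into one 16 bit number. */
static inline uint16_t make_devnum(unsigned disc, unsigned part)
{
    return (uint16_t)(((disc & DISC_MASK) << PART_BITS) | (part & PART_MASK));
}

/* Recover the disc identifier (0..1023). */
static inline unsigned devnum_disc(uint16_t dev)
{
    return dev >> PART_BITS;
}

/* Recover the partition number (0..63). */
static inline unsigned devnum_part(uint16_t dev)
{
    return dev & PART_MASK;
}
```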

If a disc driver wished to, it could drop the use of the minor number
entirely and use the arbitrary pointer mechanism that devfs
provides. That would give an effective 32 bit minor, which would make
bit manipulations for accessing SCSI parameters (h,c,t,u,p)
easier. This would of course be a more invasive change for a driver.
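
As a rough sketch of what that might look like, the (h,c,t,u,p) tuple
can be packed into a 32 bit value and carried in a pointer-sized
cookie. The field widths below (8+4+8+6+6 bits) and all of the names
are assumptions chosen for illustration; nothing here is taken from
the devfs patch or the SCSI layer.

```c
#include <stdint.h>

/* Hypothetical SCSI address: host, channel, target, unit, partition. */
struct scsi_addr {
    unsigned host, channel, target, unit, part;
};

/* Pack into 32 bits using assumed widths: 8 + 4 + 8 + 6 + 6. */
static inline uint32_t scsi_pack(struct scsi_addr a)
{
    return ((uint32_t)(a.host    & 0xFFu) << 24) |
           ((uint32_t)(a.channel & 0x0Fu) << 20) |
           ((uint32_t)(a.target  & 0xFFu) << 12) |
           ((uint32_t)(a.unit    & 0x3Fu) <<  6) |
            (uint32_t)(a.part    & 0x3Fu);
}

/* Reverse of scsi_pack(). */
static inline struct scsi_addr scsi_unpack(uint32_t v)
{
    struct scsi_addr a;
    a.host    = (v >> 24) & 0xFFu;
    a.channel = (v >> 20) & 0x0Fu;
    a.target  = (v >> 12) & 0xFFu;
    a.unit    = (v >>  6) & 0x3Fu;
    a.part    =  v        & 0x3Fu;
    return a;
}

/* Carry the packed value in a void * cookie and get it back again. */
static inline void *scsi_to_cookie(uint32_t v)
{
    return (void *)(uintptr_t)v;
}

static inline uint32_t scsi_from_cookie(void *p)
{
    return (uint32_t)(uintptr_t)p;
}
```

Each field is then a single shift and mask away, rather than being
squeezed into whatever room the minor number has left.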

Regards,

Richard....

>To: Richard Gooch <rgooch@vindaloo.atnf.csiro.au>
>Cc: David Woodhouse <Dave@imladris.demon.co.uk>,
>    drepper@cygnus.com (Ulrich Drepper),
>    "Fredrik Lindgren" <fredrik@tfi.net>,
>    linux-kernel@vger.rutgers.edu
>Subject: Re: Lots of SCSI-disks, how?!
>In-Reply-To: <199804220317.NAA07561@vindaloo.atnf.CSIRO.AU>
>References: <E0yRpwV-0001ad-00@imladris.demon.co.uk>
>    <199804220317.NAA07561@vindaloo.atnf.CSIRO.AU>

Richard Gooch writes:
> David Woodhouse writes:
> >
> > Richard.Gooch@atnf.CSIRO.AU said:
> > > I don't see how you could increase the size of i_rdev in the kernel's
> > > struct inode without breaking C library compatibility, since the
> > > library expects i_rdev to be 16 bits followed by i_size.
> >
> > Translate it on syscall entry/exit iff current->personality == PER_LIBC5
>
> OK, now I see.
>
> > Entry's simple, but translating back from 64-bit to 16-bit on exit may take
> > some thought. Which syscalls return an inode?
>
> Offhand, stat, fstat and lstat.
>
> > Given that most programs which require such low level access are
> > available in source form, it'll be rare that this overhead is
> > incurred - most processes which use the affected syscalls will be
> > recompiled with glibc.
>
> Once again, you don't have to do any of this at all with devfs.
>
> > > Finally, I don't see how your scheme can work to support more SCSI
> > > minors. Unless you have a whole new SCSI driver on one of the "high"
> > > majors. Yuk. That gets back to the problem of "blocking" libc 5 access
> > > to some of your discs.
> >
> > Add a second major # for the SCSI system, following one of the more sensible
> > schemes suggested. Keep the old major/minor numbers, but deprecate them in
> > much the same way as the cua devices.
>
> This does look a bit painful.
>
> > This way, legacy programs can still have access to the devices, and
> > the new system is available for anything which wants to use it. Your
> > devfs will also help legacy programs to use new devices, if
> > necessary.
> >
> > Bringing in a more sensible SCSI naming/numbering scheme has moved
> > up on my list of priorities - I trashed my MBR earlier this evening
> > because I tried to use my hard drive as a scanner :)
>
> So why not use devfs? The work is done.
>
> Regards,
>
> Richard....
