Re: FatELF patches...

From: Valdis . Kletnieks
Date: Tue Nov 03 2009 - 09:54:44 EST


On Mon, 02 Nov 2009 10:14:15 EST, "Ryan C. Gordon" said:

> I probably wasn't clear when I said "distribution-wide policy" followed by
> a "then again." I meant there would be backlash if the distribution glued
> the whole system together, instead of just binaries that made sense to do
> it to.

OK, I'll bite - for which binaries does it make sense to do so? Remember in
your answer to address the very valid point that any binaries you *don't*
do this for will still need equivalent hand-holding by the package manager.
So if you're not doing all of them, you need to address the additional
maintenance overhead of "which way is this package supposed to be built?"
and all the derivative headaches.

It might be instructive to not do a merge of *everything* in Ubuntu as you
did, but only select a random 20% or so of the packages and convert them
to FatELF, and see what breaks. (If our experience with 'make randconfig'
in the kernel is any indication, you'll hit a *lot* of corner cases and
pre-reqs you didn't know about...)

> > Actually, they can't nuke the /lib{32,64} directories unless *all* binaries
> > are using FatELF - as long as there's any binaries doing things The Old Way,
> > you need to keep the supporting binaries around.
>
> Binaries don't refer directly to /libXX, they count on ld.so to tapdance
> on their behalf. My virtual machine example left the dirs there as
> symlinks to /lib, but they could probably just go away directly.

Only if all your shared libs (which are binaries too) have migrated to FatELF.

On my box, I have:

% ls -l /usr/lib{,64}/libX11.so.6.3.0
-rwxr-xr-x 1 root root 1274156 2009-10-06 13:49 /usr/lib/libX11.so.6.3.0
-rwxr-xr-x 1 root root 1308600 2009-10-06 13:49 /usr/lib64/libX11.so.6.3.0

You can't dump them both into /usr/lib without making it a FatELF or doing
some name mangling. You probably didn't notice because you merged *all* of
an Ubuntu distro into FatELF.
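
For anybody who wants to check their own box, here is a rough sketch (Python,
not part of the FatELF patches) of how you might spot the files that would
clash if lib and lib64 were merged. It relies only on the ELF ident bytes:
the magic "\x7fELF" and the class byte at offset 4 (1 = 32-bit, 2 = 64-bit).
The helper names elf_class, read_header and merge_collisions are made up for
this example.

```python
import os

ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4  # offset of the class byte inside e_ident
ELF_CLASSES = {1: "ELF32", 2: "ELF64"}


def elf_class(header: bytes) -> str:
    """Classify a file from its first bytes: 'ELF32', 'ELF64', 'unknown' or 'not-ELF'."""
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        return "not-ELF"
    return ELF_CLASSES.get(header[EI_CLASS], "unknown")


def read_header(path: str) -> bytes:
    """Read just enough of a file to classify it."""
    with open(path, "rb") as f:
        return f.read(EI_CLASS + 1)


def merge_collisions(files):
    """files: iterable of (path, header_bytes) pairs.

    Returns {basename: sorted list of classes} for every basename that shows
    up with more than one class, i.e. names that cannot share one directory
    without name mangling or a combined file format.
    """
    by_name = {}
    for path, header in files:
        by_name.setdefault(os.path.basename(path), set()).add(elf_class(header))
    return {name: sorted(classes)
            for name, classes in by_name.items() if len(classes) > 1}
```

On a real system you would feed it something like
[(p, read_header(p)) for p in paths] for every regular file under /usr/lib
and /usr/lib64. Any name it returns needs either FatELF or renaming before
the two directories can be collapsed.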

> > Don't forget you take that hit once for each shared library involved. Plus
>
> That happens in user space in ld.so, so it's not a kernel problem in any
> case, but still...we're talking about, what? Twenty more branch
> instructions per-process?

No, a lot more than that - you already identified an extra 128-byte read
as needing to happen. Plus syscall overhead.

> > Or will a FatELF glibc.so screw up somebody's refcounts if it's mapped
> > in both 32 and 64 bit modes?
>
> Whose refcounts would this screw up? If there's a possible bug, I'd like
> to make sure it gets resolved, of course.

That's the point - nobody's done an audit for such things. Does the kernel
DTRT when counting mapped pages (probably close-to-right, if you got it to boot)?
Where are the corresponding patches, if any, for tools like perf and oprofile?
Does lsof DTRT? /proc/<pid>/pagemap? Any other tools that may break because
they make an assumption that executable files are mapped as 32-bit or 64-bit,
but not both (most likely choking if they see a 64-bit address someplace
after they've decided the binary is 32-bit)?
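
To make the failure mode concrete, here is a small sketch (Python again, and
only an illustration, not how any of the tools named above actually work). It
parses text in the /proc/<pid>/maps format and reports mappings of a given
file that end above 4GB, which is exactly what a tool would trip over if it
had already decided that file was 32-bit. The function names and the sample
addresses in the usage note are invented for the example.

```python
LIMIT_32 = 1 << 32  # first address that no longer fits in 32 bits


def parse_maps_line(line: str):
    """Split one /proc/<pid>/maps line into (start, end, pathname).

    Line layout: "start-end perms offset dev inode pathname", with the
    pathname missing for anonymous mappings.
    """
    fields = line.split(None, 5)
    start_s, end_s = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return int(start_s, 16), int(end_s, 16), path


def mappings_beyond_32bit(maps_text: str, path: str):
    """Return (start, end) for each mapping of `path` that ends above 4GB."""
    hits = []
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        start, end, mapped = parse_maps_line(line)
        if mapped == path and end > LIMIT_32:
            hits.append((start, end))
    return hits
```

Usage would be along the lines of
mappings_beyond_32bit(open("/proc/self/maps").read(), "/usr/lib/libX11.so.6.3.0").
With a single-format library the answer for a 32-bit process is always empty;
with one file serving both kinds of process, a per-file assumption like that
stops being safe.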
