Re: [PATCH 0/3] build linux-next without perl
From: Rob Landley
Date: Wed Feb 27 2013 - 23:01:30 EST
On 02/27/2013 03:51:55 PM, Andrew Morton wrote:
> On Tue, 26 Feb 2013 21:57:52 -0800 (PST)
> Rob Landley <rob@xxxxxxxxxxx> wrote:
>
> > Before 2.6.25 building Linux never required perl. This patch series
> > removes the requirement from basic kernel builds (tested on i686,
> > x86_64, arm, mips, powerpc, sparc, sh4, and m68k). Now updated to
> > 3.8-rc1.
> >
> > Note, this removes perl from the _build_ environment, not from the
> > _development_ environment. This is approximately the same logic
> > behind "make menuconfig" requiring curses but "make oldconfig" not
> > requiring curses. Including zconf.lex.c_shipped in kconfig and then
> > requiring perl makes no sense.
> >
> > ...
> >
> > Mostly people just copy the patches into their local projects (ala
> > https://github.com/rofl0r/sabotage/tree/master/KEEP ) but I'm
> > reposting them to linux-kernel after Gentoo considered using these
> > patches, but didn't because they weren't upstream:
> > https://bugs.gentoo.org/show_bug.cgi?id=421483
>
> Sitting here scratching head wondering why you-need-perl is a problem
> for anyone.
I'm scratching my head that people basically keep doing the "you go
girl!" thing at me about this patch series _off_ the list (even people
I'd expect to see here, like
https://twitter.com/jonmasters/status/301166688852901888 ) but this is
something like the dozenth time I've posted it and nobody seems to
notice. Oh well.
Can we start with the fact it's a completely gratuitous build
environment dependency, and the kernel has a history of removing those?
(I mentioned two in the message you're replying to, ncurses and lex in
oldconfig.)
This isn't even a "workaround", this is an alternate implementation
that is as simple or simpler. (Sam Ravnborg acked one of the scripts
not because he cared about perl, but because it simplified the kernel
build.)
> That gentoo bug report provides some explanation: "perl was removed
> from @system". But I expect other people have different reasons.
Actually, removing perl from the build environment is common in cross
compiling situations. Removing everything you _can_ from the build
environment is normal when cross compiling. This is because cross
compiling sucks:
http://landley.net/writing/docs/cross-compiling.html
Cross compiling has inherent combinatorial complexity. Native compiling
build complexity is "number of packages times number of package
versions", with an addendum that things like the compiler and libc
count as packages with different versions.
When cross compiling, you basically multiply the number of targets
you're supporting times the number of package versions you're building
times the number of hosts you're building from. I've installed distros
I'd never even _heard_ of under kvm because some bug only happened
there, but of course all the big ones break too:
http://landley.net/hg/aboriginal/rev/1532
http://landley.net/hg/aboriginal/rev/1518
http://landley.net/hg/aboriginal/rev/1318
http://landley.net/hg/aboriginal/rev/1160
It's not just the combinatorial complexity, it's also less testing in
general (most people natively compile), plus the entire configure step
is wrong at the design level for cross compiling: it asks questions
about the machine you're building on and applies those answers to the
program you're building. When host and target aren't the same, this is
at _best_ useless.
So if you're cross compiling in any remotely portable way, you need an
"airlock step", as described on pages 98-100 of the slides for the old
talk I gave at Ohio LinuxFest, Flourish, and Celf, which is apparently
making the rounds again:
https://twitter.com/solardiz/status/306575964064866305
It's the same general idea as Linux From Scratch chapter 5 (you
populate a directory with just the binaries you need, and restrict the
$PATH to that) but with a minimalist twist: everything you add is a
sharp edge some package can catch on. If not now, then after the next
version upgrade.
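For illustration only, the airlock idea above can be sketched in a few lines of Python. This is not what Aboriginal Linux or Linux From Scratch actually run; the function names (`populate_airlock`, `make_airlock`, `airlock_env`) and the tool list are made up for the example, and a real build would need a much longer list.

```python
import os
import shutil
import tempfile


def populate_airlock(tools, airlock_dir):
    """Symlink each named tool found on the current $PATH into airlock_dir.

    Returns the names that could not be found on the host at all.
    """
    os.makedirs(airlock_dir, exist_ok=True)
    missing = []
    for name in tools:
        found = shutil.which(name)
        if found is None:
            missing.append(name)
            continue
        link = os.path.join(airlock_dir, name)
        if not os.path.lexists(link):
            os.symlink(os.path.abspath(found), link)
    return missing


def make_airlock(tools):
    """Create a fresh temporary airlock directory holding only `tools`."""
    airlock_dir = tempfile.mkdtemp(prefix="airlock-")
    missing = populate_airlock(tools, airlock_dir)
    return airlock_dir, missing


def airlock_env(airlock_dir):
    """Environment whose $PATH contains the airlock directory and nothing else."""
    return {"PATH": airlock_dir}
```

A build launched with `subprocess.run(cmd, env=airlock_env(d))` can only find what was explicitly listed, so anything that is not in the list (perl, say) shows up as a missing command instead of silently becoming a dependency.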
And perl is a GIANT HAIRBALL of sharp edges in this regard. There is no
perl standard, just a single perl implementation that may or may not
have the whole of CPAN installed. (In fact the "canned values" logic in
kernel/timeconst.pl uses a giant array of precomputed values because
the installed perl may or may not have Math::BigInt available. Way back
in 2008 I thought this meant we had
to be able to run without that and the Math::BigInt stuff was just for
regenerating the table, but Peter said
https://lkml.org/lkml/2008/2/15/548 and didn't mind letting the user
figure out what the dependencies were when the build broke. Now apply
that to lots of other packages and guess why letting ./configure not
find perl is appealing to cross compile environments.)
I'm surprised perl doesn't get dinged more for the single
implementation. All the shell scripts in the kernel are supposed to
work with #!/bin/sh pointing to dash instead of bash, and people freak
when the Microsoft Word or Excel format is whatever that one program
parses rather than an actual file format. But perl? Everybody remember
when perl was
going to be reimplemented on top of the "parrot" engine?
(http://www.perlmonks.org/?node_id=272641) You know why that didn't
happen? Because after several years of effort they couldn't quite make
it work reliably. Getting a fresh from-scratch engine implementation to
run the existing corpus of perl code turned out to be _really_hard_.
Python's got http://wiki.python.org/moin/PythonImplementations and
there's even an embedded implementation of php (http://ph7.symisc.net/)
but perl is this one _specific_ giant hairball. If you have trouble
getting that hairball to work on a new target? Tough. (When I did
bootstrap work on Hexagon back in 2010, and built linux from scratch
natively on the result, "will perl work" was one of the big unknowns.
Luckily it only took about a week of poking and prodding to get it to
build. Didn't particularly stress it to see how _well_ it worked,
mostly because I'd carefully arranged the build to need it as little as
possible. One of x11's dependencies needed it though, off in Beyond
Linux From Scratch, and wouldn't ./configure it out the way libiconv
and such did. Don't remember which one.)
By the way, I'm not saying restricting the $PATH by itself is a
_sufficient_ airlock step. When Wolfgang Denk (the u-boot maintainer)
tried my cross compiling build environment it immediately broke in 3
different and strangely fascinating ways for him, one of which
(http://landley.net/hg/aboriginal/rev/997) evolved into an entire
environment variable whitelisting step
(http://landley.net/hg/aboriginal/rev/1175) because once it works well
enough more people try it and break it in new ways...
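A minimal sketch of the environment variable whitelisting idea, again in Python and again only an illustration: the names `ALLOWED_VARS`, `whitelist_env`, and `run_clean` are hypothetical, and the whitelist is a guess at a plausible minimum, not the list from the changeset linked above.

```python
import os
import subprocess

# Hypothetical whitelist; the variables a real build needs will differ.
ALLOWED_VARS = ("PATH", "HOME", "TERM", "LANG")


def whitelist_env(environ, allowed=ALLOWED_VARS):
    """Return a copy of environ containing only the allowed variable names."""
    return {key: value for key, value in environ.items() if key in allowed}


def run_clean(cmd):
    """Run cmd with everything outside the whitelist stripped from the environment."""
    return subprocess.run(cmd, env=whitelist_env(os.environ), check=False)
```

The point of whitelisting rather than blacklisting is that a stray CFLAGS, CROSS_COMPILE, or whatever else the user happened to export never reaches the build unless somebody decided it should.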
And no, my build system isn't special, I just use it as an example
because it's what I'm most familiar with. What an awful lot of distros
do is set up a chroot and then "env -i chroot" into it (I.E. the
Linux From Scratch approach). My build system goes to extra effort so
that no part of it requires root access on the host (which is why I
don't chroot, I run the target system under qemu instead, hence the
title of the above giant slide deck from 2008).
> IOW, please better describe the motivation for this patchset.
You want more? Ok. (You asked.)
In addition to all the above, last week I gave a talk at the Linux
Foundation Embedded Linux Conference (used to be called CELF before
they ate it) on turning Android into a real build environment. I didn't
do slides this time but the outline's here:
http://landley.net/talks/celf-2013.txt
It would be really convenient if the video of that talk was up so I
could just point you at it, but you'll have to ask the Linux Foundation
when that'll be. The outline was just "notes to self" for me, lemme see
if I can summarize an hour talk in a couple paragraphs.
What I'm doing there is trying to expand Android into a full
self-hosting development environment so it can get on with being a
disruptive technology and kicking the PC up into the server space like
the minicomputer and mainframe before it. I would _very_ much like
Android to do this before iPhone does because when the S-curve of
adoption flattens out (somewhere between 1 and 3.5 billion unit
installed base I'd guess) and the positive feedback loop of network
effects kicks in, being locked out of _another_ generation of technology
by an actually COMPETENT monopolist would really suck.
The hardware to use a smartphone as a workstation is just a USB hub
with keyboard, mouse, and video adapter plugged into it; that's here
today (although USB3 makes it easier). The rest is software. But
there's a LOT of software.
This software has 4 basic parts: kernel (which works now because they
just added stuff to linux without removing anything), a command line
(I'm writing a new BSD-licensed posix command line; same general reason
I did years of work in busybox only this time it's old hat), a C
library (musl-libc.org is the leading contender), and a toolchain
(looks like llvm at the moment, I'd like to do http://landley.net/qcc
but my plate's full and there's no time. Why no time? Who is sponsoring
llvm? Who did "airplay" to put a phone display on an HDTV? You think
Steve Jobs didn't _notice_ that 8->16->32->64 bits is sustaining
technologies but mainframe->minicomputer->microcomputer->smartphone is
disruptive yet _inevitable_?)
This 4-package thing is even more simplified than my Aboriginal Linux
build (which got it down to 7 packages, but the licensing of those is
wrong -- no GPL in userspace -- so preinstalling any of it is a
violation of the Android licensing guidelines and the trademark grows
teeth so your ads have to be really horribly phrased.)
The reason you _want_ to simplify it is that Google is shipping a
billion unadministered unix systems with broadband access, which is
TERRIFYING from a security standpoint. The reason bionic and toolbox
are stubs even though uClibc and busybox both predate android isn't
_just_ licensing issues, it's that Google intentionally shipped the
minimum environment necessary to boot dalvik and get into the java
sandbox, and is minimizing the attack surface if you can manage to
escape that.
But Dalvik is this generation's version of ROM Basic: it's something
the platform has to outgrow in order to wean itself off of the previous
generation it's cross-compiled from. Once the PC became a self-hosting
development environment there was an explosion of software for it,
because you no longer needed a PDP-10 to develop for the PC, having a
PC was enough to be a developer. This is not currently the case for
phones, but it should _become_ the case.
If you then say "and to be a self-hosting system, you must add perl",
then to preinstall perl, you must audit perl for security concerns...
Can we please, please, please just remove the need for perl as part of
a self-hosting development environment instead?
P.S. These are just _my_ reasons. Dave Anders is the one who first
complained to _me_ that perl had been added as a build requirement back
in 2.6.25. I finally met him in person at the BeagleBone tutorial at
CELF last week after knowing him for years on freenode (prplague), but
like all the embedded guys he doesn't hang out here. (I just cc'd him.)
Nor do 3 of the 4 other people who congratulated me on getting this
posted on freenode today (I convinced _one_ of them to come here and
ack the darn patch).
> It'll need to be reasonably good motivation, too. Because not only do
> we need to patch the kernel, we also need to *maintain* its
> perl-freeness and fix up perlisms as they later get added by others.
I've already been maintaining it (and submitting it here) for 5 years
now:
https://lkml.org/lkml/2008/2/15/541
Sam Ravnborg acked the headers_install change not because of perl but
because the replacement was simpler than what it replaced. It would be
difficult to make the timeconst thing with the giant blob of
pregenerated values _worse_.
As for fixing up existing perlisms, people who use my scripts already
send you patches to remove perl dependencies in things that I don't
enable in my builds:
http://lkml.indiana.edu/hypermail/linux/kernel/0910.0/01896.html
Honestly, I'm not the only one who does this. Sing along:
http://barb.velvet.com/humor/lurkers.html
(Which is really, really annoying. But that's embedded linux for you.
They've all written upstream off as ignoring them. You think I'm typing
an epic tl;dr at _you_ guys, it's _harder_ to get through the other
way...)
> (Perhaps one way of doing this would be to disable perl in regular
> builds, so even if a developer has perl installed on his machine, his
> build will still fail when he invokes it. Add "PERL=/dev/null" to some
> build targets in some manner.)
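As a rough illustration of why pointing PERL at /dev/null works as a tripwire: /dev/null exists but is not executable, so any build rule that tries to run it as an interpreter fails immediately. The sketch below assumes the rules invoke perl through a PERL variable; the helper name `perl_step_fails` is invented for the example.

```python
import subprocess


def perl_step_fails(perl="/dev/null"):
    """Return True if trying to run `perl` as an interpreter fails."""
    try:
        result = subprocess.run([perl, "-e", "1"], capture_output=True)
    except OSError:
        # /dev/null is not executable, so the exec itself is refused.
        return True
    return result.returncode != 0
```

So a developer with perl installed still gets a broken build the moment a new perl dependency sneaks into a target that was supposed to be perl-free.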
http://landley.net/aboriginal/about.html
(And yes, I need to get a release out that uses 3.8 but they screwed up
interrupt routing on QEMU's arm versatile board emulation again and I
haven't had time to track it down because I've been trying to get
_this_ pushed upstream this merge window. Again. Plus I need to figure
out what I broke in powerpc userspace, and run the automated Linux From
Scratch build under qemu on all targets to make sure I haven't missed
anything else...)
Rob