Re: [RFC][PATCH 6/9] gen_initramfs_list.sh: include xattrs

From: Mimi Zohar
Date: Tue Jan 13 2015 - 15:20:37 EST


On Tue, 2015-01-13 at 12:48 -0600, Rob Landley wrote:
> On 01/08/2015 04:08 PM, Mimi Zohar wrote:
> > On Thu, 2015-01-08 at 12:19 -0600, Rob Landley wrote:
> >>
> >> But I am curious about how you propose to encode xattrs into the cpio
> >> format. (Which Al Viro chose because it's _simple_. There isn't really
> >> a
> >> controlling spec since Posix decided to deprecated it in 2001 and
> >> yank
> >> it from SUSv3 onwards. LSB extended several header fields to 8 hex
> >> digits instead of 6, but they still have 32 bit timestamps which seems
> >> a
> >> bit short-sighted. If you're going to define a new rev with a new
> >> magic
> >> number, there are a couple other things you might wanna fix...)
> >
> > Sounds like a good opportunity to make the other changes as well. We
> > can include the other changes in this patch set. Is this (initramfs)
> > the right mailing list for this discussion?
>
> I'd forgotten there was such a list until the email came in. :)
>
> > Do other people need to be included?
>
> In theory including the Austin Group (the posix committee mailing list)
> might be useful, but in practice they hear about stuff well after the
> fact, and they washed their hands of cpio over a decade ago (shortly
> after Linux started heavily using it in rpm, and a bit before initramfs).
>
> I note that there are two data formats of interest here:
>
> 1) the cpio file layout.
>
> 2) the list of files generated by gen_initramfs_list.sh and consumed by
> gen_init_cpio.
>
> The fact you're modifying gen_initramfs_list.sh seems to imply that
> you're changing the _second_ format as well as the first...? The second
> was never actually specified, but it does get used a lot by various
> build systems and breaking it would inconvenience people. (Plus I'd need
> to update the documentation, but I need to do that anyway.)

Patch "[PATCH 6/9] gen_initramfs_list.sh: include xattrs" is a one line
change that adds the "-x" option to include xattrs in the initramfs, if
they exist. This patch makes the new method (070703) the default format.

> Ss long as you're extending the format could you add a second 32 bits of
> time data that gets glued to the top half of a uint64_t, and then store
> the actual time value in microseconds (so time*1000000)? (I'd say
> "nanoseconds" but 63 bits of nanoseconds is 292 years, which is just
> short enough I'm uncomfortable with it. I'm just optimistic enough to
> think that might inconvenience somebody.)
>
> The other "huh" is the filesize, but 4 gigs per file seems unlikely to
> be more than initramfs needs any time soon? (It's possible that RPM
> might care in 15 years or so...)

4 bytes enough?

> >> I ask because I maintain a new from-scratch cpio implementation
> >> (http://landley.net/hg/toybox/file/1571/toys/posix/cpio.c), so I'd
> >> presumably have to add your format extensions to this. Is there any
> >> sort
> >> of documentation on them?
> >>
> >> The toybox config Android is using has this cpio implementation
> >> enabled
> >> (see
> >> https://android.googlesource.com/platform/external/toybox/+/9250c95a8c47/Android.mk)
> >> so I'd rather like to get this sort of detail right...
> >
> > The xattr section, which follows the file name, is of the format:
> > <number of xattrs> { <xattr name> <xattr data size> <xattr data> } for
> > each xattr, terminated with a NULL byte and padded to a 4 byte boundary.
>
> where <value> is... 8 bytes of ascii hex digits like the header values?
> (Every cpio string is padded to a boundary. Sigh, lemme go read your
> patch...)
>
> Ok, 2/9, actual file format parsing. New magic string at the start
> "070703" for the new version. (Good, that was my first question: easy
> way to distinguish this from the previous format).
>
> - for (i = 0, s += 6; i < 12; i++, s += 8) {
> + for (i = 0; i < (!newcx ? 12 : 13); i++, s += 8) {
>
> You've tested this and the missing s+= 6 didn't cause problems? (Or did
> it move somewhere else...? Is that what the 1/9 did grabbing just the
> magic instead of the rest of the header data...?)

Right, the first patch separates reading the magic string from reading
the rest of the header.

> > The header contains an additional field, before the checksum, containing
> > the xattr section length, including the NULL byte, but without the
> > padding.
>
> Ah, the old "4 bytes of padding to align to 4 bytes" silliness. (Even
> though you can't trust the null to _be_ there and have to set it
> yourself after the read.) I'm starting to remember this...
>
> Ok, different header magic, one new 8 hex digit field at the end of the
> header (before crc) containing "xattr_bufflen". The start of this buffer
> is an 8 digit hex "num_xattrs", which you iterate through and call
> strlen() on despite never having assured that the data you read in
> actually _does_ contain a null at the end (of the entire buffer). Then
> past that supposed null is another 8 digit hex "xattr_value_size", and
> that many bytes following you then send to sys_setxattr().
>
> Except for the part about you trusting your input data waaaaay too much,
> seems reasonable?

Ok

> I have no idea what sys_setxattr() accepts, but
> presumably there's a man page for the system call...
>
> http://man7.org/linux/man-pages/man2/setxattr.2.html
>
> Ok, that's probably enough data to implement it. (Not sure why that man
> page isn't in my ubuntu 14.04 install which has manpages-dev installed?
>
> $ man setxattr
> No manual entry for setxattr
>
> > Note that gen_init_cpio does not include "security.evm" as it is file
> > system dependent.
>
> I have no idea what that is. Should I not include it in the command line
> cpio? (Or have a flag?)

Right, don't include "security.evm". Both the HMAC and signature format
include the inode number, which is filesystem specific.

> The last time I used extended attributes was on OS/2; my only
> non-academic interaction with selinux has been looking up how to switch
> it off.
>
> I still boggle that Fortune 500 bureaucracies include "must have a
> security system designed by the NSA based on the idea of complicating
> the system until there are no obvious holes, because after the Snowden
> leaks that's clearly a good idea" as part of their certification
> processes for reducing the risk of being unable to delegate blame.

I'm the linux-integrity subsystem maintainer, not an LSM maintainer (not
that there is anything wrong in being an LSM maintainer). So, I'm not
quite sure why you keep saying things like this to me. BTW, there are a
number of LSMs these days, not only SELinux.

> I'm also kind of impressed by the longevity of a hack the original Apple
> Lisa developers invented in 1981 because their OS didn't have an
> equivalent of the unix "file" command, and undoes the central unix
> "everything's a flat file" idea to bring us back to the structure
> records with magic meanings of the mainframe days.
>
> http://www.folklore.org/StoryView.py?project=Macintosh&story=The_Grand_Unified_Model_The_Finder.txt&sortOrder=Sort+by+Title
>
> Still, dictating what users can and can't do is policy. It exists,
> people use it...

> sigh. I'll try to take a stab at it some evening this week.
>
> > Mimi
>
> Rob

Thanks!

Mimi

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/