Re: [PATCH 2/2] UUID: Time-based RFC 4122 UUID generator

From: Theodore Tso
Date: Sun Oct 21 2007 - 08:27:15 EST


On Sat, Oct 20, 2007 at 10:00:03PM +0200, Helge Deller wrote:
> The attached patch additionally adds the "Time-based UUID" variant
> of RFC 4122 to the Linux Kernel.
> With this patch applied, userspace applications may easily get real unique
> time-based UUIDs through /proc/sys/kernel/random/uuid_time.

> The attached implementation uses getnstimeofday() to get more fine-grained
> granularity than what would be possible with gettimeofday() in userspace.

I'm not convinced this is really needed, given that we have a UUID
generation library; we can just as easily use clock_gettime() to get
nanosecond granularity in userspace. The reason why I didn't add this
in libuuid a decade ago was that we didn't have anything that was near
even microsecond resolution back then. But this is a problem we can
fix in userspace.

> Additionally a mutex takes care of the proper locking against a mistaken
> double creation of UUIDs for simultanious running processes.

This is trickier to do in userspace, given that in userspace we have
to worry about malicious code that might grab and hold the mutex for
long periods of time. It is soluble, though.

> + /* Determine clock sequence (max. 14 bit) */
> + if (unlikely(!clock_seq_initialized)) {
> + get_random_bytes(&clock_seq, sizeof(clock_seq));
> + clock_seq &= 0x3fff;
> + clock_seq_started = clock_seq;
> + clock_seq_initialized = 1;

If you're really going to do this right, it should be possible to get
and set the clock sequence from userspace, since the clock sequence is
supposed to be retained across system bootups. Otherwise, it's
possible for you to get really unlucky, and pick the same clock
sequence number just at the same point that the clock gets changed.
Not all that likely, granted, but technically it's good thing to do
and required by RFC 4122.

I don't do this in the userspace library, but again that's because of
the security issue of dealiing with multiple userids needing to update
a file. (What we really need to do this right in userspace is the
concept of lightweight privileged shared libraries, such as what
Multics had.) So if that's going to be the justification to do this
in the kernel, it should be possible to set/get the clock sequence
number, so that in an init.d script, the clock sequence numer can be
fetched at bootup and stored at shutdown. (Yeah, that still leaves
open the possibility of how to handle an unclean shutdown. If you
really wanted to be anal about guaranteeing that a clock sequence
number was never reused, any time a clock sequence number is changed,
a hal event could be sent that would cause a userspace helper script
to sample the clock sequence and update the file.)

> + /* Set the spatially unique node identifier */
> +#ifdef CONFIG_NET
> + read_lock(&dev_base_lock);
> + for_each_netdev(&init_net, d) {
> + if (d->type == ARPHRD_ETHER && d->addr_len == ETH_ALEN
> + && d != init_net.loopback_dev) {
> + memcpy(&uuid_out[10], d->dev_addr, ETH_ALEN);
> + netdev_found = 1;
> + break;
> + }
> + }
> + read_unlock(&dev_base_lock);
> +#endif

This means you have to grab the dev_base_lock each time you generate a
UUID. In addition, since this code always grabs the first ethernet
device in the list, if the list has changed --- for example, if
someone inserts a PCMCIA wireless or ethernet card, or removes the
card for power management reasons --- you could end up changing the
MAC address and forcing a clock sequence bump even though it's not
necessary. So there are a couple of things that we might do here.

First of all, if we *know* that that a particular mac address is
associated with a card which is hardwired to the machine, we're
actually better off using it even if it is no longer present in the
list. But aside from getting a userspace hint, I don't see a good way
for us to implement this.

What you probably should do, since you're keeping the MAC address
around to compare if it changes, is to search the list to see if the
MAC address is still on the list, and use it if it's there; and if it
isn't, then you can use the first ethernet address found instead.

> + if (unlikely(!netdev_found)) {
> + /* use bootid's nodeID if no network interface found */

If CONFIG_NET is undefined, the netdev_found will always be false, so
the unlikely() would be particularly inappropriate. Probably will
only matter for embedded systems, but they won't thank you....

> + /* if MAC/NodeID changed, create a new clock_seq value */
> + if (unlikely(memcmp(&uuid_out[10], &last_mac, ETH_ALEN))) {
> + memcpy(&last_mac, &uuid_out[10], ETH_ALEN);
> + /* for very first UUID, do not increment clock_seq */
> + if (unlikely(clock_seq_initialized == 1))
> + clock_seq_initialized = 2;
> + else
> + goto inc_clock_seq;
> + }

The goto loop could be much simplified if you grab the MAC/NodeID
*before* you do clock_seq logic. Right now, if the Mac/NodeID
changes, you end up having to redo a lot of processing for no good
reason (and under the uuid_mutex).

- Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/