Generic rbtree: compare() vs less()
From: Daniel Santos
Date: Sun Jun 03 2012 - 19:16:15 EST
Peter,
I wanted to get with you on something you mentioned in IRC, as well as
one of your emails. The (primary) reason my code uses a compare()
function instead of less() is that it has to work with trees that allow
unique keys. In the fair scheduler, it doesn't matter because:
a.) duplicate keys are allowed
b.) you never need to do lookups.
If you had to do lookups, you would have to call less() twice:
if (less(p->key, key)) {
p = p->right;
} else if (less(key, p->key)) {
p = p->right;
} else {
return p;
}
Using compare() means that you only call this function once and allows
for the possibility that some trees might have a more complication
compare function. Plus, on gcc 4.5 and earlier, it's not inlining the
compare function call. :(
The reason I'm returning long from compare() instead of returning int,
is to hopefully simplify the generated code for compare functions that
need to subtract two 64-bit values and still not incur additional
overhead. If I used int, then the compare function for a 64-bit value
would have to look like this:
int compare(u64 *a, u64 *b) {
return *a > *b ? 1 : (*a < *b ? -1 : 0);
}
// or
int compare(u64 *a, u64 *b) {
s64 diff = *a - *b;
return diff > 0 ? 1: (diff < 0 ? -1 : 0);
}
And that's a whole lot more instructions, unless the compiler can inline
the function, compared the two values, use the CPU's negative & zero
flags and just completely compiled out int return value which, sadly,
just doesn't seem to happen very often. :( (especially on 4.5 and
earlier where it fails to inline the compare function call).
Incidentally, this is what I've done on my fair.c patch:
+static inline long compare_vruntime(u64 *a, u64 *b)
+{
+#if __BITS_PER_LONG >= 64
+ return (long)((s64)*a - (s64)*b);
+#else
+/* This is hacky, but is done to reduce instructions -- we wont use
this for
+ * rbtree lookups, only inserts, and since our relationship is defined as
+ * non-unique, we only need to return positive if a > b and any other value
+ * means less than.
+ */
+ return (long)(*a > *b);
+#endif
+}
This should keep the instruction count pretty much what it was prior to
my patch on 32-bit systems as well as 64.
Anyway, please let me know if you have any other ideas on this! I'm
going to write up a Q&A or something that explains the reasons for some
of these seemingly odd decisions. (maybe I'll even find some better
alternatives after getting feedback)
Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/