Re: [PATCH 5/5] lib/list_sort: Optimize number of calls to comparison function
From: George Spelvin
Date: Sun Dec 08 2019 - 03:07:43 EST
On Sat, 22 Jun 2019 at 01:12:01, Rasmus Villemoes wrote:
> On 14/03/2019 02.58, George Spelvin wrote:
>> On Thu, 14 Mar 2019 at 00:28:16 +0100, Rasmus Villemoes wrote:
>
>>> Similarly one could do a SORT_SIMPLE_CMP() for when sorting an array of
>>> structs according to a single numeric member. That sort is not stable,
>>> so the comparison functions would have to do a full -1,0,1 return, of
>>> course, but we'd still avoid indirect calls.
>>
>> Actually, while some sorts (notably fat-pivot quicksort) require
>> a 3-way return to avoid O(n^2) performance, heapsort is just fine
>> with a boolean return value.
>
> Hi George
>
> So I tried starting to implement this, and the timing results look
> promising. However, currently I'm doing
>
> (*(u32*)ka > *(u32*)kb) - (*(u32*)ka < *(u32*)kb);
>
> in my do_cmp. Both existing invocations of the comparison function in
> sort.c compare its return value >= 0, which is always true if I just
> return the boolean (*(u32*)ka > *(u32*)kb). So it seems the algorithm
> would have to be changed to allow the cmp function to return a bool.
> Perhaps it's as simple as changing the two >= to >, but can I get you to
> check that that would be ok? (Quick testing seems to suggest so, but
> it's possible there are some corner cases where it would break.)
The only reason for >= vs. > is to do less work in the == case.
(I prefer to stay in the first backtracking loop, which does no
data motion, and therby reduce the number of swaps performed in
the second part of the backtracking.)
If you want to preserve the logic exactly, you can replace
"do_cmp(a, b, ...) >= 0" with "do_cmp(b, a, ...) <= 0" and then
the conversion is straightforward.