[racket-dev] Comparison functions and the `data' collection
This response leaves us with the impression that we whimsically challenged established conventions in the general and the Lisp-specific PL family. In addition it leaves us with inconsistencies across our libraries, which I am coming to consider more and more as a serious problem.
Now -- I would be the last one to defend "established tradition" over "getting things right" but at a minimum, the reasoning of why we challenge tradition should be noted.
-- Matthias
On Jun 11, 2012, at 5:40 PM, Ryan Culpepper wrote:
> On 06/11/2012 02:36 PM, Eli Barzilay wrote:
>> Yesterday, Danny Yoo wrote:
>>>
>>> It's a little unfortunate that there's a slight impedance mismatch
>>> between what datum-order provides and what sort expects; the
>>> my-less-than function in the example adapts the output of
>>> datum-order so it can be used with sort.
>>
>> Thanks for pointing it out (I didn't know about it). When I looked
>> into this, I saw that in addition to `data/order' there is also an
>> issue with `data/heap'. It certainly does look like a problem worth
>> addressing.
>>
>> The thing is that are two common ways to specify comparison functions:
>> (a) three-valued comparison functions returning -1/0/+1, (b) a boolan
>> strictly-smaller-than predicate. The first is common in several
>> mainsteam languages and the second is traditionally common in lisps
>> (which leads to its use in the sort function).
>>
>> The issues are described very thoroughly in srfi-67 -- specifically,
>> see the first two subsections in section 6. I highly recommend
>> reading that for this discussion. They decide to go with option (a),
>> with an explanation for the choice of a three-valued function and for
>> their choice of -1/0/+1 values.
>
> I don't remember if I discovered srfi-67 before or after I added data/order. In any case, I disagree with its rationale for -1/0/+1: I dislike the idea of performing arithmetic on orderings. I also prefer the pattern
>
> (case (compare a b)
> [(<) ...]
> [(=) ...]
> [(>) ...])
>
> to the corresponding version with numbers.
>
> OTOH, data/order is used primarily by the ordered dictionary types (data/splay-tree, data/skip-list), so it's probably missing operations and conveniences for tasks like sorting a list.
>
>> Very briefly, each of the two options has an advantage: the first
>> makes some cases faster since a single function call returns the
>> relation between the two whereas the second requires two calls to
>> distinguish equivalence from bigger-than. The second has an advantage
>> of being more convenient. (Note that with `#:cache-keys?', there's
>> probably some cases where it's easy to speed things up using some
>> preprocessing, but probably not in all cases.)
>>
>> So I see three problems here:
>>
>> 1. There might be a need for a variant of `sort' that is based on a
>> three-valued predicate. That's certainly doable, but I don't
>> remember anyone needing this in practice yet. Furthermore, doing
>> this kind of thing in such a core-ish function seems questionable
>> without more support for 3-valued comparisons -- that is, without
>> such actual functions that compare numbers, strings etc. So
>> perhaps this is better done in some additional library.
>>
>> 2. The `data/order' interface has several problems that I think is
>> best to resolve.
>>
>> - Regadless of the above, it seems like a good idea to extend the
>> interface with a simple boolean predicate. Maybe something like
>> `datum<?' and possibly others. This would address the issue that
>> Danny raised above.
>
> From an order you can get the less-than and equality predicates:
>
> (order-<? datum-order)
> (order-=? datum-order)
>
>> - I also think that it's not a great choice to require two
>> functions for creating orders when in many cases a single boolean
>> valued function can work fine. Providing a way to create an
>> order with a single function would make it easy to translate a
>> boolean predicate into an order. Is there any reason to require
>> an explicit equality?
>
> No, I can add 'order-from-<?' and 'order-from-<=?'. As you point out below, it would be difficult to add these options the 'order' function itself, though.
>
>> I don't know what would be a good way to do that. AFAICT, this
>> is not something that can fit the current interface because the
>> equality is the first argument -- if it was the second, then it
>> could be made optional. But in addition to that I see at least
>> one place (`order') where a single input function is assumed to
>> be a comparator. There could be a function that translates a
>> boolean predicate to a comparator but IMO it's very confusing as
>> is, that you can't just hand it over directly.
>>
>> - Another problem is the choice of '< '= and'> instead of the much
>> more popular -1 0 +1. This could be hand-waved away if it wasn't
>> in actual use in our neighborhood, but there is already srfi-67,
>> and I think that `soegaard/galore' on planet uses that too, and
>> generally it seems like a bad idea to go with something different
>> that leads to a constant need for translating values back and
>> forth. This comes in addition to the actual reasons for using
>> the integer values which are listed in the srfi (like using the
>> result as an index offset, getting a reversed order with a simple
>> negation, etc).
>
> See above.
>
>> 3. When I looked for mentions of sorting in the data collection, I
>> noticed `data/heap': this code *does* use a binary comparison, but
>> it's expecting a `<=?' instead of a `<?'. This makes it
>> incompatible with both (the current) `data/order' and with `sort'.
>> Is there a particular reason for this odd choice? (The best guess
>> I have is that this is a translation of some internal code?)
>
> IIRC, data/heap came before data/order and hasn't been updated.
>
>> In addition to that, there is the order of inputs to `heap-sort!'
>> which contradicts the order of inputs to `sort', which again, seems
>> like an inconsistent interface.
>
> Yes, that just seems like a mistake.
>
> Ryan
> _________________________
> Racket Developers list:
> http://lists.racket-lang.org/dev