[racket-dev] Potential search improvement

From: Sam Tobin-Hochstadt (samth at ccs.neu.edu)
Date: Tue May 29 08:59:50 EDT 2012

On Tue, May 29, 2012 at 7:33 AM, Eli Barzilay <eli at barzilay.org> wrote:
> Just now, Sam Tobin-Hochstadt wrote:
>> On Tue, May 29, 2012 at 7:17 AM, Eli Barzilay <eli at barzilay.org> wrote:
>> >
>> > ** More about the change (especially if you want to try to improve
>> >   things):
>> >
>> > This is not real ranking, but it should give better results overall.
>> > The thing is that the search assigns a small integer "score" for each
>> > term, where the scores are (roughly)
>> >
>> >  0 no match,
>> >  1 match-all-subword-parts,
>> >  2 contains a match,
>> >  3 matches a prefix,
>> >  4 exact match.
>>
>> I think you probably want to rank/divide '1' here based on how much of
>> the identifier is matched by the search.  For example, if you search
>> for 'current-sep-line', you probably want 'current-line-sep' first,
>> but currently you get 'current-alist-line-sep' first.
>
> Like I said: [...] but that would require actually sorting the
> results.
>
> (The thing is that now it does something like
>
>  matches[score].push(entry)
>
> and then it concatenates all of the matches arrays.  To have arbitrary
> score values, it would need to put everything in one array and then sort
> it.  That can currently get to ~20k things to sort and adjust for
> additional entries that get added on each release, planet packages,
> etc.)

Setting aside the discussion of sorting speed, I don't think my
suggestion even requires sorting: just add a score of 1.5, i.e. one
extra bucket between 1 and 2, for match-all-subword-parts-to-whole-id.
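
A minimal sketch of that idea, keeping the bucket-and-concatenate scheme
described above and adding one tier instead of sorting. The names
`TIERS`, `scoreEntry`, and `search` are hypothetical, and splitting
identifiers on "-" is only an assumption about how subword parts are
found; the real search code may match parts differently.

```javascript
// Hypothetical tiers, best first; 1.5 is the proposed extra tier.
const TIERS = [4, 3, 2, 1.5, 1];

// Hypothetical scorer, assuming subword parts are separated by "-".
function scoreEntry(term, id) {
  if (id === term) return 4;          // exact match
  if (id.startsWith(term)) return 3;  // matches a prefix
  if (id.includes(term)) return 2;    // contains a match
  const termParts = term.split("-");
  const idParts = id.split("-");
  // Tier 1: every part of the query appears among the identifier's parts.
  if (!termParts.every((p) => idParts.includes(p))) return 0;
  // Tier 1.5: additionally, the query accounts for the whole identifier.
  const coversWholeId = idParts.every((p) => termParts.includes(p));
  return coversWholeId ? 1.5 : 1;
}

function search(term, ids) {
  // One bucket per tier, keyed by score.
  const buckets = new Map(TIERS.map((t) => [t, []]));
  for (const id of ids) {
    const score = scoreEntry(term, id);
    if (score > 0) buckets.get(score).push(id);
  }
  // Concatenate the buckets best-first; nothing is sorted.
  return TIERS.flatMap((t) => buckets.get(t));
}
```

Under these assumptions, searching for 'current-sep-line' puts
'current-line-sep' in the 1.5 bucket and 'current-alist-line-sep' in the
1 bucket, so the former comes first without sorting the ~20k entries.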

-- 
sam th
samth at ccs.neu.edu


Posted on the dev mailing list.