[racket-dev] Potential search improvement

From: Justin Zamora (justin at zamora.com)
Date: Tue May 29 10:18:55 EDT 2012

The search still doesn't find words in function descriptions.  For
example, http://pre.racket-lang.org/docs/html/search/index.html?q=sine
returns no results.  This is especially frustrating since the very
first exercise in HTDP 1e is to use the search to find out whether
DrRacket has a sine function.

Justin

On Tue, May 29, 2012 at 7:17 AM, Eli Barzilay <eli at barzilay.org> wrote:
> I have made a possibly useful improvement to the JS search code.
> It's not pushed, yet, but I dropped the revised JS code on the
> pre-built pages so you can try it out here:
>
>  http://pre.racket-lang.org/docs/html/search/
>
> and compare searches with the usual page:
>
>  http://docs.racket-lang.org/search/
>
> I'd appreciate people playing with it to find out about potential
> problems with the ordering and possibly with different browsers.
>
>
> ** More about the change (especially if you want to try to improve
>   things):
>
> This is not real ranking, but it should give better results overall.
> The thing is that the search assigns a small integer "score" for each
> term, where the scores are (roughly)
>
>  0 no match,
>  1 match-all-subword-parts,
>  2 contains a match,
>  3 matches a prefix,
>  4 exact match.
>
> These used to be lumped into two groups, with exact matches first.
> Now I made each of these be in its own group, so there's a little
> more order.  To see an example that works nicely now try "splay".
>
> This doesn't solve all problems...  To see problematic things (that
> Neil has complained about in the past) try:
>
>  * "port" (gives precedence for exact matches, but the reference
>    entries are better; better now with the chapters appearing right
>    after the exact binding matches).
>
>  * "fold" (same problem, where it could be argued that for most
>    people "foldl" from `racket/base' is better than "fold" from the
>    DMdA languages and `srfi/1').
>
> Some of the problem comes from having no preferences for the results.
> Such preferences are not hard to implement, but they connect two
> unrelated pieces of code (the score assignments in the JS search, and
> the bonus for each manual) and it can quickly get into sticky
> questions.
>
> Another aspect of the problem is that there are N search terms, not
> just one.  Currently, the scores for the terms are combined with a
> `min'; a `max' tends to be worse.  Ideally, it would use an average,
> but that would require actually sorting the results.
>
> --
>          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>                    http://barzilay.org/                   Maze is Life!
> _________________________
>  Racket Developers list:
>  http://lists.racket-lang.org/dev
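
For readers who want to experiment with the scheme described in the
quoted message, here is a minimal sketch of the idea: a small integer
score per term, a `min' across terms, and one group per score.  It is
not the actual search code.  The names scoreTerm, scoreEntry, and
search are hypothetical, and the reading of "match-all-subword-parts"
(every hyphen-separated part of the term occurs in the name) is only
an assumption.

```javascript
// Illustrative sketch only; NOT the actual Racket docs search code.
// Score tiers as listed in the quoted message.
const NO_MATCH = 0, SUBWORD = 1, CONTAINS = 2, PREFIX = 3, EXACT = 4;

// Hypothetical scorer for one search term against one entry name.
function scoreTerm(term, name) {
  if (name === term) return EXACT;
  if (name.startsWith(term)) return PREFIX;
  if (name.includes(term)) return CONTAINS;
  // ASSUMPTION: "match-all-subword-parts" is taken to mean that every
  // hyphen-separated part of the term occurs somewhere in the name.
  const parts = term.split("-").filter((p) => p.length > 0);
  if (parts.length > 1 && parts.every((p) => name.includes(p))) {
    return SUBWORD;
  }
  return NO_MATCH;
}

// Combine the per-term scores with `min`, as the message describes.
function scoreEntry(terms, name) {
  if (terms.length === 0) return NO_MATCH;
  return Math.min(...terms.map((t) => scoreTerm(t, name)));
}

// Put each entry in the group for its score, then emit the groups
// from best to worst.  No sort is needed because the scores are a
// handful of small integers.
function search(terms, names) {
  const groups = [[], [], [], [], []];
  for (const name of names) {
    groups[scoreEntry(terms, name)].push(name);
  }
  return [].concat(
    groups[EXACT], groups[PREFIX], groups[CONTAINS], groups[SUBWORD]
  );
}
```

With this sketch, search(["fold"], ["foldl", "unfold", "fold", "map"])
puts "fold" ahead of "foldl", which is the ordering the message
flags as debatable.  Swapping Math.min for an average would produce
fractional scores, so the fixed groups would no longer be enough and
the results would have to be sorted, as the message notes.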


Posted on the dev mailing list.