[racket-dev] Potential search improvement
The search still doesn't find words in function descriptions. For
example, http://pre.racket-lang.org/docs/html/search/index.html?q=sine
returns no results. This is especially frustrating since the very
first exercise in HtDP 1e is to use the search to find out whether
DrRacket has a sine function.
Justin
On Tue, May 29, 2012 at 7:17 AM, Eli Barzilay <eli at barzilay.org> wrote:
> I have made a possibly useful improvement to the JS search code.
> It's not pushed, yet, but I dropped the revised JS code on the
> pre-built pages so you can try it out here:
>
> http://pre.racket-lang.org/docs/html/search/
>
> and compare searches with the usual page:
>
> http://docs.racket-lang.org/search/
>
> I'd appreciate people playing with it to find out about potential
> problems with the ordering and possibly with different browsers.
>
>
> ** More about the change (especially if you want to try to improve
> things):
>
> This is not real ranking, but it should give better results overall.
> The search assigns a small integer "score" to each term, where the
> scores are (roughly):
>
> 0 no match,
> 1 match-all-subword-parts,
> 2 contains a match,
> 3 matches a prefix,
> 4 exact match.
>
> These used to be lumped into two groups, with exact matches first.
> Now each score is in its own group, so there's a little more order.
> To see an example that works nicely now, try "splay".
>
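
A rough sketch of the per-score grouping described above, for anyone who
wants to experiment with it. This is not the actual search code: the
names scoreTerm, subwordParts and groupByScore are made up for
illustration, and the "match-all-subword-parts" rule in particular is a
guess (every part of the term is a prefix of some part of the name).

```javascript
// Hypothetical constants mirroring the five scores listed above.
const NO_MATCH = 0, SUBWORD_PARTS = 1, CONTAINS = 2, PREFIX = 3, EXACT = 4;

// Assumption: "subword parts" are the runs between non-alphanumeric characters.
function subwordParts(s) {
  return s.split(/[^a-z0-9]+/i).filter((p) => p.length > 0);
}

// Score one search term against one entry name (case-insensitive).
function scoreTerm(term, name) {
  const t = term.toLowerCase();
  const n = name.toLowerCase();
  if (t.length === 0) return NO_MATCH;
  if (n === t) return EXACT;
  if (n.startsWith(t)) return PREFIX;
  if (n.includes(t)) return CONTAINS;
  // Guessed rule: every part of the term starts some part of the name.
  const termParts = subwordParts(t);
  const nameParts = subwordParts(n);
  const allFound =
    termParts.length > 0 &&
    termParts.every((tp) => nameParts.some((np) => np.startsWith(tp)));
  return allFound ? SUBWORD_PARTS : NO_MATCH;
}

// One bucket per score, emitted highest score first. Entries keep their
// original relative order inside a bucket, so no sorting is needed.
function groupByScore(term, names) {
  const buckets = [[], [], [], [], []];
  for (const name of names) buckets[scoreTerm(term, name)].push(name);
  return buckets[EXACT].concat(
    buckets[PREFIX], buckets[CONTAINS], buckets[SUBWORD_PARTS]);
}
```

With this sketch, groupByScore("fold", ["foldl", "unfold", "fold", "map"])
puts "fold" first, then "foldl", then "unfold", and drops "map".
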
> This doesn't solve all problems... To see problematic things (that
> Neil has complained about in the past) try:
>
> * "port" (gives precedence for exact matches, but the reference
> entries are better; better now with the chapters appearing right
> after the exact binding matches).
>
> * "fold" (same problem, where it could be argued that for most
> people "foldl" from `racket/base' is better than "fold" from the
> DMdA languages and `srfi/1').
>
> Some of the problem comes from having no preferences for the results.
> Such preferences are not hard to implement, but they connect two
> unrelated pieces of code (the score assignments in the JS search, and
> the bonus for each manual) and it can quickly get into sticky
> questions.
>
> Another aspect of the problem is that there are N search terms, not
> just one. Currently, the scores for the terms are combined with a
> `min'; a `max' tends to be worse. Ideally, it would use an average,
> but that would require actually sorting the results.
>
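
To make the min/max/average trade-off concrete, here is a small sketch.
It is not the real search code: scoreTerm is a simplified stand-in that
leaves out the subword-parts case, and combineMin, combineMax and
combineAvg are names invented for this example.

```javascript
// Simplified, hypothetical per-term scorer (0, 2, 3 or 4 only).
function scoreTerm(term, name) {
  if (name === term) return 4;        // exact match
  if (name.startsWith(term)) return 3; // prefix match
  if (name.includes(term)) return 2;   // contains a match
  return 0;                            // no match
}

// Score of an entry under each way of combining the per-term scores.
function combineMin(terms, name) {
  if (terms.length === 0) return 0;
  return Math.min(...terms.map((t) => scoreTerm(t, name)));
}

function combineMax(terms, name) {
  if (terms.length === 0) return 0;
  return Math.max(...terms.map((t) => scoreTerm(t, name)));
}

function combineAvg(terms, name) {
  if (terms.length === 0) return 0;
  const total = terms.reduce((sum, t) => sum + scoreTerm(t, name), 0);
  return total / terms.length;
}

const terms = ["string", "port"];
for (const name of ["port->string", "open-input-string"]) {
  console.log(name,
    combineMin(terms, name), combineMax(terms, name), combineAvg(terms, name));
}
```

In this sketch, `min' gives 0 to any entry that misses one of the terms,
so "open-input-string" drops out for the query "string port", while `max'
still gives it a 2. The average comes out fractional (2.5 for
"port->string"), so the results no longer fall into the five integer
groups, which is presumably why it would need a real sort.
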
> --
> ((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
> http://barzilay.org/ Maze is Life!
> _________________________
> Racket Developers list:
> http://lists.racket-lang.org/dev