[plt-scheme] Strange benchmark for string=?

From: Danny Yoo (dyoo at hkn.eecs.berkeley.edu)
Date: Sun Jul 3 18:29:07 EDT 2005

On Sun, 3 Jul 2005, Jens Axel Søgaard wrote:

>  > (let ((s1 (make-string 50000 #\a))
>          (s2 (make-string 50001 #\a)))
>      (define (my-string=? s1 s2)
>        (and (= (string-length s1)
>                (string-length s2))
>             (string=? s1 s2)))
>      (time (do ((i 0 (+ i 1)))
>              ((= i 10000))
>              (my-string=? s1 s2))))
> cpu time: 40 real time: 90 gc time: 0
> Is this the fault of unicode or just an oversight in string=? ?

Hi Jens,

I don't think this has much to do with Unicode.  I'm guessing that the
implementation of string=? just uses a general string comparison, so
that there isn't much of a performance difference, even when the strings
are obviously unequal.

According to src/string.c from the SVN repository:

GEN_STRING_COMP(string_eq, "string=?", mz_char_strcmp, ==, 0)
GEN_STRING_COMP(string_lt, "string<?", mz_char_strcmp, <, 0)
GEN_STRING_COMP(string_gt, "string>?", mz_char_strcmp, >, 0)
GEN_STRING_COMP(string_lt_eq, "string<=?", mz_char_strcmp, <=, 0)
GEN_STRING_COMP(string_gt_eq, "string>=?", mz_char_strcmp, >=, 0)

the code block here suggests that string=? just uses a straightforward
mz_char_strcmp, so it doesn't appear to be doing any special-case
optimization for string equality.

I don't think the string-length optimization would be hard to code, but
it would make the code just slightly more complex, and there'd be a little
bit of code duplication of most of the stuff in the GEN_STRING_COMP C macro.
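
To make the idea concrete, here is a rough sketch in C of what the length
shortcut might look like.  This is not the MzScheme source: sketch_char,
sketch_strcmp, and the two sketch_string_eq_* functions are names I made up
for illustration, and the real mz_char_strcmp may take different arguments.
It only shows the general shape: equality derived from a three-way
comparison, versus equality that checks the lengths first.

```c
#include <stddef.h>

/* Hypothetical stand-in for a wide character type; not MzScheme's own. */
typedef unsigned int sketch_char;

/* Three-way comparison of two length-counted strings.  Returns a negative,
   zero, or positive value.  It may scan up to min(len1, len2) characters
   before it can decide. */
static int sketch_strcmp(const sketch_char *s1, size_t len1,
                         const sketch_char *s2, size_t len2)
{
    size_t n = (len1 < len2) ? len1 : len2;
    size_t i;

    for (i = 0; i < n; i++) {
        if (s1[i] != s2[i])
            return (s1[i] < s2[i]) ? -1 : 1;
    }
    /* Common prefix is identical: the shorter string sorts first. */
    if (len1 == len2)
        return 0;
    return (len1 < len2) ? -1 : 1;
}

/* Equality built directly on the generic comparison.  For two strings that
   share a long prefix but differ in length, the whole prefix is scanned. */
static int sketch_string_eq_generic(const sketch_char *s1, size_t len1,
                                    const sketch_char *s2, size_t len2)
{
    return sketch_strcmp(s1, len1, s2, len2) == 0;
}

/* Equality with the length shortcut: strings of different lengths can
   never be equal, so answer immediately without touching the characters. */
static int sketch_string_eq_fast(const sketch_char *s1, size_t len1,
                                 const sketch_char *s2, size_t len2)
{
    if (len1 != len2)
        return 0;
    return sketch_strcmp(s1, len1, s2, len2) == 0;
}
```

With strings like the ones in Jens's benchmark (50000 and 50001 copies of
the same character), the generic version has to walk the whole common
prefix before it can answer, while the shortcut version answers after one
length comparison.  The shortcut only helps equality; the ordering
predicates still need the full comparison.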

Best of wishes to you!

Posted on the users mailing list.