[racket] string-strip
On 28-12-11 19:27, Neil Van Dyke wrote:
> Marijn wrote at 12/28/2011 12:00 PM:
>> I don't think my use of this code is very performance-critical,
>> but I couldn't help myself, so I looked into making it faster.
>
> This is the best spirit. :)
>
>> What I found was that it is much slower to treat a string as a
>> port and then read-char from that than it is to directly index
>> the string.
>
> That string input ports are often noticeably slower than string
> indexing is one of the banes of my existence. Most reading and
> parsing operations you implement, you want to work on both ports
> and strings. But, if you first write a procedure that works on a
> port, and then write a wrapper procedure that works on a string (by
> doing an "open-input-string" and calling your procedure that works
> on ports), the string one can be noticeably slower than if you'd
> handwritten the string one. But having to write two separate
> procedures has big development cost, and I always just take the
> performance hit on strings instead, or write a string procedure and
> then not have a port procedure when I need it later. One approach
> that might help is to design a macro that lets people define
> processing on strings and ports, and expands to produce two closure
> definitions -- one that works on ports, and one on strings, and
> avoids a lot of port-related overhead in the string one.
Matthew, any comments on this? Is there a fundamental reason that
treating a string as a port is so much slower than direct indexing or
is there something that can be done about it? Or should we look into
automatically duplicating code with macros?
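
To make the comparison concrete, here is a minimal sketch of the two
styles being discussed. The procedure names are hypothetical and the task
(counting leading whitespace) is only a stand-in for the real code; it is
meant to show the shape of the two loops, not to reproduce the original
benchmark.

```racket
#lang racket/base

;; Hypothetical example: count leading whitespace characters by wrapping
;; the string in an input port and calling read-char.
(define (count-leading-ws/port str)
  (define in (open-input-string str))
  (let loop ([n 0])
    (define c (read-char in))
    ;; read-char returns eof at the end of input, so check char? first.
    (if (and (char? c) (char-whitespace? c))
        (loop (add1 n))
        n)))

;; The same computation by indexing the string directly with string-ref.
(define (count-leading-ws/index str)
  (define len (string-length str))
  (let loop ([i 0])
    (if (and (< i len) (char-whitespace? (string-ref str i)))
        (loop (add1 i))
        i)))
```

Wrapping each call in a loop under `time`, for example
`(time (for ([_ (in-range 1000000)]) (count-leading-ws/port "   abc")))`,
and doing the same for the indexing version, gives a rough comparison; the
actual ratio will depend on the Racket version and the input.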
Marijn
>> In the end I was able to construct code that is another factor
>> 5.5 faster than your version:
>>
>
> Marijn, that's a great implementation.
>
> And it's encouraging to see that forgoing regexps and
> "string->number", and doing a character-by-character DFA, is faster
> in this case. It's pretty common in interpreted languages to use
> regexps for performance reasons; nice when we see that pure Racket
> code can be faster.
>
> One tangential comment: I don't think it's significant in this
> situation, but I believe that "case" is currently slower than
> people coming from a C background might expect. So, if one really
> wants to micro-optimize, one might sometimes be better off doing,
> say, strategic "if"s and arithmetic instead of "case".
>
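
As an illustration of the "case" versus "if"-and-arithmetic remark, here
is a small sketch with two hypothetical procedures that map a digit
character to its numeric value. Whether the arithmetic version is actually
faster is an assumption to be measured, since it depends on how the
installed Racket compiles "case".

```racket
#lang racket/base

;; Hypothetical example: dispatch on the character with case.
(define (digit-value/case c)
  (case c
    [(#\0) 0] [(#\1) 1] [(#\2) 2] [(#\3) 3] [(#\4) 4]
    [(#\5) 5] [(#\6) 6] [(#\7) 7] [(#\8) 8] [(#\9) 9]
    [else #f]))

;; The same mapping with one subtraction and one range test.
(define (digit-value/arith c)
  (define n (- (char->integer c) (char->integer #\0)))
  (if (<= 0 n 9) n #f))
```

Both return #f for a non-digit character, so either one can be dropped
into a character-by-character scanning loop.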