[racket] string-trim : an implementation & a question
I've lost track of what the function is supposed to be doing, but your
two functions don't agree on the input "a ", I don't think. I get
this:
(define (string-trim.1 s)
(regexp-replace #px"^\\s*([^\\s]*)\\s*$" s "\\1"))
(define (string-trim.2 s)
(define-syntax scan
(syntax-rules ()
((_ s start end step)
(for/first ((i (in-range start end step))
#:when (not (char-whitespace? (string-ref s i))))
i))))
(let* ((len (string-length s))
(last-index (sub1 len))
(start (or (scan s 0 len 1) 0))
(end (or (scan s last-index start -1) last-index)))
(substring s start (add1 end))))
> (string-trim.2 "a ")
"a "
> (string-trim.1 "a ")
"a"
On Sat, Apr 2, 2011 at 5:03 PM, Jon Zeppieri <zeppieri at gmail.com> wrote:
> Actually #rx seems to be much faster than #px (in this case, at any rate),
> but it's still slower:
>> (test)
> cpu time: 1162 real time: 1181 gc time: 40
> cpu time: 230 real time: 230 gc time: 0
>> (test)
> cpu time: 1184 real time: 1198 gc time: 38
> cpu time: 258 real time: 259 gc time: 21
>> (test)
> cpu time: 1220 real time: 1544 gc time: 40
> cpu time: 233 real time: 233 gc time: 0
>
> On Sat, Apr 2, 2011 at 5:56 PM, Jon Zeppieri <zeppieri at gmail.com> wrote:
>>
>> I was a bit surprised to find that the scanning-by-hand approach really is
>> significantly faster than using regexps.
>> Between these two functions:
>> (define (string-trim s)
>> (regexp-replace #px"^\\s*([^\\s]*)\\s*$" s "\\1"))
>> ... and ...
>> (define (string-trim s)
>> (define-syntax scan
>> (syntax-rules ()
>> ((_ s start end step)
>> (for/first ((i (in-range start end step))
>> #:when (not (char-whitespace? (string-ref s i))))
>> i))))
>>
>> (let* ((len (string-length s))
>> (last-index (sub1 len))
>> (start (or (scan s 0 len 1) 0))
>> (end (or (scan s last-index start -1) last-index)))
>> (substring s start (add1 end))))
>>
>> ... the latter is much faster. On 100000 iterations, using the test
>> string:
>> " \n \t foo bar
>> baz\n \r "
>> as input, I'm getting numbers like these (where the first time is for the
>> regexp function and the second is for the hand-scanning function):
>> > (test)
>> cpu time: 8003 real time: 8008 gc time: 0
>> cpu time: 256 real time: 257 gc time: 22
>> > (test)
>> cpu time: 8028 real time: 8025 gc time: 0
>> cpu time: 255 real time: 255 gc time: 22
>> > (test)
>> cpu time: 8418 real time: 8424 gc time: 0
>> cpu time: 260 real time: 260 gc time: 22
>> > (test)
>> cpu time: 8390 real time: 8401 gc time: 0
>> cpu time: 252 real time: 253 gc time: 20
>>
>>
>>
>> On Sat, Apr 2, 2011 at 5:20 PM, Richard Cleis <rcleis at mac.com> wrote:
>>>
>>> You can use an index to the string to find the location of your goal,
>>> then return the substring when you are done.
>>>
>>> rac
>>>
>>> On Apr 2, 2011, at 3:08 PM, Charles Hixson wrote:
>>>
>>> > This seems to be what I want the string-trim to do, but it seems that
>>> > all the string copying would be expensive. Is there a way to improve it by
>>> > avoiding the string copying?
>>> >
>>> > My original inclination was to use a while loop with a test for
>>> > non-whitespace, but that appears to not be something scheme supports.
>>> >
>>> > (define (string-trim s)
>>> > (let ( (l (string-length s) ) )
>>> > (cond
>>> > [ (= l 0) #f]
>>> > [ (char-whitespace? (string-ref s (- l 1) ) ) (string-trim
>>> > (substring s 0 (- l 1) ) ) ]
>>> > [else s]) ) )
>>> > _________________________________________________
>>> > For list-related administrative tasks:
>>> > http://lists.racket-lang.org/listinfo/users
>>>
>>> _________________________________________________
>>> For list-related administrative tasks:
>>> http://lists.racket-lang.org/listinfo/users
>>
>
>
> _________________________________________________
> For list-related administrative tasks:
> http://lists.racket-lang.org/listinfo/users
>