[racket-dev] `string-split'
[Meta-note: I'm not flatly objecting to these, just trying to
clarify the exact behavior and the possible effects on other
functions.]
10 minutes ago, Laurent wrote:
>
>
>   (define (string-split str [sep #px"\\s+"])
>     (remove* '("") (regexp-split sep str)))
>
> Nearly, I meant something more like this:
>
>   (define (string-split str [splitter " "])
>     (regexp-split (regexp-quote splitter) str))
>
> No regexp from the user POV, and much easier to use with little
> knowledge.
That doesn't seem right -- with this you get
  -> (string-split " st  ring")
  '("" "st" "" "ring")
which is why I think that the first definition above is the better one
in terms of newbie-ness.
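To make the difference concrete, here is a small sketch that puts the two quoted definitions side by side. The names `split-dropping-empties' and `split-on-quoted' are hypothetical, chosen only so both versions can live in one module; the bodies are the ones quoted above.

```racket
#lang racket

;; First proposal: split on runs of whitespace, then drop empty strings.
;; (Hypothetical name, used only for this comparison.)
(define (split-dropping-empties str [sep #px"\\s+"])
  (remove* '("") (regexp-split sep str)))

;; Second proposal: treat the separator as a literal string.
;; (Hypothetical name, used only for this comparison.)
(define (split-on-quoted str [splitter " "])
  (regexp-split (regexp-quote splitter) str))

(split-dropping-empties " st  ring") ; => '("st" "ring")
(split-on-quoted " st  ring")        ; => '("" "st" "" "ring")
```

The empty strings in the second result come from the leading space and from the two adjacent spaces, each of which counts as its own separator.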
10 minutes ago, Matthew Flatt wrote:
> I agree with this: we should add `string-split', the one-argument case
> should be as Eli wrote, and the two-argument case should be as Laurent
> wrote. (Probably the optional second argument should be string-or-#f,
> where #f means to use #px"\\s+".)
Continuing with this line, it seems that a better definition is as
follows:
  (define (string-split str [sep " "])
    (remove* '("") (regexp-split (regexp-quote (or sep " ")) str)))
Except that the full definition could be a bit more efficient.
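As a quick check of what this definition does, here is a sketch that reproduces it under the hypothetical name `string-split*' (to avoid clashing with any existing binding) and tries a few inputs. The sample strings are made up for illustration.

```racket
#lang racket

;; Same body as the definition above; only the name differs
;; (string-split* is a hypothetical name for this sketch).
(define (string-split* str [sep " "])
  (remove* '("") (regexp-split (regexp-quote (or sep " ")) str)))

(string-split* " st  ring")  ; => '("st" "ring")
(string-split* "a,,b," ",")  ; => '("a" "b")
(string-split* "a.b" ".")    ; => '("a" "b")   "." is quoted, so it is literal
(string-split* "a b" #f)     ; => '("a" "b")   #f falls back to " "
(string-split* "a\tb")       ; => '("a\tb")    a tab is not a separator here
```

The last case shows one way this differs from the #px"\\s+" version: with a literal " " default, only spaces split the string.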
Four questions:
1. Laurent: Does this make more sense?
2. Matthew: Is there any reason to make the #f-as-default part of the
interface? (Even with the new reply I don't see a necessity for
this -- if the target is newbies, then I think that keeping it as a
string is simpler...)
3. There's also the point of how this optional argument plays with
other functions in `racket/string'. If it works as above, then
`string-trim' and `string-normalize-spaces' should change
accordingly, so that they take the same kind of simplified "regexp"
as input.
4. Related to Q3: what does "xy" as that argument mean exactly?
a. #rx"[xy]"
b. #rx"[xy]+"
c. #rx"xy"
d. #rx"(?:xy)+"
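To see how the four readings differ in practice, here is a sketch that runs each regexp through `regexp-split' on one input. The sample string is an assumption picked so that all four give different results; it is not from the thread.

```racket
#lang racket

;; Illustrative input (an assumption, chosen to separate the four cases).
(define sample "1xy2yx3xyxy4")

(define split-a (regexp-split #rx"[xy]" sample))    ; any single x or y
(define split-b (regexp-split #rx"[xy]+" sample))   ; any run of x/y chars
(define split-c (regexp-split #rx"xy" sample))      ; the literal "xy"
(define split-d (regexp-split #rx"(?:xy)+" sample)) ; a run of "xy"s

split-a ; => '("1" "" "2" "" "3" "" "" "" "4")
split-b ; => '("1" "2" "3" "4")
split-c ; => '("1" "2yx3" "" "4")
split-d ; => '("1" "2yx3" "4")
```

Readings (a) and (b) treat the argument as a set of characters, while (c) and (d) treat it as a literal string; the "+" variants only differ from their counterparts in whether adjacent separators produce empty strings.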
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!