[racket-dev] `string-split'

From: Ryan Culpepper (ryan at cs.utah.edu)
Date: Thu Apr 19 09:42:16 EDT 2012

Instead of trying to design a 'string-split' that is both miraculously 
intuitive and profoundly flexible, why not design it like a Model-T and 
then write a guide/cookbook for how to use regexps to do all of the 
common cases that the extremely limited 'string-split' doesn't handle?

I suspect that writing such a guide will expose a few cases where common 
patterns can be turned into functions (similar to 'regexp-replace-quote').
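
As a rough illustration of what a few entries in such a cookbook might
look like, here is a minimal sketch. The helper names (split-on-whitespace,
split-on-literal, split-columns) are hypothetical and not part of any
proposed API; only regexp-split, regexp-quote, and string-trim are existing
Racket functions.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical helper: split on runs of whitespace, trimming the ends
;; first so no empty strings appear at the edges.
(define (split-on-whitespace s)
  (regexp-split #px"\\s+" (string-trim s)))

;; Hypothetical helper: split on a literal separator string. The separator
;; is quoted so that regexp metacharacters such as "." are taken literally.
(define (split-on-literal s sep)
  (regexp-split (regexp (regexp-quote sep)) s))

;; Hypothetical helper: split LaTeX-style columns on "&", keeping empty
;; columns and trimming the whitespace around each cell.
(define (split-columns s)
  (map string-trim (regexp-split #rx"&" s)))

(split-on-whitespace "  a  b\tc ")  ; => '("a" "b" "c")
(split-on-literal "a.b.c" ".")      ; => '("a" "b" "c")
(split-columns "x && y & z")        ; => '("x" "" "y" "z")
```

Each case is a one-liner over regexp-split, which is the kind of recurring
pattern that could later be promoted to a library function.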


On 04/19/2012 07:27 AM, Eli Barzilay wrote:
> Just now, Laurent wrote:
>>      1. Laurent: Does this make more sense?
>> Yes, this definitely makes more sense to me.  It would then treat
>> (string-split "aXXbXXXXy" "X") just like the " " case.
>> Although if you want to find the columns of a latex line like "x&&
>> y&  z" you will have the wrong result.  Maybe use an optional
>> argument to remove the empty strings? (not sure)
> (This complicates things...)
> First, I don't think that there's a need to make it able to do stuff
> like that -- either you go with regexps, or you use combinations like
>    (map string-trim (string-split "x&&  y&  z" "&"))
>>      4. Related to Q3: what does "xy" as that argument mean exactly?
>>        a. #rx"[xy]"
>>        b. #rx"[xy]+"
>>        c. #rx"xy"
>>        d. #rx"(?:xy)+"
>> Good question. d. would be the simplest case for newbies, but
>> b. might be more useful.  I think several other languages avoid this
>> issue by using only one character as the separator.
> The complication is that with " " or " \t" it seems that you'd want b,
> and with "&" you'd want c.  (Maybe even make "&" equivalent to
> #rx" *&  *" -- that looks like it's too much guessing.)
> And you're also making a point for:
>    e. Throw an error, must be a single-character string.
> BTW, this question is important because it affects other functions, so
> I'd like to resolve it before doing anything.
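
To make the difference between readings a through d of "xy" concrete, the
following sketch runs each candidate regexp through regexp-split on one
input. The names readings and split-all, and the sample string, are made up
for illustration; this is not a proposed implementation of string-split.

```racket
#lang racket/base

;; The four candidate readings of the separator argument "xy",
;; labeled as in the quoted list.
(define readings
  (list (cons 'a #rx"[xy]")
        (cons 'b #rx"[xy]+")
        (cons 'c #rx"xy")
        (cons 'd #rx"(?:xy)+")))

;; Hypothetical helper: split s under every reading and pair each
;; result with its label.
(define (split-all s)
  (for/list ([r (in-list readings)])
    (cons (car r) (regexp-split (cdr r) s))))

(split-all "axyxybyc")
;; => '((a "a" "" "" "" "b" "c")
;;      (b "a" "b" "c")
;;      (c "a" "" "byc")
;;      (d "a" "byc"))
```

Readings a and c produce empty strings wherever separators are adjacent,
while b and d collapse them; a and b treat "x" and "y" as alternatives,
while c and d require the exact sequence "xy".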

Posted on the dev mailing list.