[racket-dev] `regexp-explode' etc + poll
I've implemented a new `regexp-explode' function. It accepts the same
arguments as `regexp-match*' and `regexp-split', but with two
additional keyword arguments:
* #:select-match
If this is #t (the default) then the result includes the lists of
results from the sub-matches. It can also be #f to not include
them, and it can be a "selector function" that chooses a specific
one (eg, `car' etc) or returns a different list of matches (eg,
`cdr').
* #:select-gap
This is just a boolean flag -- if it's #t (the default), the
strings between the matches are returned as well -- interleaved
with the (lists of) matches; otherwise they're omitted.
So by default, you get the information that `regexp-split' returns,
interleaved with the full results of matching. Examples:
  -> (regexp-explode #rx"[^0-9]([^0-9])?" "0+1.*2")
  '("0" ("+" #f) "1" (".*" "*") "2")
  -> (regexp-explode #rx"[^0-9]([^0-9])?" "0+1.*2"
                     #:select-match car #:select-gap #f)
  '("+" ".*")
  -> (regexp-explode #rx"[^0-9]([^0-9])?" "0+1.*2"
                     #:select-match cadr)
  '("0" #f "1" "*" "2")
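For readers who want to try the behavior described above, here is a
rough sketch, not the actual implementation. The name
`regexp-explode-sketch' is made up for illustration; it handles only
string inputs, and it assumes the pattern never matches the empty
string (it simply stops at the first empty match).

```racket
#lang racket/base

;; Hypothetical sketch of the `regexp-explode' behavior described above.
;; Assumptions: `str' is a string, and `rx' never matches the empty string.
(define (regexp-explode-sketch rx str
                               #:select-match [select-match #t]
                               #:select-gap [select-gap #t])
  ;; #t means "keep the whole list of sub-matches"; #f means "drop matches";
  ;; anything else is used as a selector function on that list.
  (define select (if (eq? select-match #t) values select-match))
  (let loop ([start 0] [acc '()])
    (define m (regexp-match-positions rx str start))
    (cond
      [(or (not m) (= (caar m) (cdar m)))
       ;; No further non-empty match: add the trailing gap and finish.
       (reverse (if select-gap (cons (substring str start) acc) acc))]
      [else
       ;; Convert the position pairs into strings (#f for unmatched groups).
       (define strs
         (for/list ([p (in-list m)])
           (and p (substring str (car p) (cdr p)))))
       ;; The gap is the text between the previous match and this one.
       (define acc+gap
         (if select-gap (cons (substring str start (caar m)) acc) acc))
       (loop (cdar m)
             (if select (cons (select strs) acc+gap) acc+gap))])))

(regexp-explode-sketch #rx"[^0-9]([^0-9])?" "0+1.*2")
;; => '("0" ("+" #f) "1" (".*" "*") "2")
(regexp-explode-sketch #rx"[^0-9]([^0-9])?" "0+1.*2"
                       #:select-match car #:select-gap #f)
;; => '("+" ".*")
```

With `#:select-match #f' and the default `#:select-gap', the sketch
returns only the gaps, which is the `regexp-split' result for this
input.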
*** Minor poll: I'm not too happy with that `select-gap' name. Any
suggestions for a better name?
But the obvious next function to implement,
`regexp-explode-positions', complicates things a little. The thing is
that there's no point in having it have the same interface -- the gaps
are useless there since they're easily inferred from the matches (as
seen by the lack of a `regexp-split-positions' function). So, a
possible alternative that I thought about is to add a `#:select-match'
keyword to `regexp-match-positions*' instead, so it can return the
list of position matches in a similar way. However, that would lead
to another problem: it would be bad to have a keyword argument only
for `regexp-match-positions*' which is not accepted by
`regexp-match*'. So a solution to that is to add it to
`regexp-match*' too, but then there's little point in
`regexp-explode'...
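To illustrate the claim that gaps are easily inferred from match
positions, here is a small sketch. `gap-positions' is a hypothetical
helper, not part of any library; it assumes the match ranges are sorted
and non-overlapping, as `regexp-match-positions*' returns them.

```racket
#lang racket/base

;; Hypothetical helper: derive the gap ranges from a list of match ranges
;; and the total input length.
(define (gap-positions match-ranges len)
  (let loop ([ranges match-ranges] [start 0] [acc '()])
    (if (null? ranges)
        ;; The last gap runs from the end of the final match to the end.
        (reverse (cons (cons start len) acc))
        (loop (cdr ranges)
              (cdar ranges)
              (cons (cons start (caar ranges)) acc)))))

(define str "0+1.*2")
(define matches (regexp-match-positions* #rx"[^0-9]([^0-9])?" str))
;; matches => '((1 . 2) (3 . 5))
(gap-positions matches (string-length str))
;; => '((0 . 1) (2 . 3) (5 . 6))
```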
So the options that I see are:
1. Drop the new `regexp-explode' name, and instead have this
functionality folded into `regexp-match*', which will get the two
new keywords with a default of #f for `#:select-gap', and `car' for
`#:select-match'. Similarly, add `#:select-match' to
`regexp-match-positions*', but not `#:select-gap'.
1a. Minor variation: insist on uniformity, and include a
`#:select-gap' keyword for `regexp-match-positions*' too.
2. Same as #1, but also have `regexp-explode', which is now the same
as `regexp-match*' but with different defaults for the two
keywords.
2a. Same variation as #1a.
3. Do not extend the interface of existing functions -- have only the
new `regexp-explode' have the added functionality. For the
positions version, add a `regexp-explode-positions', without a
`#:select-gap' keyword. The possible advantage here is that the
(already complicated) output type of `regexp-match*' stays the
same, and `regexp-explode' gets the much more complicated one.
3a. Same as #3, but with `#:select-gap' for
`regexp-explode-positions'.
I'm now leaning towards #1. Any votes for other options, or maybe
something different?
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!