[racket-dev] url->string: what do we do?

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Fri Mar 30 11:38:48 EDT 2012

On Fri, Mar 30, 2012 at 10:34 AM, Eli Barzilay <eli at barzilay.org> wrote:
> Just now, Robby Findler wrote:
>> On Fri, Mar 30, 2012 at 10:30 AM, Eli Barzilay <eli at barzilay.org> wrote:
>> > I'm fine with that (and with the push that does it), as long as
>> > it's clear that this would change if it grows to be more than just
>> > a regexp match.  (Hence my question whether the second regexp is
>> > needed -- if it's just the first one, then it looks like it
>> > already allows other schemas, to be parsed further in the code.)
>>
>> I'm not quite following this paragraph, but I think we're in
>> agreement. I've already pushed the change and my inference is that
>> you'd be happy with it (I put some timing numbers in the commit).
>
> The question is whether there is any damage to what it can do now if
> you change the contract to a more restrictive but simpler one:
>
>   (string->url (-> #rx"^[a-zA-Z][a-zA-Z0-9+.-]*:" url?))
>
> ?

Well, that regexp would add more errors than were there before. Even doing this:

  #rx"(?:^[a-zA-Z][a-zA-Z0-9+.-]*:)?"

(i.e., "if there is a colon, then insist the scheme match that regexp")
is adding more errors, since the current code will sometimes take all
that stuff and stick it into some later part of the url.
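
To make the difference concrete, here is a minimal sketch of the two checks
as plain predicates. The helper names (strict-rx, accepts-strict?,
accepts-conditional?) are hypothetical and only for illustration; only the
regexp itself comes from this thread. Since an optional group can also match
the empty string, the sketch encodes the parenthetical's intent ("if there
is a colon, then insist...") as a predicate rather than as a single regexp.
It says nothing about what string->url itself does with these inputs.

```racket
#lang racket/base

;; The scheme regexp proposed for the contract (from the thread).
(define strict-rx #rx"^[a-zA-Z][a-zA-Z0-9+.-]*:")

;; Hypothetical helper: the simpler, more restrictive check.
;; Every input must begin with a scheme followed by a colon.
(define (accepts-strict? s)
  (regexp-match? strict-rx s))

;; Hypothetical helper: the conditional check.
;; Inputs without any colon pass; inputs with a colon must begin with a scheme.
(define (accepts-conditional? s)
  (or (not (regexp-match? #rx":" s))
      (regexp-match? strict-rx s)))

(accepts-strict? "http://racket-lang.org/")  ; #t
(accepts-strict? "index.html")               ; #f, no scheme at all
(accepts-conditional? "index.html")          ; #t, no colon, so nothing to check
(accepts-conditional? "a/b:c")               ; #f, colon present but no leading scheme
```

Strings like "index.html" are where the first check would add errors, and
strings like "a/b:c" are where even the conditional one would.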

I don't object to making such changes, but I was really focused on
just moving the errors into the contract and not on fixing
string->url, so I haven't thought through the ramifications or tried
any experiments along those lines.

Robby


Posted on the dev mailing list.