[racket] reader.html: documentation for complex number reading

From: Tim Brown (tim at timb.net)
Date: Fri Mar 29 09:23:17 EDT 2013

On 29/03/2013 03:03, Matthew Flatt wrote:
> Let's fix the docs. My current attempt is

The grammar in the docs may not actually be wrong. I re-read it in
conjunction with the preamble on "Reading Numbers". Strictly speaking it
is correct and covers all cases, but it is (IMHO) quite hard to read. I
wonder whether that original description is there to cover for things
that are impractical to express in the BNF.

From
http://docs.racket-lang.org/reference/reader.html?q=number&q=reader#%28part._parse-number%29:
> As the non-terminal names suggest, a number that has no exactness
> specifier and matches only ‹inexact-numbern› is normally parsed as
> an inexact number, otherwise it is parsed as an exact number. If the
> read-decimal-as-inexact parameter is set to #f, then all numbers
> without an exactness specifier are instead parsed as exact.

So I don't know whether your grammar changes should take the paragraph
above into account, or whether a more comprehensive change would be in
order.

>   ‹exact-rationaln›    ::=  [‹sign›] ‹unsigned-rationaln›
>   ‹unsigned-rationaln› ::= ‹unsigned-integern›
>                          | ‹unsigned-integern› / ‹unsigned-integern›
>   ‹exact-integern›     ::= [‹sign›] ‹unsigned-integern›
>   ‹unsigned-integern›  ::= ‹digitn›+
>
> Does that look right?

Looks right to me.
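
As a rough illustration, the four productions quoted above can be
transcribed almost mechanically into a regexp. The sketch below is
Python and covers radix 10 only; it assumes that ‹sign› is "+" or "-"
and that the radix-10 digits are 0-9. The names (EXACT_RATIONAL,
is_exact_rational and so on) are made up for this example and are not
taken from the Racket sources. It is a purely syntactic check and says
nothing about what the reader does with the matched text.

```python
import re

# Radix-10 transcription of the productions quoted above.
# Assumptions: <sign> is "+" or "-", and the radix-10 digits are 0-9.
SIGN = r"[+-]"
UNSIGNED_INTEGER = r"[0-9]+"  # <digit>+

# <unsigned-integer>, optionally followed by "/" <unsigned-integer>
UNSIGNED_RATIONAL = rf"{UNSIGNED_INTEGER}(?:/{UNSIGNED_INTEGER})?"

EXACT_RATIONAL = re.compile(rf"{SIGN}?{UNSIGNED_RATIONAL}")
EXACT_INTEGER = re.compile(rf"{SIGN}?{UNSIGNED_INTEGER}")


def is_exact_rational(text: str) -> bool:
    """True if the whole of `text` is derivable from <exact-rational>."""
    return EXACT_RATIONAL.fullmatch(text) is not None


def is_exact_integer(text: str) -> bool:
    """True if the whole of `text` is derivable from <exact-integer>."""
    return EXACT_INTEGER.fullmatch(text) is not None
```

With these definitions, is_exact_rational accepts "1/2", "-17" and
"+3/4" and rejects "1/", "/2" and "1.5", while is_exact_integer rejects
"1/2".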

> FWIW, there are number-parsing regexp constructions in
>   collects/r6rs/private/readtable.rkt (see `rx:number')
> and
>   collects/syntax-color/racket-lexer.rkt (see `make-num')
> The first, as the path suggests, is R6RS instead of Racket.

My regexps were also failing because of greediness issues in PHP's PCRE
matching. Transcribed literally, the BNF above doesn't try the longest
matchable strings first, so I've had to tweak it for my purposes. I now
have something that matches Racket numbers. Maybe not all of them, but
at least the examples in the reader chapter :-)
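
The ordering problem can be reproduced in a few lines. This is a sketch
in Python rather than PHP, on the assumption that the behaviour in
question is the usual backtracking rule of trying alternatives left to
right and keeping the first one that succeeds, which Python's re module
shares with PCRE. The pattern names and the matched_prefix helper are
invented for the example.

```python
import re

# Alternatives in the order a literal transcription of
#   <unsigned-rational> ::= <unsigned-integer>
#                         | <unsigned-integer> / <unsigned-integer>
# would give them: the shorter alternative comes first.
SHORT_FIRST = re.compile(r"[0-9]+|[0-9]+/[0-9]+")

# The same two alternatives with the longer one tried first.
LONG_FIRST = re.compile(r"[0-9]+/[0-9]+|[0-9]+")


def matched_prefix(pattern, text):
    """Return the text matched at the start of `text`, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None
```

Here matched_prefix(SHORT_FIRST, "1/2") returns "1", because the first
alternative already succeeds and the engine never tries the second,
whereas matched_prefix(LONG_FIRST, "1/2") returns "1/2". Both patterns
still match a plain integer such as "12" in full.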

> The latter is known to mishandle cases like "#x1E+2" (because "E" is
> not an exponent marker for hexadecimal), but I'll push a fix for that.
> And I'll fix it for extflonums, too.

The reader's definition (above) explicitly states that E is an exponent
marker only for bases 2, 8 and 10 -- would you not be changing the
meaning of #x1E+2 from 1 E 2 to... er... whatever 16^2 is? Or does the
lack of delimiters in #x1E+2 mean that it's just an erroneous string of
characters?

Tim

Posted on the users mailing list.