[racket] reader.html: documentation for complex number reading
On 29/03/2013 03:03, Matthew Flatt wrote:
> Let's fix the docs. My current attempt is
The grammar in the docs may not be wrong. I re-read it in conjunction
with the preamble on "Reading Numbers". Strictly speaking, it is
correct and covers all cases, but it is (IMHO) quite hard to read. I
wonder whether that original description is there to cover for things
that are impractical to express in the BNF.
From
http://docs.racket-lang.org/reference/reader.html?q=number&q=reader#%28part._parse-number%29
:
> As the non-terminal names suggest, a number that has no exactness
> specifier and matches only ‹inexact-number_n› is normally parsed as
> an inexact number, otherwise it is parsed as an exact number. If the
> read-decimal-as-inexact parameter is set to #f, then all numbers
> without an exactness specifier are instead parsed as exact.
So I don't know whether your grammar changes should take the paragraph
above into account, or whether a more comprehensive change would be in
order.
> ‹exact-rational_n› ::= [‹sign›] ‹unsigned-rational_n›
> ‹unsigned-rational_n› ::= ‹unsigned-integer_n›
>                        | ‹unsigned-integer_n› / ‹unsigned-integer_n›
> ‹exact-integer_n› ::= [‹sign›] ‹unsigned-integer_n›
> ‹unsigned-integer_n› ::= ‹digit_n›+
>
> Does that look right?
Looks right to me.
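
For anyone wanting to check the quoted productions mechanically, here is a minimal sketch that transcribes them into a regular expression. It is written in Python rather than Racket or PHP, and the names `DIGITS`, `exact_rational_regex` and `is_exact_rational` are made up for illustration. It covers only the exact-rational and exact-integer productions quoted above, not the rest of the number grammar.

```python
import re

# Hypothetical table: the characters that count as digits in each radix n.
DIGITS = {2: "01", 8: "0-7", 10: "0-9", 16: "0-9a-fA-F"}

def exact_rational_regex(radix):
    """Build a regex for [sign] unsigned-rational in the given radix."""
    digit = f"[{DIGITS[radix]}]"
    unsigned_integer = f"{digit}+"
    # unsigned-rational is an integer, optionally followed by "/" and
    # another integer; the optional group covers both BNF alternatives.
    unsigned_rational = f"{unsigned_integer}(?:/{unsigned_integer})?"
    return re.compile(f"[+-]?{unsigned_rational}")

def is_exact_rational(text, radix=10):
    """True if the whole string matches the transcribed production."""
    return exact_rational_regex(radix).fullmatch(text) is not None
```

With this sketch, `is_exact_rational("1/2")` and `is_exact_rational("-17")` hold, while `is_exact_rational("1/")` does not. It says nothing about what the reader does after a string matches.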
> FWIW, there are number-parsing regexp constructions in
> collects/r6rs/private/readtable.rkt (see `rx:number')
> and
> collects/syntax-color/racket-lexer.rkt (see `make-num')
> The first, as the path suggests, is R6RS instead of Racket.
My regexps were also failing because of greediness issues in PHP's PCRE
matching. Transcribed as written, the BNF above doesn't try the longest
matchable strings first, so I've had to tweak it for my purposes. I now
have something that matches Racket numbers: maybe not all of them, but
at least the examples in the reader chapter :-)
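
The ordering problem can be shown with a small sketch. It uses Python's `re` module as a stand-in for PCRE, on the assumption that both are backtracking engines that take the first alternative that succeeds rather than the longest one. The pattern names and `leading_token` are hypothetical.

```python
import re

# Alternatives in the same order as the BNF: plain integer first.
SHORT_FIRST = re.compile(r"\d+|\d+/\d+")
# Same alternatives, reordered so the longer form is tried first.
LONG_FIRST = re.compile(r"\d+/\d+|\d+")

def leading_token(pattern, text):
    """Return the text matched at the start of the input, or None."""
    m = pattern.match(text)
    return m.group() if m else None

print(leading_token(SHORT_FIRST, "22/7"))  # prints 22
print(leading_token(LONG_FIRST, "22/7"))   # prints 22/7
```

Anchoring the pattern to the whole string hides the difference, because the engine then backtracks into the second alternative. It shows up when scanning for a number inside a longer input.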
> The latter is known to mishandle cases like "#x1E+2" (because "E" is
> not an exponent marker for hexadecimal), but I'll push a fix for that.
> And I'll fix it for extflonums, too.
The reader's definition (above) explicitly states that E is only an
exponent marker for bases 2, 8 and 10 -- would you not be changing the
meaning of #x1E+2 from 1 E 2 to... er... whatever 16^2 is? Or does the
lack of delimiters in #x1E+2 mean that it's just an erroneous string of
characters?
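
To make the two candidate readings concrete, here is a rough sketch in Python. It does not reproduce Racket's reader. `EXPONENT_RADIXES` and `split_exponent` are hypothetical names, and the rule that "e" marks an exponent only in bases 2, 8 and 10 is taken from the statement above rather than checked against the implementation.

```python
# Assumption from the message above: "e" is an exponent marker only here.
EXPONENT_RADIXES = {2, 8, 10}

def split_exponent(body, radix):
    """Return (mantissa, exponent) if body contains an e/E exponent
    marker in this radix, otherwise None."""
    if radix not in EXPONENT_RADIXES:
        return None  # in radix 16, "e" is just the digit fourteen
    head, marker, tail = body.lower().partition("e")
    if not marker or not head or not tail:
        return None
    return head, tail

print(split_exponent("1E+2", 10))  # prints ('1', '+2')
print(split_exponent("1E+2", 16))  # prints None
print(int("1E", 16))               # prints 30
print(16 ** 2)                     # prints 256
```

In radix 16 the sketch treats `1E` as the two hex digits for 30 and leaves `+2` unexplained, which is the open question here.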
Tim