[racket] Question about parser-tools/lex
On Thu, Oct 18, 2012 at 01:56:20PM -0600, Danny Yoo wrote:
> > ;; Test 3
> > (check-exn exn:fail? (lambda () (collect-tokens "4a")))
> > ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> >
> > I thought that, given the way the NUM and ID tokens are defined (resp. only digits, only letters), the third test should pass...it does not.
>
>
> Ok, good. So we want the lexer to raise an error on the input in
> test 3, because there should be some kind of delimiter between the
> number token and the rest. One direct way to do this is to peek into
> the port and check that a delimiter immediately follows the tokenized
> content. We can amend your tokenizer to:
>
>
> ;;;;;;;;;;;;;;;;;;;;
> ;; Add syntax/readerr to the list of requires:
> (require syntax/readerr)
> ;; ...
>
>
> (define sample-lexer
>   (lexer
>    [(eof) 'EOF]
>    [whitespace (sample-lexer input-port)]
>    [(:+ alphabetic)
>     (begin
>       (assert-delimiter-follows! lexeme input-port)
>       (token-ID (string->symbol lexeme)))]
>    [(:+ numeric)
>     (begin
>       (assert-delimiter-follows! lexeme input-port)
>       (token-NUM (string->number lexeme)))]))
>
>
> ;; Check that whitespace or eof comes next in the input port.
> (define (assert-delimiter-follows! lexeme ip)
>   (define next-char (peek-char ip))
>   (unless (or (eof-object? next-char)
>               (char-whitespace? next-char))
>     (define-values (line column position) (port-next-location ip))
>     (raise-read-error
>      (format "expected delimiter after ~e, but I see ~e"
>              lexeme (string next-char))
>      (object-name ip)
>      line column position 1)))
> ;;;;;;;;;;;;;;;;;;;;
>
>
> There may be a more direct way to express this within the
> parser-tools/lex library. But since we have general power in each of
> the lexer actions, we can do this too.
>
> Hope this helps!
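
On the "more direct way" mentioned above: one possibility, offered only
as a sketch, is to lean on the longest-match rule of parser-tools/lex.
A catch-all rule for mixed runs of letters and digits, placed last,
matches more input than the ID or NUM rule whenever the two kinds of
character touch, so it wins and can raise the error. When the input is
all letters or all digits, the lengths tie and the earlier rule is
chosen. The token definitions, the name strict-lexer, and the
collect-tokens driver below are assumptions made for illustration; they
are not taken from the original program.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre))

;; Assumed token definitions; the original message does not show them.
(define-tokens value-tokens (ID NUM))

;; Hypothetical variant of sample-lexer that needs no peeking.
(define strict-lexer
  (lexer
   [(eof) 'EOF]
   [whitespace (strict-lexer input-port)]
   [(:+ alphabetic) (token-ID (string->symbol lexeme))]
   [(:+ numeric) (token-NUM (string->number lexeme))]
   ;; Mixed letters and digits: strictly longer than either match above,
   ;; so the longest-match rule selects this clause and we signal an error.
   [(:+ (:or alphabetic numeric))
    (error 'strict-lexer "malformed token: ~e" lexeme)]))

;; Hypothetical driver: lex a whole string into a list of tokens.
(define (collect-tokens str)
  (define ip (open-input-string str))
  (let loop ()
    (define tok (strict-lexer ip))
    (if (eq? tok 'EOF)
        '()
        (cons tok (loop)))))
```

With this sketch, (collect-tokens "abc 42") should produce an ID token
followed by a NUM token, while (collect-tokens "4a") should raise an
exception. Note one difference from the peeking version: this rule only
rejects letters and digits that run together, and it reports the problem
with a plain error rather than a read error carrying source locations.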
Thanks a lot! This works as expected.
I feel ashamed that I did not even think of doing it this way.
I was obstinately trying to express this using only the lexer syntax. Stupid me.
Thanks again for your time and your quick answers.
Regards,
Philippe