[racket] lexer priority
I am trying to write a lexer and parser, but I cannot figure out how to
set the priority of the lexer's tokens. My simplified lexer (shown below)
has only two tokens, BLOCK and COMMENT. BLOCK is in fact a subset of
COMMENT. BLOCK appears first in the lexer, but when I parse something that
matches BLOCK, it always comes out as COMMENT instead. My program is below.
In this particular example I expect to get a BLOCK token, but I get a
COMMENT token instead. If I comment out
(line-comment (token-COMMENT lexeme)) in the lexer, I do get the BLOCK
token.
Can anyone tell me how to work around this issue? The only relevant
statement I can find in the documentation is this:
"When multiple patterns match, a lexer will choose the longest match,
breaking ties in favor of the rule appearing first."
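To show how I read that rule, here is a minimal sketch that is separate from my program. The lexer name demo-lexer and the labels SHORT and LONG are made up for this illustration only; the sketch assumes nothing beyond parser-tools/lex and parser-tools/lex-sre.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in re- parser-tools/lex-sre))

;; Two overlapping rules (hypothetical names, for illustration only):
;; the first matches exactly "ab", the second matches "ab" followed by
;; any number of "c" characters.
(define demo-lexer
  (lexer
   ["ab" (list 'SHORT lexeme)]
   [(re-: "ab" (re-* "c")) (list 'LONG lexeme)]
   [(eof) 'EOF]))

;; Both rules match the same two characters, so the tie goes to the
;; rule that appears first.
(demo-lexer (open-input-string "ab"))   ; => '(SHORT "ab")

;; The second rule can consume more input, so the longest match wins
;; even though that rule appears later.
(demo-lexer (open-input-string "abcc")) ; => '(LONG "abcc")
```

As far as I can tell, rule order only matters when both matches have the same length.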
#lang racket
(require parser-tools/lex
         (prefix-in re- parser-tools/lex-sre)
         parser-tools/yacc)
(define-tokens a (BLOCK COMMENT))
(define-empty-tokens b (EOF))
(define-lex-trans number
  (syntax-rules ()
    ((_ digit)
     (re-: (uinteger digit)
           (re-? (re-: "." (re-? (uinteger digit))))))))
(define-lex-trans uinteger
  (syntax-rules ()
    ((_ digit) (re-+ digit))))
(define-lex-abbrevs
  (block-comment (re-: "; BB#" number10 ":"))
  (line-comment (re-: ";" (re-* (char-complement #\newline)) #\newline))
  (digit10 (char-range "0" "9"))
  (number10 (number digit10)))
(define my-lexer
  (lexer-src-pos
   (block-comment (token-BLOCK lexeme))
   (line-comment (token-COMMENT lexeme))
   ;; skip whitespace by lexing the next token
   (whitespace (position-token-token (my-lexer input-port)))
   ((eof) (token-EOF))))
(define my-parser
  (parser
   (start code)
   (end EOF)
   (error
    (lambda (tok-ok? tok-name tok-value start-pos end-pos)
      (raise-syntax-error 'parser
                          (format "syntax error at '~a' in src l:~a c:~a"
                                  tok-name
                                  (position-line start-pos)
                                  (position-col start-pos)))))
   (tokens a b)
   (src-pos)
   (grammar
    (unit ((BLOCK) $1)
          ((COMMENT) $1))
    (code ((unit) (list $1))
          ((unit code) (cons $1 $2))))))
;; Wrap the lexer in a thunk for the parser, printing each token as it
;; is produced.
(define (lex-this lexer input)
  (lambda ()
    (let ([token (lexer input)])
      (pretty-display token)
      token)))
(define (ast-from-string s)
  (let ((input (open-input-string s)))
    (ast input)))
(define (ast input)
  (my-parser (lex-this my-lexer input)))
(ast-from-string "
; BB#0:
")