[racket] How write regexp for lexer

From: Бомбин Валентин (wwall at yandex.ru)
Date: Wed Sep 25 07:50:27 EDT 2013

Thanks for the answer, but it doesn't help. A simple example follows:

#lang racket/base

(require racket/string
         parser-tools/lex
         (prefix-in : parser-tools/lex-sre))

(define-tokens my-tokens (BCOMMENT COMMENT OT))

(define-empty-tokens my-empty-tokens (EOF NEWLINE))

(define-lex-abbrevs
  [%whitespace (:or #\tab #\space #\vtab)]
  [%newline (:or #\newline (:seq #\return #\newline))]
  [%other-token  "other-token"]
  [%bComment (:seq %newline "//" (:* (:& (char-complement #\return)
                                         (char-complement #\newline))))]
  [%comment (:seq "//" (:* (:& (char-complement #\return)
                               (char-complement #\newline))))])

(define my-lexer
  (lexer-src-pos
   [(:+ %whitespace)
    (return-without-pos (my-lexer input-port))]
   [%newline
    (token-NEWLINE)]
   [%bComment
    (token-BCOMMENT lexeme)]       
   [%comment
    (token-COMMENT lexeme)]
   [%other-token (token-OT lexeme)]
   [(eof)
    'EOF]))

(define p (open-input-string "//must be full line comment? but don't recognize
other-token // end to line comment (it's ok)
//full line comment (it's right)
  // must be full comment, but %whitespace eat space :("))
(port-count-lines! p)

(let loop ([result null])
  (define tok (my-lexer p))
  (if ((position-token-token tok) . eq? . 'EOF)
      (reverse (cons tok result))
      (loop (cons tok result))))


Output:


(list
 (position-token (token 'COMMENT "//must be full line comment? but don't recognize") (position 1 1 0) (position 49 1 48))
 (position-token 'NEWLINE (position 49 1 48) (position 50 2 0))
 (position-token (token 'OT "other-token") (position 50 2 0) (position 61 2 11))
 (position-token (token 'COMMENT "// end to line comment (it's ok)") (position 62 2 12) (position 94 2 44))
 (position-token (token 'BCOMMENT "\n//full line comment (it's right)") (position 94 2 44) (position 127 3 32))
 (position-token 'NEWLINE (position 127 3 32) (position 128 4 0))
 (position-token (token 'COMMENT "// must be full comment, but %whitespace eat space :(") (position 130 4 2) (position 183 4 55))
 (position-token 'EOF (position 183 4 55) (position 183 4 55)))
> 
I want to distinguish two kinds of comment. The first is a comment that begins at the start of a line (the first character of the line is '/', or whitespace followed by '/'). The second is a comment that starts with '/' after some other character (other than whitespace) has already been read on that line.
The JavaScript regexp for the first kind is "^//.*\n", but I don't know how to write '^' in the lexer in DrRacket.
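
One possible workaround, as a rough sketch only: the regexp operators used above do not seem to include a line-start anchor, so the "start of line" decision could be moved out of the regexp and into a small wrapper that remembers whether anything other than whitespace has been seen since the last NEWLINE. The names raw-lexer, make-line-aware-lexer and tokenize are hypothetical helpers made up for this illustration; token-name and token-value come from parser-tools/lex. Positions are left out to keep it short.

```racket
#lang racket/base

(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre))

(define-tokens sketch-tokens (BCOMMENT COMMENT OT))
(define-empty-tokens sketch-empty-tokens (EOF NEWLINE))

(define-lex-abbrevs
  [%ws (:or #\tab #\space #\vtab)]
  [%nl (:or #\newline (:seq #\return #\newline))]
  [%line-comment (:seq "//" (:* (:& (char-complement #\return)
                                    (char-complement #\newline))))])

;; Plain lexer: every "//" comment comes out as COMMENT,
;; whitespace is skipped, newlines are kept as tokens.
(define raw-lexer
  (lexer
   [(:+ %ws) (raw-lexer input-port)]
   [%nl (token-NEWLINE)]
   [%line-comment (token-COMMENT lexeme)]
   ["other-token" (token-OT lexeme)]
   [(eof) (token-EOF)]))

;; Hypothetical wrapper: returns a thunk that yields the next token,
;; turning COMMENT into BCOMMENT when only whitespace preceded it
;; on the current line.
(define (make-line-aware-lexer in)
  (define at-line-start? #t)
  (lambda ()
    (define tok (raw-lexer in))
    (define name (token-name tok))
    (cond
      [(eq? name 'NEWLINE)
       (set! at-line-start? #t)
       tok]
      [(and (eq? name 'COMMENT) at-line-start?)
       (set! at-line-start? #f)
       (token-BCOMMENT (token-value tok))]
      [else
       (set! at-line-start? #f)
       tok])))

;; Collect the token names of a string, stopping before EOF.
(define (tokenize str)
  (define next (make-line-aware-lexer (open-input-string str)))
  (let loop ([acc '()])
    (define tok (next))
    (if (eq? (token-name tok) 'EOF)
        (reverse acc)
        (loop (cons (token-name tok) acc)))))

(tokenize "//a\nother-token // b\n  // c")
```

With this input the first and last comments should come back as BCOMMENT and the one after other-token as COMMENT. Because the whitespace rule never produces a token, leading spaces do not clear the flag, which is the case the regexp-only version above gets wrong.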

25.09.2013, 15:10, "Evgeny Odegov" <oev-racket at sibmail.com>:
> Валентин,
> maybe this short example could help
>
> http://pastebin.com/ncZpH49E
>
>>  Hello. I need to write a lexer rule that recognizes a comment beginning in
>>  the first column of a line.
>>  I have this regexp for a comment: (:seq "//" (:* (:~ CR LF))). After adding
>>  CR/LF to the beginning of the expression I get (:seq  LineTerminator (:* (:or
>>  #\space TAB FF)) "//" (:* (:~ CR LF))), but this regexp has two problems: first,
>>  it consumes the LineTerminator; second, it doesn't recognize a comment that
>>  comes first in the file.
>>  Please help me write this rule.
>>  ____________________
>>    Racket Users list:
>>    http://lists.racket-lang.org/users

Posted on the users mailing list.