[racket] Beginner's question on elementary textual replacement...

From: Rüdiger Asche (rac at ruediger-asche.de)
Date: Fri Mar 9 12:48:25 EST 2012

ok, I did study your code fragments (for which I am very grateful, thanks!)

Actually, control flow in my opinion is just a minor piece of the puzzle - I 
like your idea, but in terms of readability, it doesn't really make all the 
difference in the world whether one represents each possible state 
transition as a separate function (as you do) or implements one state 
transition function that takes a state as a parameter and then dispatches on 
the state (as I sketched out in my original code fragment). In fact, in the 
"real" fsm my single state transition function takes more than one 
parameter - one is the current state, and another one the next character to 
process, so the tail recursive call with the next state and the next 
character is very similar to your call of the next state transition 
function, only expressed inside out.

Thus in terms of control flow, the two approaches are really not that far 
apart, and as you point out, it's in part a matter of taste which "final" 
form one prefers - I guess using your outline of define-fsm one could 
probably also find an expansion that translates into my large transition 
function.

Anways. Interesting for me is this here:

(define-syntax (define-fsm stx)
  (syntax/parse stx
    [(_ Alphabet States Initial Final Transitions) …]))   <===

where your instantiation looks like this:

(define-fsm ab*c
  ;; alphabet, with mapping to external events
  ([A 1][B 2][C 3])                       <===
  ;; states:
  (start seenA seen1B seenC)
  (start)
  (seenC)
  (on A: start -> seenA)            <===
  (on B: seenA -> seenB)
  (on C: seenB -> seenC))
...

which expands into

...
    (if (= letter 1) seenA (error "fsm1")))     <===
...

so somewhere along the lines there *is* some translation that successively 
replaces the 'A' in (on A...) or its previous translation  with 1, even 
though there probably are one or more two intermediate steps in it. Of 
course, here the constants used are not state identifiers but elements of 
the alphabet but the same arguments about their representation holds... in 
my case the letters to be processed are bytes read in over a communications 
stream, and they have a symbolic meaning, so instead of

;; alphabet, with mapping to external events
  ([A 1][B 2][C 3])

I would write

;; alphabet, with mapping to external events
  ([SYN 1][SOH 2][EOH 3])

and be able to use

(if (= letter SYN) seenA (error "fsm1")))

in my manually writte state transition function to make the protocol 
explicit.

THAT's the piece I'm looking for; what makes the 'A' eventually go away and 
let each instance of it be replaced with 1? I don't care a whole lot about 
representing the entire fsm in one syntactically neat chunk; the define form 
you sketch out as the final expansion suits me perfectly well - reason being 
that the fsms I need to implement have a considerable number of side effects 
to be performed as the state transition functions are invoked, and there is 
more to state transitions such as timeout policies, so I'd be perfectly 
happy with an intermediate form of the fsm. Only thing is that I need to be 
able to work with symbolic entities instead of constants, and then we're 
back to the initial question...

I think I should really stop thanking all of you for the amount of thought 
and enthusiasm you put into these debates, so I'll just leave it at that 
this is never taken for granted and always appreciated.

----- Original Message ----- 
From: "Matthias Felleisen" <matthias at ccs.neu.edu>
To: "Rüdiger Asche" <rac at ruediger-asche.de>
Cc: "Stephen Bloch" <bloch at adelphi.edu>; "users Users" 
<users at racket-lang.org>
Sent: Friday, March 09, 2012 5:28 PM
Subject: Re: [racket] eginner's question on elementary textual 
replacement...



On Mar 9, 2012, at 11:00 AM, Rüdiger Asche wrote:

> see my responses to Matthias for that)


I didn't see a response.


> ... so I wonder what a more natural or Schemish way there wsould be to 
> tackle such an issue…


Here is what I had in mind:

#lang racket

#| you write things like this:
(define-fsm ab*c
  ;; alphabet, with mapping to external events
  ([A 1][B 2][C 3])
  ;; states:
  (start seenA seen1B seenC)
  (start)
  (seenC)
  (on A: start -> seenA)
  (on B: seenA -> seenB)
  (on C: seenB -> seenC))

;; the macro expands to something like this:
|#

(define (driver event-stream)
  (define (start letter)
    (if (= letter 1) seenA (error "fsm1")))

  (define (seenA letter)
    (if (= letter 2) seenB (error 'fsm2 "~e" letter)))

  (define (seenB letter)
    (cond
      [(= letter 2) seenB]
      [(= letter 3) seenC]
      [else (error 'fsm3 "~e" letter)]))

  (define (seenC letter)
    (error "fsm: final state"))

  (let loop ([es event-stream][current-state start])
    (if (eq? current-state seenC)
"accepted"
(loop (remaining es) (current-state (next es))))))

;; one way to implement functional event stream:
(define remaining rest)
(define next first)

;; -----------------------------------------------------------------------------

(driver '(1 2 2 2 2 3))

;; -----------------------------------------------------------------------------

NOW: measure. I chose to represent functions as states for the heck of it.
You could imagine a big fat loop instead. You could imagine a big fat vector
of vectors instead to represent the transition matrix. The important thing
is you have
  (1) a text-book looking notation for FSMs
  (2) you measure high-level performance issues not in-lining of numbers,
which is a ternary bit at best.

For all you know, the Racket compiler produces the best possible code with
function pointers (as I have done) not with a transition matrix. Or perhaps
you have many FSMs, some work best with function pointers (for the density
of representation) and others work better with a transition matrix (because
the graph is dense). You can choose at expansion time, which representation
you want

 and yet you always write down text-book FSMs.

-- Matthias




Posted on the users mailing list.