[racket-dev] Racket backend for my Python visualizer
On Aug 25, 2012, at 4:28 PM, Philip Guo wrote:
> [Cc'ed Frank (Lingfeng), who is the student doing the actual implementation. Frank, please chime in with thoughts too …]
Okay, I just pushed a version of Racket (f107c4d2658) that includes an "external interface" for the stepper, in plt/collects/stepper/external-interface.rkt. There's also an example of how you might use it, at plt/collects/stepper/examples/external-interface-exampler.rkt.
The basic idea is that you give it a file path or a string, and a handler procedure that will be called on each breakpoint, and it annotates and runs it, calling your function on each breakpoint.
The information passed to the handler is not terribly well-documented; let me give you a bit of background.
The first argument is a "mark list"; these correspond to continuation frames, which are a lot like stack frames except that they're at expression-level granularity. Each frame contains a source expression, a label, and a list of local bindings. There are functions in plt/collects/stepper/private/marks that can extract the information from these marks. This value can be #f, for an 'expr-finished-break (see descriptions below).
The second argument tells what kind of breakpoint it is.
KINDS OF BREAKPOINTS:
'normal-break : occurs before an expression's evaluation. The "before"
steps.
'result-exp-break : occurs before an expression's evaluation that is the
"after" of some other step. E.G., the rhs of a cond, the body of
a procedure, etc.
'result-value-break : occurs when a value is returned. Also corresponds
to an "after" step.
'normal-break/values : occurs before an expression's evaluation, when
that expression has intermediate values. For instance, an "if"'s
reduction occurs after its test value has been evaluated, so the
"before" step for the "if" must already know about the test value.
'expr-finished-break : occurs when a top-level expression is complete.
'double-break : occurs as part of the evaluation of a let or local,
after the evaluation of all the binding rhses. Called a double-break
because it must generate two steps.
The third argument is either false or a list of values; the result-value-break, normal-break/values, and expr-finished-break have lists of values here.
When you try running this on a program bigger than (+ 3 4), you're going to notice several things:
1) There are going to be a LOT of breakpoints. Thousands and thousands of them, for fairly small programs. I'm guessing that you're going to want to cut down the number of these quite substantially. The first and easiest thing to do would be to filter out the 'result-exp and 'result-value breaks.
2) There are also lots and lots of breakpoints that occur during the evaluation of code that was inserted as part of macro expansion. The most obvious clue here is that these frames have line and column values of #f, but you can also test this directly by applying 'syntax-source' to the source expression referred to by the mark.
3) You're going to be interested in the bindings in the breakpoints, because those are what will allow you to extract the bindings and heap values that you need to build your heap representation.
4) In the presence of mutation, you're going to want to construct that representation before the handler returns, because subsequent program evaluation will mess up that heap. Actually, that kind of goes without saying, doesn't it? :)
5) Some programs are going to break. In particular, the stepper's annotation is well-tested only for programs written in the student languages, and I'm assuming that your visualization tool will operate on top-level programs starting with "#lang racket". Let me know when you find programs that don't work.
Also, in writing the last paragraph, I realize that I may have made an unwarranted assumption when I assumed that you'd be operating on top-level, "#lang" programs. It may well be that you're actually interested in student language programs, which would be good in the sense that the stepper's annotation is more robust for these programs, and a bit of extra work in the sense that I'd have to doctor the annotation to take this into account.
I sincerely hope that you're able to make progress on this: I've seen that python visualizer output, and I would *love* to have something like that running for Racket programs.
All the best,
John Clements
(cc: racket-dev)
>
> See below ...
>
> On Sat, Aug 25, 2012 at 4:10 PM, John Clements <clements at brinckerhoff.org> wrote:
>
> On Aug 25, 2012, at 2:22 PM, Philip Guo wrote:
>
> > Great timing :) We're pretty stuck googling around for the right way to implement what we need. Basically we want something like this, except for Racket:
> > http://docs.python.org/library/bdb.html
> >
> > The Online Python Tutor backend runs a Python script under bdb, which enables it to single-step through execution one line at a time and dump out the complete stack and heap contents at every executed line. The entire execution trace is serialized as a JSON object and sent to the frontend, which renders it using some javascript graphics libraries.
> >
> > What's the easiest way to collect this sort of execution trace from a Racket program?
>
> The easiest way to collect this sort of execution trace is to use the stepper's annotation.
>
> Actually, the function that you want isn't *quite* there yet, but I can probably add it tonight. It will involve trimming down the stepper model's "go" function.
>
> In particular, it would take a racket program, and return a stream of states (where state includes stack, heap, etc.).
>
> Yes, that sounds really good! If you have time to do that soon, that would be amazing.
>
>
> One question about the set of steps you'd like to see: each "step" in the stepper consists of a left and a right-hand side, which are stapled together from two different breakpoints. So, for instance, the step
>
> (+ 3 (+ 4 5)) => (+ 3 9)
>
> is constructed from two breakpoints; one that's triggered before the addition, and one that's triggered after it returns.
>
> So… what's the granularity of your python tool? Can you characterize the stopping points? My naive guess is that you break "at the beginning of each line", which would match the "before" steps of the stepper. Is that right?
>
> Yes, I think that's right. The Python bdb debugger steps a single line at a time, so it doesn't step through subexpression evaluation.
>
> Does your tracer provide line and column number information? The frontend needs that info to know which portions of the code to highlight at every step; the Python tracer provides only line numbers, since it steps at line-granularity.
>
> Basically if we could hook into the stepper at each execution point (however fine-grained or coarse-grained it may be) and introspect the objects on the stack and heap, then that's all we need. We will just add code to dump that full trace into JSON format, which the frontend renders.
>
> Thanks again,
> Philip
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4800 bytes
Desc: not available
URL: <http://lists.racket-lang.org/dev/archive/attachments/20120827/3047540d/attachment-0001.p7s>