[plt-scheme] Why is the JIT Written in C?

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Tue Dec 1 13:25:52 EST 2009

At Tue, 1 Dec 2009 10:11:04 -0500, "Will M. Farr" wrote:
> I have a quick (mostly academic) question for the PLT maintainers: why is the 
> JIT written in C?

It's just how we've gotten from point A to point B. Adding a JIT that
needed itself in order to run fast wouldn't have worked for us in
practice, because it took (and is still taking) a while for the JIT to
generate code that runs as fast as a JIT needs to run. That is, the
overhead of a slow JIT would have prevented us from using the JIT to
make incremental progress.

There's very little of our C code that really should be in C, though;
your point is well taken. Unfortunately, it's unlikely that the JIT
will migrate out of C soon, because we're currently more interested in
other things, such as rebuilding the GUI (in Scheme instead of C++!).

> Another reason for asking is that I occasionally think about implementing a 
> (numeric-mode ...) macro that compiles simple numeric loops, vector/f64vector 
> accesses, and floating/integer arithmetic directly with boxing only occurring 
> on entry/exit (kind of like Chicken scheme's crunch, or a MCL/OpenMCL 
> double-float arithmetic package whose name I have forgotten, or OCaml's local 
> unboxing strategy, etc...).  I don't think there's any hope of me managing to 
> implement this as long as the JIT is in its current state.

Yes, that's a problem. 

For what it's worth, we are working on that particular facet of the
JIT. The latest version provides `unsafe-fl+', etc. as well as
`unsafe-f64vector-ref' and `unsafe-f64vector-set!'. The JIT can skip
boxing for compositions of those operations.
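
As a small illustration of such a composition, here is a minimal
sketch. It assumes a MzScheme version that provides
`scheme/unsafe/ops'; the name `dot2' is made up for this example. The
two intermediate products are the kind of values that can stay unboxed,
with only the final sum boxed when it is returned:

```scheme
#lang scheme/base
(require scheme/foreign
         scheme/unsafe/ops)

;; Hypothetical helper: two-element dot product of f64vectors a and b.
;; The unsafe operations skip type and bounds checks, so the caller
;; must guarantee that both arguments are f64vectors of length >= 2.
(define (dot2 a b)
  (unsafe-fl+ (unsafe-fl* (unsafe-f64vector-ref a 0)
                          (unsafe-f64vector-ref b 0))
              (unsafe-fl* (unsafe-f64vector-ref a 1)
                          (unsafe-f64vector-ref b 1))))

(dot2 (f64vector 1.0 2.0) (f64vector 3.0 4.0)) ; 1*3 + 2*4 = 11.0
```

Whether a given composition is actually compiled without boxing
depends on the JIT version, so treat this as a pattern to try rather
than a guarantee.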

For example, the convolution code below runs within a factor of 4 of
optimized C (compared to a factor of 15 for Scheme code using safe
operations). In the JIT-generated code, the floating-point numbers are
not boxed in the path from the input arrays to the output array.

There's still plenty of room for improvement in that factor of 4. About
half of the difference is the representation of f64vectors (a pointer
to a pointer to the data), and half of it the quality of the machine
code generated by the JIT.

Having the JIT in a form where you could help improve it would be
ideal. Meanwhile, are there more things that a MzScheme JIT export
could add to move toward things like `numeric-mode'?

;; ----------------------------------------

#lang scheme/base
(require scheme/foreign
         scheme/unsafe/ops)

;; Short aliases for the operations used in the loop below
(define vr unsafe-f64vector-ref)
(define vs! unsafe-f64vector-set!)
(define vl f64vector-length)
(define fl+ unsafe-fl+)
(define fl* unsafe-fl*)
(define fx* unsafe-fx*)
(define fx+ unsafe-fx+)
(define fx- unsafe-fx-)
(define fx<= unsafe-fx<=)
(define fx= unsafe-fx=)

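;; Parts missing from the archived listing are filled in by assumption:
;; the -1 in the result length, the klen offset in k, the index i in
;; the vs! call, and the final `result'.
(define (convolution1 signal kernel)
  (let ([result (make-f64vector (+ (+ (vl signal)
                                      (vl kernel))
                                   -1))] ; generic arithmetic, outside the loops
        [klen (vl kernel)]
        [slen-1 (- (vl signal) 1)]) ; ditto
    (for ([i (in-range 0 (vl result))])
      (vs! result i 0.0)
      (let loop ([j 0])
        (unless (fx= j klen)
          (let ([k (fx- (fx+ (fx+ i j) 1)
                        klen)])
            ;; accumulate only where k is a valid index into signal
            (when (and (fx<= 0 k) (fx<= k slen-1))
              (vs! result
                   i
                   (fl+ (vr result i)
                        (fl* (vr kernel j)
                             (vr signal k))))))
          (loop (fx+ j 1)))))
    result))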

(time
 (let ()
   (define signal (make-f64vector 100000 3.))
   (define kernel (make-f64vector 100 4.))
   (void (convolution1 signal kernel))))
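
For comparison, a version using only safe operations might look like
the sketch below. This is not the code that was actually timed for the
factor-of-15 figure; the name `convolution1/safe' is made up, and the
index formula for `k' is an assumption about what the loop computes.
It uses generic arithmetic and checked f64vector accesses throughout:

```scheme
#lang scheme/base
(require scheme/foreign)

;; Hypothetical safe counterpart: generic +, -, * and checked
;; f64vector-ref / f64vector-set!, so flonums may be boxed at each step.
(define (convolution1/safe signal kernel)
  (let* ([slen (f64vector-length signal)]
         [klen (f64vector-length kernel)]
         [result (make-f64vector (+ slen klen -1))])
    (for ([i (in-range 0 (f64vector-length result))])
      (f64vector-set! result i 0.0)
      (for ([j (in-range 0 klen)])
        (let ([k (- (+ i j 1) klen)]) ; assumed index formula
          ;; accumulate only where k is a valid index into signal
          (when (and (<= 0 k) (< k slen))
            (f64vector-set! result i
                            (+ (f64vector-ref result i)
                               (* (f64vector-ref kernel j)
                                  (f64vector-ref signal k))))))))
    result))

(time
 (void (convolution1/safe (make-f64vector 100000 3.)
                          (make-f64vector 100 4.))))
```

Running both variants on the same inputs and comparing the `time'
output gives a rough sense of how much the unsafe operations help on a
particular build.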

Posted on the users mailing list.