[racket] Is racket suitable for such a project?

From: Neil Van Dyke (neil at neilvandyke.org)
Date: Tue Feb 11 10:11:59 EST 2014

I can't answer the question in the Subject header off-the-cuff, but I 
can comment on some of the details...

For an onion router that has to handle lots of traffic at high speed, my 
first guess would be C or maybe C++ (not Go or Java), though I can 
appreciate wanting a higher-level language for this task.

I might also look at Erlang.

I would consider Python only if the project involves huge numbers of 
people running onion nodes, and you want them to be able to inspect the 
software themselves.  Python is far from perfect for this task, but it 
would be good in that large numbers of people can read it, and a better 
overall fit than JS.

If you want to do it in Racket, also take a look at Gambit-C, which is 
my Scheme-y backup if I ever need lots of threads.

If you want to get creative, you might see whether you can build this from 
domain-specific languages in Racket, but have Racket target C code.  
Like Pre-Scheme with DSLs.  (Also, going to a hardware description 
language from DSLs.)

"#lang racket" is for demos, IMHO; I *always* use "#lang racket/base" 
for any code that's not a demo.
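
As a minimal sketch of what that looks like in practice (the helper 
names `byte->hex` and `hex-dump` are made up for illustration), a 
"racket/base" module loads only the libraries it requires by name:

```racket
#lang racket/base
;; Only racket/base plus the libraries named below are loaded.
;; "#lang racket" would instead pull in the much larger racket collection.
(require racket/string)  ; for string-join

;; Hypothetical helper: one byte as two lowercase hex digits.
(define (byte->hex b)
  (define s (number->string b 16))
  (if (< b 16) (string-append "0" s) s))

;; Hypothetical helper: a byte string as space-separated hex.
(define (hex-dump bs)
  (string-join (for/list ([b (in-bytes bs)]) (byte->hex b)) " "))

(hex-dump (bytes 0 15 255)) ; => "00 0f ff"
```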

A tree-shaker like you suggest might still be appropriate even with 
"racket/base", mainly if one wants to put this on a low-space device 
like OpenWRT hardware.  (Just last night, I was trying to decide whether 
I wanted to fit the Linux kernel plus Racket and an app in the 256MB RAM 
of the Kindle 3.)

If you want to spread your Racket threads across CPUs/cores, then, yes, 
I would look at Racket places as well as homebrew worker threads.
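
A rough sketch of a small pool of places, assuming a fixed pool size of 
4 and a toy job (squaring a number); the names `start-worker` and `main` 
are illustrative only.  Places do not share ordinary data: messages sent 
over a place channel are copied, so jobs and results have to be plain 
values such as numbers, strings, or byte strings.

```racket
#lang racket/base
;; Sketch: a fixed pool of places, each squaring the numbers sent to it.
(require racket/place)
(provide main)

;; Each worker loops forever: receive a number, reply with its square.
(define (start-worker)
  (place ch
    (let loop ()
      (define n (place-channel-get ch))
      (place-channel-put ch (* n n))
      (loop))))

(define (main)
  ;; Assumed pool size of 4; tune to the number of cores.
  (define workers (for/list ([i (in-range 4)]) (start-worker)))
  ;; Hand one job to each worker, then collect the replies in order.
  (for ([w (in-list workers)] [job (in-naturals 1)])
    (place-channel-put w job))
  (define results
    (for/list ([w (in-list workers)])
      (place-channel-get w)))
  (for-each place-kill workers)
  results)
```

The places are created inside `main` rather than at module level 
because each new place instantiates the enclosing module; creating 
places at module level would make every place try to spawn more places.  
One way to run the sketch, assuming it is saved as pool.rkt, is 
`racket -tm pool.rkt`.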

The Racket VM startup time is longer than it used to be, and I no longer 
use it often as a quick command-line calculator.  (If filesystem and 
libraries aren't in Linux caches, it's almost 4 seconds before the REPL 
prompt on my workstation.)  That might not have to be a problem for an 
onion-router, however, even if you're starting up lots of processes 
(since you might be able to start worker processes before they're needed).

If you're finding Racket run time (not startup time) sometimes slower 
than Python, it's possible, but I'm wondering whether the respective 
implementations of your code have been hand-optimized.
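
To illustrate the kind of hand-optimization meant here, a toy sketch 
(not a real cipher, and not code from either message): an inner loop 
over a byte string that mutates the buffer in place, iterates with 
`in-range`, and uses fixnum-specific arithmetic from racket/fixnum.  
The name `xor-bytes!` is hypothetical, and whether any of this helps in 
a given program would need to be measured.

```racket
#lang racket/base
(require racket/fixnum)  ; fxxor, fxmodulo

;; XOR every byte of `buf` in place with a repeating `key`.
;; Assumes `buf` is a mutable byte string and `key` is non-empty.
(define (xor-bytes! buf key)
  (define klen (bytes-length key))
  ;; `in-range` directly in the `for` clause lets the loop compile
  ;; to a simple counter instead of a generic sequence traversal.
  (for ([i (in-range (bytes-length buf))])
    (bytes-set! buf i
                (fxxor (bytes-ref buf i)
                       (bytes-ref key (fxmodulo i klen)))))
  buf)

(xor-bytes! (bytes 1 2 3) (bytes 255)) ; => (bytes 254 253 252)
```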

Good luck with your project.

Neil V.

Yuhao Dong wrote at 02/10/2014 06:49 PM:
> Hi,
>
> I'm trying to decide between Racket and Go for writing my onion-routing
> system inspired by Tor. Basically, a network server, involving lots of
> long-lived connections that often pass large amounts of data. The thing
> needs to be super scalable; I often find that these servers, although
> network servers, often become CPU-bound doing encryption and
> encapsulation of protocols, so I do have experience that this is not
> "premature optimization"!
>
> Go seems to be the go-to (pun intended) for scalable network things. It
> has super cheap threads, is statically compiled (easy to deploy to lots
> of machines), and apparently uses little memory and runs fast. However,
> after writing some low-level transports in Go, I found a few things I
> gripe about but seem to be unsolvable:
>
>   - Go doesn't have exceptions. This means checking errors over and over.
>   - Go uses lightweight user threads like Racket ("goroutines"), but
> isn't fully non-blocking under the hood for I/O. Thus some I/O
> operations actually spawn a kernel thread, and then kill it when the
> operation is done. This is unacceptable for my use case.
>   - Go isn't as fast as I imagined. My program is still CPU-bound at
> around 40 MB/s, and that's just counting pushing zeroes through the
> lowest-level transport protocol.
>   - Go's garbage collector is naive stop-the-world mark-and-sweep. My
> program keeps a large amount of state, and constantly pushes data and
> accepts connections. The mark-and-sweep process can take as much as 20
> seconds on large heaps, severely impacting user experience. Since the GC
> is nonmoving, heap fragmentation is also an issue. I searched around a
> bit, and it seems that "buffer reuse" is a thing in Go. Ugh.
>
> So I decided to give Racket, namely Typed Racket, a spin. I like it a
> lot, it's my favorite language after all, and especially I like the fact
> that I can use macros to simplify verbose patterns of code. DrRacket's
> REPL is also very helpful. I also find that Racket is much more
> performant than I imagined (I used to think of it as a Python-type slow
> scripting language). However, there are a few gripes I have, but seeing
> how flexible Racket is, I think they might be solvable:
>
>   - Racket threads don't seem to be able to utilize multiple CPU cores.
> Is there an idiomatic way to, say, form a pool of places and push
> threads to them? I imagine sharing data is very messy?
>   - The Racket VM is huuuuuuge, both on disk and in memory.
> Infuriatingly, using #lang racket seems to import every single #lang
> racket library into the memory space, and when I raco dist my program,
> every single library gets packaged. I know this sounds like whining, but
> why can't some sort of analysis phase be applied where only the
> libraries actually required for the program to run are loaded?
>   - Racket doesn't seem to be able to call raw C code or machine code in
> static libraries, instead requiring the code be compiled into a library.
> Is this related to the fact that Racket is run in a VM rather than
> compiled to machine code?
>   - Some things in racket are pathologically slow. As an example, try
> implementing a cipher with loops and array indices and bytestrings. It
> will end up orders of magnitude slower than, say, C or Go or Java, or
> sometimes even Python.
>
> Are there solutions to these problems? These aren't showstoppers by any
> means, but could finally end my endless dilemma between the two langs :)
>
> Thanks!
> Yuhao Dong
>    
>    

Posted on the users mailing list.