[racket] JSON module: why symbols for object keys? lists for arrays?

From: Neil Van Dyke (neil at neilvandyke.org)
Date: Tue Apr 23 15:44:21 EDT 2013

General comments, which will be familiar to some...

If one is being very performance-conscious -- which one likely would be, 
if one is writing server request-processing code -- then how best to 
parse and represent the JSON depends on how it'll be used.

For example, for a large keyed-access collection that is accessed 
numerous times, of course one will probably want hashes, keyed on 
interned IDs.  For small keyed-access collections that are accessed a 
small number of times, an alist with string keys might well be better 
for allocation and speed.  And it can get more complicated once one 
thinks about allocations: e.g., with a large number of small 
collections kept around a long time, interned symbol keys might be 
better than strings for space or GC.  Then again, in a long-running 
program, if the keys are arbitrary 
and under untrusted control, and we're considering interning these keys, 
we might need to know whether/how our symbol table is GC'd.
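
To make the two representations concrete, here is a minimal sketch.  The 
sample JSON text and the helper name jsobject->string-alist are made up 
for illustration; only string->jsexpr comes from the "json" library.  
Note that this alist is built from an already-parsed hash, so the keys 
have already been interned by that point; actually avoiding interning 
would need a parser that produces string keys directly, which is not 
shown here.

```racket
#lang racket/base
;; Sketch only: the sample text and helper names are illustrative assumptions.
(require json)

(define sample-text "{\"id\": 42, \"tags\": [\"a\", \"b\"]}")

;; The json library represents objects as hasheq tables keyed on symbols,
;; and arrays as lists.
(define as-hash (string->jsexpr sample-text))

(define id-from-hash (hash-ref as-hash 'id))      ; lookup by interned symbol
(define tags-from-hash (hash-ref as-hash 'tags))  ; a JSON array, as a list

;; Hypothetical alternative for a small, rarely accessed object:
;; an alist with string keys, searched linearly with assoc.
(define (jsobject->string-alist obj)
  (for/list ([(k v) (in-hash obj)])
    (cons (symbol->string k) v)))

(define as-alist (jsobject->string-alist as-hash))
(define id-from-alist (cdr (assoc "id" as-alist)))
```

Which of the two is cheaper for a given workload is exactly the kind of 
thing that is better measured than guessed.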

That said, if one is simply writing an app, and unsure how much they'll 
have to scale or what the performance characteristics would be, one 
generally good strategy is to get it working with an implementation that 
seems reasonable enough, and then get empirical data of throughput under 
the desired kinds of loads.  From there, you can use tools like the 
"profile" library to focus on the hot spots.

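As a rough sketch of that measure-then-profile workflow: handle-request, 
sample-request, and run-load below are hypothetical stand-ins for real 
request-processing code, and the request count is arbitrary.  The only 
library calls assumed are time-apply from racket/base and profile-thunk 
from the "profile" library.

```racket
#lang racket/base
;; Sketch only: handle-request, sample-request, and run-load are
;; hypothetical stand-ins for an app's real request-processing code.
(require json profile)

(define sample-request "{\"user\": \"alice\", \"items\": [1, 2, 3]}")

;; Parse one request body and do a little work with the result.
(define (handle-request text)
  (define req (string->jsexpr text))
  (apply + (hash-ref req 'items)))

;; Simulate a load of n identical requests.
(define (run-load n)
  (for ([i (in-range n)])
    (handle-request sample-request)))

;; Step 1: get a rough throughput number for the whole load.
(define n 20000)
(define-values (results cpu-ms real-ms gc-ms)
  (time-apply run-load (list n)))
(printf "~a requests: ~a ms real, ~a ms cpu, ~a ms GC\n"
        n real-ms cpu-ms gc-ms)

;; Step 2: run the same load under the sampling profiler, which prints
;; a report showing where the time went.
(profile-thunk (lambda () (run-load n)))
```

With something like this in place, one can swap in a different 
representation and rerun the same load to compare numbers.
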
There's also something to be said for having all the parts be 
high-performance from the start.  That's a lot harder to do, and not 
always worthwhile.  BTW, high-performance-parts-from-the-start doesn't 
preclude an iterative optimization process for the larger system, 
but you generally start in a much better position.  (I can tell you that 
high-priced consultants, brought in to improve performance, look more 
heroic if your system is *not* high-performance-parts-from-the-start.)

Neil V.


Posted on the users mailing list.