[racket] JSON module: why symbols for object keys? lists for arrays?

From: Erik Pearson (erik at adaptations.com)
Date: Thu Jun 6 13:58:56 EDT 2013

Hi Eli,

The short answer is that I now love using symbols for keys!

In the back of my mind are still some reservations, but in day to day
coding, it is very pleasurable. E.g. since I often need to extract a
value from within a json or other dictionary-like structure (e.g. http
header, or sexpr config files), I can just use dict-ref or my little
dict-path (where the key is a list of symbols), using a symbol as the
key, and it just works. I've also found the simplicity of the text
representation of symbols in code is "relaxing" (to borrow a couchdb
metaphor...)
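
A minimal sketch of that style of access, assuming the standard `json`
and `racket/dict` libraries. The `dict-path` here is a hypothetical
stand-in written for illustration, not the actual helper mentioned
above, and the sample document is made up:

```racket
#lang racket/base
(require json racket/dict)

;; Parse a small, made-up JSON document. With the standard json
;; library, object keys are read as symbols and arrays as lists.
(define doc
  (string->jsexpr
   "{\"user\": {\"name\": \"Erik\", \"langs\": [\"CL\", \"Erlang\", \"Racket\"]}}"))

;; Unique marker used to detect a missing key.
(define missing (gensym 'missing))

;; Hypothetical helper: follow a list of symbol keys through nested
;; dictionaries, returning `default` if any step is absent.
(define (dict-path d path [default #f])
  (cond
    [(null? path) d]
    [(dict? d)
     (define sub (dict-ref d (car path) missing))
     (if (eq? sub missing)
         default
         (dict-path sub (cdr path) default))]
    [else default]))

(define name  (dict-ref (dict-ref doc 'user) 'name))
(define langs (dict-path doc '(user langs)))
(define email (dict-path doc '(user email) 'none))
```

Here `name` is a string, `langs` is an ordinary list, and `email` falls
back to the supplied default because the key is absent.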

The problems I had anticipated with jsexprs just have not materialized
or are easy to work around simply by coding convention.

On the templating issue -- I'd love to try out Scribble at some point.
I can see the advantage especially when doing more interesting things
server-side. I used to program like this years ago with a specialized
tcl interpreter, although not nearly as nice and proper as Scribble. I
could get a lot done rather quickly without disturbing the server
stability (since all code is restricted to that page) -- but I did
find that no-one else would touch those pages (or if they did, I'd
often have to come to the rescue.)

Simple programming-free templating like Mustache has disadvantages for
sure, one of which is that the operations on data are very limited,
and require the addition of new operators written in the host language
(not mustache, of course). That is one reason it is suitable, I feel,
for browser client side (although it was invented for server side) --
it is relatively easy to process the data before the template sees it
as well or add new operators to mustache (in the form of data
transformers or "modifiers" that solely live in the scope of that
specific template object). It is also easy for web designers to just
copy/paste capability into a page without having to really learn
Javascript.

Anyway, I'd like to add a final note that the programmer in me has
been just tickled with Racket (after I switched to a faster machine to
tame DrRacket...) ... coming from CL, recently being immersed in
Erlang, it is just wonderful.

Erik.

On Thu, Jun 6, 2013 at 7:55 AM, Eli Barzilay <eli at barzilay.org> wrote:
> [Late reply, since the other thread reminded me of this.  Might be
> irrelevant for your actual decisions by now...]
>
>
> On April 22nd, Erik Pearson wrote:
>> Hi Eli,
>>
>> Wow, thanks for the great feedback.
>>
>> On Mon, Apr 22, 2013 at 3:21 PM, Eli Barzilay <eli at barzilay.org> wrote:
>> > General comment: unless you have an explicit goal of supporting
>> > Mustache templates as-is, you should consider doing things in
>> > plain Racket.  IMO it works far better than the pile of half-baked
>> > hacks that is Mustache (at least AFAICS).  (I can go on for pages
>> > on this point, so I'll avoid it unless anyone's interested...)
>>
>> I have some extensive web sites built on Mustache-style
>> templates. I say style, because it is more based on the original
>> ctemplate syntax from Google, and avoids some of the Mustache
>> limitations or Handlebars extensions. It is very practical, in my
>> experience, and easy enough to implement (since it sticks with the
>> very simple syntax.)
>
> There are two different possibilities here: (a) you have some existing
> mustache material that needs to be supported as-is.  If this is the
> case, then there is definitely a good point in implementing it.  But
> then there's (b) you have some need for some "simple templating" and
> mustache must be it if so many people use it -- and if that's the case
> then I really don't buy it.
>
> The reason for that should be obvious in the Racket world, where we
> try to do as much work as possible in the form of a proper language
> instead of some semi-DSL-hack like regexp-replacing "{{\\w*}}" with
> strings.  This is something that I tell students over and over again,
> even though most of them don't remember it (and I tell them that most
> of them won't) some do and I've even had a few come back to me
> sometime later and tell me horror stories of creating bad DSLs and the
> amount of generated grief...  The thing is that it starts simple as
> that regexp-replacing, but soon enough you want to add more
> functionality, so you do so, bits by bits, and you end up with a
> language.  Usually a very bad one.  In contrast, if you *start* with
> an existing language -- and the choice doesn't matter here, JS would
> do just as well -- then all you need is just use the language.  For
> example, it took me a few seconds to find this gem in the handlebars
> page:
>
>     {{#list people}}{{firstName}} {{lastName}}{{/list}}
>
> with some code that implements that "list helper" -- compare that with
> the scribble/text way (which is the same thing that gets used in the
> server) of just using plain racket:
>
>     ...text...
>     @(for/list ([x some-list])
>         @list{@dict-ref[x 'first] @dict-ref[x 'last]})
>     ...text...
>
> or scribble/html which adds html-tag functions:
>
>     ...text...
>     @ul{@(for/list ([x some-list])
>            @li{@dict-ref[x 'first] @dict-ref[x 'last]})}
>     ...text...
>
> On a shallow look, I painfully realize that many mustache users will
> cry about how much more complicated this is -- conveniently forgetting
> the actual implementation of that helper, but also missing the fact
> that because it's a generic language, there's nothing that prevents me
> from making this into a helper --
>
>     (define (dict-list->ul l)
>       @ul{@(for/list ([x l])
>             @li{@dict-ref[x 'first] @dict-ref[x 'last]})})
>
> and the template becomes
>
>     ...text...
>     @dict-list->ul[some-list]
>     ...text...
>
> Even though the difference seems small, it's really conceptually huge:
> you always have a general language, so you can immediately do whatever
> it is that you can do with Racket itself.  No reason to resort to
> additional regexp hacks like {{{..}}} vs {{..}}, or {{foo.bar.baz}}
> which is re-implementing JS-like syntax instead of treating it as JS,
> eg, the additional hacks of {{../foo.bar}}, {{./foo.bar}},
> {{this/foo.bar}}, and {{this.foo.bar}}.  (When you get to use these it
> should be painfully clear that you're using a new language, only one
> that is probably going to be different enough from JS that a
> near-future bite seems inevitable.)
>
> Here's another example, continuing down that page:
>
>   * You can register a helper with the Handlebars.registerHelper
>     method
>
> Q: what happens when two bits of code register a helper by the same
> name?  That's a rhetorical question -- I'm sure that the answer is
> pretty obvious.  But here's the thing: the same question looks very
> different in the Racket context -- while it's possible to do such a
> similar registration thing (ie, mutate a dictionary), most sane code
> won't do that, and this is a result of a design that was literally the
> subject of a few phd works -- which means that instead of "Idono, if
> it works for me then who cares?" you get something that people spent a
> ton of time designing.
>
>
>
>> >> For me there is also the increase in complexity when translating
>> >> from JSON to jsexpr -- when components of JSON are translated
>> >> into different objects in the host language it is yet another
>> >> thing to remember, multiple forms of symbols, another layer of
>> >> coding.  [...]
>> >
>> > I'm not following what it is that is more complicated here.  The
>> > fact that there are different concrete syntaxes for the same
>> > symbols is not different from strings which have the same issue.
>> > But either way, this shouldn't be an issue since you shouldn't
>> > care about the actual JSON source and just use the values that you
>> > read it as.  (So I'm guessing that I'm missing something here.)
>>
>> On the face of it, json defines strings, so does Racket, strings are
>> used as values, strings are used as keys, why mess that up? Using
>> symbols to me makes it more complicated in a couple of ways.
>
> It would be helpful if you can show some examples where it's making
> things more difficult.  (That's a real question.  For example, you
> might run into some need to use `symbol->string' because some JSON
> source gives you names that actually encode some substructure in
> them.  But I imagine that these would be extremely rare cases.)
>
>
>> What is the impact of putting arbitrary user data into the symbol
>> space?  Performance? Symbol table exhaustion? Some other
>> interference with program logic due to symbol corruption? I don't
>> know.
>
> The choice of symbols is the same as its use for identifier names in
> source code.  It's just a string-like type that is more convenient
> when it's used to identify things -- and this seems to be the
> intention in JS dictionaries.  It makes sense to use strings in the
> JSON *representation*, since there it's a simplification, but it
> doesn't make sense in its use.  It's similar to how you can refer to
> these things in JS as x.y instead of being forced to use x["y"].
>
> (Note related to the above parenthetical comment: I imagine those cases
> to be rare in exactly the same way that having to use x["y"] is rare.)
>
>
>> Unless there is a good reason, why bother with introducing these
>> unknowns? Maybe this is a CL bias on my part?
>
> Not at all -- this design decision should apply to all Lisps --
> probably even a bit *more* for ones that are not in the Scheme
> subspace...
>
>
>> For another, the representation of symbols in the reader can be
>> either single quote for simple symbols, or bar-delimited for more
>> complicated ones. This makes creation of json literals in racket a
>> bit of a pain. In general, I don't see a good reason to conflate
>> strings with symbols in this case. (I do recognise the familiarity
>> of symbols for hash keys in Scheme, tho.)
>
> That's something that I didn't understand back then too.  You get the
> same thing with strings:
>
>     -> "foo bar"
>     "foo bar"
>     -> "foo\40bar"
>     "foo bar"
>     -> "foo\x20bar"
>     "foo bar"
>     -> "foo\u0020bar"
>     "foo bar"
>     -> "foo\U00000020bar"
>     "foo bar"
>     -> "\146\157\x6f\u0020\
>     \u62\x61\162"
>     "foo bar"
>     -> #<<MEH
>     foo bar
>     MEH
>     "foo bar"
>
> but that shouldn't be a problem for any code, I think.
>
>
>> >> There is a similar issue with lists being used to represent JSON
>> >> arrays, over the more obvious choice of Racket vector. Maybe this is
>> >> because there are more core functions for dealing with lists
>> >> compared to the limited number for vectors (for data processing type
>> >> things like foldl). I suppose it is YMMV depending on how you use
>> >> the data.  Random element access, simple iteration, or more complex
>> >> folding, etc.
>> >
>> > Here too, I think that the vague intention is "some ordered list of
>> > values", so it makes sense to use the datatype that is most common in
>> > your language.  In JS this is obviously arrays, and in all Lisps I
>> > think that lists make more sense.  For most cases I think that the
>> > performance consideration is irrelevant anyway, since the lists would
>> > be very short, and since you usually view them as a list rather than as a
>> > random-access ordered mapping.  If you get to a point where the cost
>> > of lists makes a difference, then my guess is that using JSON can
>> > become questionable -- and in rare cases that it does make sense
>> > (perhaps because of some upstream you don't control), some streaming API
>> > as Neil mentioned can make more sense.
>>
>> Yeah, I can see that, and it doesn't really make much of a difference.
>> In the CL implementation, I sometimes pine for an array to be a list.
>> But it works through an api, mostly, so that detail is not normally
>> important. To implement bidirectional translation between native types
>> to and json types, though, it is something to consider. For instance,
>> if you use alists for objects, then it is sensible to pick vectors for
>> arrays, so that you can do simple type matching. With objects
>> represented as hash tables, list is available. Of course, types
>> specifically designed for json make this moot. (But a native
>> representation is always useful to have.)
>
> I'm not following the "native" point here...
>
>
>> > The reason for the parameterized null value was that the original
>> > code used the #\null character to represent nulls, which is
>> > something that I viewed as a bad type pun...  So I left in a
>> > parameter and an argument to make it easy to use it for easy
>> > porting if needed.
>>
>> In my CL implementation, I opted for representing null, true, and
>> false as :NULL, :TRUE, and :FALSE, to avoid any conflation between
>> json and lisp. It is annoyingly easy to introduce nil (in CL) or null,
>> #f, #t from functions which naturally return these values, most often
>> from functions which return null or #f upon failure. Using explicit
>> symbols is rarely annoying, and never confusing.
>
> Yeah, it's a common tradeoff -- using Racket types makes it easier to
> write code, but risks getting bugs when you get such a value by
> mistake.  At the other extreme, define some `js' constructor, and now
> you deal with (js foo) throughout the whole tree of values (wrapping
> booleans, lists, strings, and dicts) -- but you're also practically
> eliminating all chances of such pun-related-errors.
>
> --
>           ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>                     http://barzilay.org/                   Maze is Life!



-- 
Erik Pearson
Adaptations
;; web form and function

Posted on the users mailing list.