[racket] require url?

From: Greg Hendershott (greghendershott at gmail.com)
Date: Thu Apr 14 19:09:57 EDT 2011

> I suggested it a while ago, and there are still some questions that
> are difficult to answer.  One issue is caching -- you'll obviously
> need some cache, since files can be accessed many times (eg, use a
> simple macro that prints a line and run it in drracket), and the
> question is what caching policy should be used.  The main problem is
> how to differentiate the developer from the user -- in the latter case
> it's probably fine to poll the URL every so often (maybe once every
> hour, possibly with a way for the author to control the frequency),
> but the developer should be able to have a very high refresh frequency
> to allow fixing bugs and checking them.  That's probably not too
> problematic if the system is completely transparent and works exactly
> the same for files from a URL and local files -- this way the
> developer can test files directly on the FS, while users get the
> hourly (or whatever) updates.

Possibly the usual HTTP caching mechanisms would suffice, including
ETag (which on some web servers is simply an MD5 of the entity)?

If even such a light conditional GET were too much overhead for a
normal user, then maybe (require (url ...)) could take an optional
argument to do it, which I'd use as a dev.  Otherwise it would default
to HTTP expiration-time caching, where the server suggests a good time
to check back, as it probably knows best.
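
As a rough sketch of what I have in mind (assuming net/url's
get-impure-port and purify-port; where the cached ETag and body live
on disk is left out):

  #lang racket
  ;; Sketch only: conditional GET using a cached ETag, via net/url.
  (require net/url)

  ;; Returns 'not-modified when the server answers 304, otherwise the body.
  (define (fetch/conditional url-string [cached-etag #f])
    (define u (string->url url-string))
    (define extra-headers
      (if cached-etag
          (list (format "If-None-Match: ~a" cached-etag))
          '()))
    (define in (get-impure-port u extra-headers))
    (define header-text (purify-port in)) ; status line + response headers
    (begin0
      (if (regexp-match? #rx"^HTTP/[0-9.]+ 304" header-text)
          'not-modified
          (port->string in))
      (close-input-port in)))

Pulling the new ETag out of the response headers (e.g. with net/head's
extract-field) and storing it next to the cached body would be the
other half.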

On Sat, Apr 9, 2011 at 12:12 AM, Eli Barzilay <eli at barzilay.org> wrote:
> Earlier today, Noel Welsh wrote:
>> It has been talked about a lot, but no-one has implemented this
>> feature AFAIK. Should be possible, as require is extensible.
>
> I suggested it a while ago, and there are still some questions that
> are difficult to answer.  One issue is caching -- you'll obviously
> need some cache, since files can be accessed many times (eg, use a
> simple macro that prints a line and run it in drracket), and the
> question is what caching policy should be used.  The main problem is
> how to differentiate the developer from the user -- in the latter case
> it's probably fine to poll the URL every so often (maybe once every
> hour, possibly with a way for the author to control the frequency),
> but the developer should be able to have a very high refresh frequency
> to allow fixing bugs and checking them.  That's probably not too
> problematic if the system is completely transparent and works exactly
> the same for files from a URL and local files -- this way the
> developer can test files directly on the FS, while users get the
> hourly (or whatever) updates.
>
> Another minor issue is making people aware of it all: what I wanted to
> get to is a system that makes it easy to share code for running *and*
> to look at.  This means that the cache should live in some visible
> place, unlike planet that stores files in an awkward place and in a
> complex hierarchy.  Ideally, this could even be hooked to drracket so
> instead of opening a file you'd open a URL and that will show you the
> cached file, which you can read through and run (as well as running it
> from another file, of course).  But a more difficult problem here is
> what to do in case of edits -- the obvious thing to do is to "break"
> the connection to the URL, so the file will no longer get updated from
> the web.  But in this case, how should the user know about all of
> this?  I really wanted this to be a super lightweight way for
> distributing code, which prohibits additional command-line tools etc.
> Also, the user should probably be made aware of the polling behavior,
> which could imply privacy issues (I make some very useful utility, and
> now I can look in my logs and see when you're working and your IP) --
> although it's becoming more common to just do these polls anyway...
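
One lightweight possibility for the "visible place": just mirror the
URL's host and path under a directory in the user's home, so the
cached source is trivially findable and readable.  A hypothetical
sketch (the directory name is made up):

  #lang racket
  ;; Hypothetical cache layout: mirror the URL's host and path under
  ;; ~/.racket-url-cache, so the fetched source is easy to find and read.
  (require net/url)

  (define cache-root
    (build-path (find-system-path 'home-dir) ".racket-url-cache"))

  ;; "http://example.org/tools/blah.rkt"
  ;;   => ~/.racket-url-cache/example.org/tools/blah.rkt
  (define (url->cache-path url-string)
    (define u (string->url url-string))
    (apply build-path
           cache-root
           (url-host u)
           (map path/param-path (url-path u))))

DrRacket could then open exactly that file; how to surface "the
connection is broken now" after an edit is still the open question.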
>
> And probably the most difficult part is how to deal with multiple
> files: I write one file that looks like:
>
>  #lang racket
>  (require "blah.rkt")
>
> and Racket should know that "blah.rkt" should be retrieved from the
> same URL base.  I'm not sure if this is possible to do without any
> low-level support -- but I don't want to end up with yet another
> specialized set of hooks for this functionality.  (Planet ended up
> having hooks in too many places, IMO, especially the setup code has a
> lot of code for it.)  "Obviously", one way to get this is if Racket
> would treat URLs as files at the lower level, doing the fetching and
> the polling so that everything works transparently, but this is also
> an "obviously" bad idea to do that kind of work at that level (and
> that's without even trying to make up reasons, I'm sure that Matthew
> can write books on why this would be a bad idea).
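
At least the path arithmetic for that part is easy -- net/url's
combine-url/relative resolves "blah.rkt" against the base URL the
enclosing module came from.  The hard part described above (threading
that base through the module name resolver without new low-level
hooks) is assumed away in this sketch:

  #lang racket
  ;; Sketch: resolve a relative require like "blah.rkt" against the URL
  ;; the enclosing module was fetched from.
  (require net/url)

  (define (resolve-relative-module base-url-string rel-path)
    (url->string
     (combine-url/relative (string->url base-url-string) rel-path)))

  ;; (resolve-relative-module "http://example.org/code/main.rkt" "blah.rkt")
  ;;   => "http://example.org/code/blah.rkt"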
>
>
> Earlier today, Neil Van Dyke wrote:
>>
>> Could be useful.  You'd have to use caution in what URLs you
>> reference, since it's pretty much executing arbitrary code.  And,
>> unless you were always on a trusted network including trusted DNS,
>> you'd want to use HTTPS and a trusted CA list.  At that point, it
>> becomes less lightweight.
>
> Good point.  At some point I considered this a non-problem since it
> shouldn't be different than me putting out some code in any form and
> suggesting that you run it.  That's of course true for any kind of
> executable code -- but a little less problematic in this case since
> you can see the source.  However, the real problem is with malicious
> third party hacking -- I put out a useful piece of code which many
> people use, then someone hacks my web pages and replaces it with
> malicious code, and everyone gets their code updated and runs it without
> their knowledge.
>
> There is a good solution for this that I recently saw -- the way that
> chrome extensions are distributed.  The summary of what I understand
> from it is: an extension is a zip file with a prefix holding two
> things -- a public key for the extension, and a signature for the zip
> file that was done using the private key.  Clients can now use the
> public key and verify that the zip file was written by whoever holds
> the matching private key.  This doesn't help against malicious code
> that I explicitly install -- but of course there's almost nothing to
> be done against that if you want to avoid a central blessing authority
> (which google implements for chrome extensions).
>
> The nice thing is how extensions are updated: when I want to update an
> extension I wrote, I put out a file with the new contents, and I
> attach the *same* public key, and use it to create a signature for the
> new contents.  So now clients can be sure that the new file was
> created by the same author (assuming the private key was not stolen).
> So essentially that public key becomes the global identifier for the
> extension, and is actually used in extension URLs.
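
To make that flow concrete, here's a sketch of the update check.  Note
that rsa-verify is a hypothetical stub -- Racket's standard library has
hashing (file/sha1) but not public-key signature verification, so a
real version would need a crypto binding for that step:

  #lang racket
  ;; Sketch of the Chrome-extension-style trust check.  rsa-verify is a
  ;; hypothetical stub; a real implementation would call out to a crypto
  ;; library for the actual signature verification.
  (require file/sha1)

  ;; A distributed package: the author's public key, a signature over the
  ;; content made with the matching private key, and the content itself
  ;; (all byte strings here).
  (struct pkg (public-key signature content) #:transparent)

  ;; Hypothetical: does `signature` verify `content` under `public-key`?
  (define (rsa-verify public-key signature content)
    (error 'rsa-verify "stub: bind to a real crypto library here"))

  ;; An update is trusted iff it carries the *same* public key as the
  ;; installed version and its signature checks out under that key.
  (define (trusted-update? installed candidate)
    (and (equal? (pkg-public-key installed) (pkg-public-key candidate))
         (rsa-verify (pkg-public-key candidate)
                     (pkg-signature candidate)
                     (pkg-content candidate))))

  ;; The public key (or a hash of it) doubles as the package's global id.
  (define (pkg-id p)
    (sha1 (open-input-bytes (pkg-public-key p))))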
>
> AFAICT (and that's not saying much) this works well for the security
> aspect -- and for a commercial setting you can use the usual http
> security for the initial installation, or have a different way to
> install it.  The thing is that it fits planet(2) well, but it won't
> work for the lightweight feel that this (require (url ...)) is
> supposed to have, and I don't have a good idea how to do that.  (... in
> a way that doesn't require a special packaging tool etc -- just the
> minimal copying of the file(s) to the webserver's directory.)
>
> --
>          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>                    http://barzilay.org/                   Maze is Life!



Posted on the users mailing list.