[racket] extracting "docstrings" from documentation

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Mon Dec 19 15:26:48 EST 2011

I don't know that anything better is available right now, but maybe the
question should be: What should Scribble provide?

Originally, I had in mind including docstring-like information in the
cross-reference output of a Scribble document. That approach would work
badly with the current implementation of cross-reference information,
however, because the information already takes too much memory. (On a
32-bit machine, around 20MB of DrRacket's initial footprint is
cross-reference information for installed documentation, and that cost
doubles when online check syntax is enabled.) Probably cross-reference
information should actually be in a database, instead of a serialized
hash table, but I haven't yet tried anything in that direction.
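
To make the trade-off concrete, here is a rough sketch in Python (not
Racket, and not how Scribble stores anything today). The tag strings,
paths, table layout, and function names are all made up for illustration.
It contrasts a serialized table, which has to be loaded whole to answer a
single lookup, with a database that is queried one entry at a time.

```python
import pickle
import sqlite3

# Hypothetical cross-reference entries: tag -> (HTML path, anchor).
# These tags and paths are invented for the example.
XREFS = {
    "racket/base:map": ("reference/pairs.html", "map-anchor"),
    "racket/base:car": ("reference/pairs.html", "car-anchor"),
}

def save_as_blob(xrefs):
    """Serialized-table approach: everything goes into one blob."""
    return pickle.dumps(xrefs)

def lookup_in_blob(blob, tag):
    # The entire table is deserialized into memory to answer one query.
    return pickle.loads(blob).get(tag)

def save_as_database(conn, xrefs):
    """Database approach: one row per tag, indexed by the primary key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS xref "
        "(tag TEXT PRIMARY KEY, path TEXT, anchor TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO xref VALUES (?, ?, ?)",
        [(tag, path, anchor) for tag, (path, anchor) in xrefs.items()],
    )
    conn.commit()

def lookup_in_database(conn, tag):
    # Only the matching row is fetched; returns None for an unknown tag.
    return conn.execute(
        "SELECT path, anchor FROM xref WHERE tag = ?", (tag,)
    ).fetchone()
```

With a file-backed connection such as `sqlite3.connect("xref.db")` (a
hypothetical file name), the rows stay on disk until a lookup asks for
them, which is the property that would matter for DrRacket's footprint.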

Any other ideas?

At Mon, 19 Dec 2011 14:42:07 -0500, Danny Yoo wrote:
> I'm trying to extract documentation strings for all the functions in
> racket/base.  By documentation strings, I truly mean strings.  Here's
> the progress I'm making on this:
> 
>     https://github.com/dyoo/extract-docstring
> 
> It's buggy still, and I'm working out the kinks.
> 
> 
> The process I'm using to approach this is frankly a little insane, and
> I would rather not go to the nuthouse for this.  I'm using setup/xref
> and scribble/xref to figure out the source line and anchor of a
> binding.  Next, I parse the HTML, grab at the element with the given
> anchor name, and start sucking up HTML till I hit the next anchor.
> 
> 
> I am web-scraping, and I know I should be ashamed of myself.  But I do
> not see any other mechanisms available to me at the moment.  Have I
> missed something obvious?
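
For readers who want to see the shape of the scraping step described in
the quoted message, the following is a minimal sketch in Python. It is not
the code in the extract-docstring repository. It assumes the anchor name
has already been obtained elsewhere (in Racket, through setup/xref and
scribble/xref), that anchors appear as `<a name="...">` elements, and that
plain text is all that is wanted. The class and function names are
hypothetical.

```python
from html.parser import HTMLParser

class AnchorSectionExtractor(HTMLParser):
    """Collect text from one named anchor up to the next named anchor."""

    def __init__(self, anchor):
        super().__init__()
        self.anchor = anchor
        self.state = "before"  # "before" -> "inside" -> "done"
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("name")
        if tag != "a" or name is None:
            return  # not a named anchor, so it does not change the state
        if self.state == "before" and name == self.anchor:
            self.state = "inside"
        elif self.state == "inside":
            self.state = "done"  # the next named anchor ends the section

    def handle_data(self, data):
        if self.state == "inside":
            self.chunks.append(data)

def extract_docstring(html, anchor):
    """Return the text between `anchor` and the next named anchor."""
    parser = AnchorSectionExtractor(anchor)
    parser.feed(html)
    parser.close()
    # Collapse the whitespace left behind by the markup.
    return " ".join("".join(parser.chunks).split())
```

Calling `extract_docstring(page_html, anchor_name)` returns an empty
string when the anchor is not found. The sketch throws away all markup,
so code examples and argument tables in the documentation lose their
structure, which is one reason the approach feels fragile.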



Posted on the users mailing list.