[racket] extracting "docstrings" from documentation
I don't know that anything better is available right now, but maybe the
question should be: What should Scribble provide?
Originally, I had in mind including docstring-like information in the
cross-reference output of a Scribble document. That approach would work
badly with the current implementation of cross-reference information,
however, because the information already takes too much memory. (On a
32-bit machine, around 20MB of DrRacket's initial footprint is
cross-reference information for installed documentation, and that cost
doubles when online check syntax is enabled.) Cross-reference
information should probably live in a database instead of a serialized
hash table, so that a lookup need not load everything into memory, but
I haven't yet tried anything in that direction.
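
To make the database idea concrete, here is a minimal sketch in Python
with SQLite. Nothing like this exists in Scribble today; the table
layout, the function names, and the sample row are all hypothetical and
chosen only for illustration. The point is that a lookup by tag reads
one row from disk instead of deserializing a whole hash table.

```python
import sqlite3


def build_xref_db(entries, path=":memory:"):
    """Store (tag, doc_path, anchor, docstring) rows in SQLite.

    The schema is an assumption made for this sketch, not the format
    that Scribble's cross-reference information actually uses.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS xref ("
        " tag TEXT PRIMARY KEY,"
        " doc_path TEXT NOT NULL,"
        " anchor TEXT NOT NULL,"
        " docstring TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO xref VALUES (?, ?, ?, ?)", entries
    )
    conn.commit()
    return conn


def lookup(conn, tag):
    """Return (doc_path, anchor, docstring) for a tag, or None if absent."""
    return conn.execute(
        "SELECT doc_path, anchor, docstring FROM xref WHERE tag = ?",
        (tag,),
    ).fetchone()


# Hypothetical sample data; real tags and anchors look different.
conn = build_xref_db(
    [("example-lib:f", "example/index.html", "f", "Adds one to its argument.")]
)
print(lookup(conn, "example-lib:f"))
```

With an index on the tag, memory use stays roughly constant no matter
how much documentation is installed, at the cost of a disk read per
lookup.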
Any other ideas?
At Mon, 19 Dec 2011 14:42:07 -0500, Danny Yoo wrote:
> I'm trying to extract documentation strings for all the functions in
> racket/base. By documentation strings, I truly mean strings. Here's
> the progress I'm making on this:
>
> https://github.com/dyoo/extract-docstring
>
> It's buggy still, and I'm working out the kinks.
>
>
> The process I'm using to approach this is frankly a little insane, and
> I would rather not go to the nuthouse for this. I'm using setup/xref
> and scribble/xref to figure out the source line and anchor of a
> binding. Next, I parse the HTML, grab at the element with the given
> anchor name, and start sucking up HTML till I hit the next anchor.
>
>
> I am web-scraping, and I know I should be ashamed of myself. But I do
> not see any other mechanisms available to me at the moment. Have I
> missed something obvious?
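
For reference, the scraping step described in the quoted message (find
the element with the given anchor name, then collect content until the
next anchor) can be sketched as follows. This is not the code from the
extract-docstring repository. It is a Python illustration that assumes
anchors appear as `<a name="...">` elements, and the class and function
names are made up for the example.

```python
from html.parser import HTMLParser


class AnchorTextExtractor(HTMLParser):
    """Collect text between one named anchor and the next named anchor."""

    def __init__(self, anchor):
        super().__init__(convert_charrefs=True)
        self.anchor = anchor
        self.state = "before"  # "before" -> "inside" -> "done"
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        name = dict(attrs).get("name")
        if name is None:
            return  # an ordinary link, not a named anchor
        if self.state == "before" and name == self.anchor:
            self.state = "inside"
        elif self.state == "inside":
            self.state = "done"  # the next anchor ends the entry

    def handle_data(self, data):
        if self.state == "inside":
            self.chunks.append(data)


def extract_docstring(html, anchor):
    """Return whitespace-normalized text for the entry at `anchor`."""
    parser = AnchorTextExtractor(anchor)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


# Hypothetical page fragment, much simpler than real Scribble output.
page = (
    '<p><a name="f"></a><b>f</b> adds   one.</p>'
    '<p><a name="g"></a>g doubles.</p>'
)
print(extract_docstring(page, "f"))
```

The sketch also shows why the approach is fragile: it depends entirely
on how anchors happen to be laid out in the rendered HTML, which is a
presentation detail rather than a documented interface.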