[plt-scheme] Performance Targets for MzScheme
from Python's src/Objects/fileobject.c:
static PyObject *
file_readlines(PyFileObject *f, PyObject *args)
{
[huge C function...120 lines]
}
from Python's src/Objects/stringobject.c:
static PyObject *
string_split(PyStringObject *self, PyObject *args)
{
    [snip...8 lines]
    if (subobj == Py_None)
        return split_whitespace(s, len, maxsplit);
    [snip...52 lines]
}
and...
static PyObject *
split_whitespace(const char *s, int len, int maxsplit)
{
    int i, j;
    PyObject *str;
    PyObject *list = PyList_New(0);

    if (list == NULL)
        return NULL;

    for (i = j = 0; i < len; ) {
        while (i < len && isspace(Py_CHARMASK(s[i])))
            i++;
        j = i;
        while (i < len && !isspace(Py_CHARMASK(s[i])))
            i++;
        if (j < i) {
            if (maxsplit-- <= 0)
                break;
            SPLIT_APPEND(s, j, i);
            while (i < len && isspace(Py_CHARMASK(s[i])))
                i++;
            j = i;
        }
    }
    if (j < len) {
        SPLIT_APPEND(s, j, len);
    }
    return list;
  onError:
    Py_DECREF(list);
    return NULL;
}
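
For anyone who would rather not trace the C, here is a rough pure-Python sketch of the same scanning loop. It is only an illustration of the control flow in split_whitespace above, not what Python actually runs. The name split_whitespace_py and the "negative maxsplit means no limit" default are my own assumptions, and str.isspace() is not identical to C's isspace() for non-ASCII input.

```python
def split_whitespace_py(s, maxsplit=-1):
    """Split s on runs of whitespace, mirroring the C loop above.

    Hypothetical helper for illustration only; a negative maxsplit
    is assumed to mean "no limit".
    """
    n = len(s)
    if maxsplit < 0:
        maxsplit = n  # there can never be more than n tokens
    result = []
    i = j = 0
    while i < n:
        # Skip leading whitespace.
        while i < n and s[i].isspace():
            i += 1
        j = i
        # Scan to the end of the current token.
        while i < n and not s[i].isspace():
            i += 1
        if j < i:
            if maxsplit <= 0:
                break  # split budget used up; the rest is one piece
            maxsplit -= 1
            result.append(s[j:i])
            # Skip whitespace after the token so j marks the next start.
            while i < n and s[i].isspace():
                i += 1
            j = i
    if j < n:
        # Only reached after the break above: append the unsplit remainder.
        result.append(s[j:])
    return result
```

The point is that the whole thing is one pass over the characters with no intermediate allocation beyond the result list, which is what the C version gets for free.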
Daniel
On Wed, 12 May 2004 20:32:26 -0400, Matthias Felleisen
<matthias at ccs.neu.edu> wrote:
>
> For list-related administrative tasks:
> http://list.cs.brown.edu/mailman/listinfo/plt-scheme
>
> Could you please check whether the string tokenizer codes are written
> in Python or Perl respectively? It is my impression that Python is just
> a scripting language for C-libraries. -- Matthias
>
> On May 12, 2004, at 8:25 PM, Brent Fulgham wrote:
>
> >
> > --- Matthias Felleisen <matthias at ccs.neu.edu> wrote:
> >> [matthias-ti:~/Desktop] matthias% mzscheme -r test-gen.ss
> >> [matthias-ti:~/Desktop] matthias% time ./testfile.py
> >> 1.380u 0.090s 0:01.51 97.3% 0+0k 0+1io 0pf+0w
> >>
> > [ ... snip ...]
> >> [matthias-ti:~/Desktop] matthias% time mzscheme -r testfile.ss
> >> 1.820u 0.160s 0:02.04 97.0% 0+0k 0+0io 0pf+0w
> >>
> >> Okay, we lose by either 4.5 or .4 depending on how
> >> you count. That is
> >> slower but not an order of magnitude.
> >
> > Excellent!
> >
> > It looks as though the culprit is string-tokenize. I
> > also tried a regular-expression-based version and
> > found it to be about an order of magnitude faster.
> >
> > For comparison:
> >
> >               tokenize   regular-expr
> > Perl            2.5721         2.3654
> > Python          1.8225         1.7576
> > PLT Scheme     40.5884         3.6522
> >
> > This is with the naive do-ec implementation of the
> > loop, not processing directly to the port (so there is
> > room for improvement). The second set of numbers is
> > much better.
> >
> > -Brent