[racket] reading null-terminated byte-string?

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Thu Jan 2 14:17:15 EST 2014



There is no built-in function for this, but it is easy to define one:

#lang racket 

(module+ test
  (require rackunit)
  (define Bytes (list->bytes '(102 111 111 0 98 97 114)))
  (check-equal? (with-input-from-bytes Bytes read-nt-string) #"foo")
  (check-equal? (with-input-from-bytes #"" read-nt-string) #""))

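;; -> Bytes 
; read a null-terminated string from the current input port
; (byte regexp so that . matches any 8-bit byte; non-greedy so the match stops at the first \0)
(define (read-nt-string) 
  (define next (regexp-match #rx#"(.*?)\0" (current-input-port)))
  (if (boolean? next) #"" (second next)))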
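
The question quoted below also asks about terminating on an arbitrary byte or byte string. What follows is a minimal sketch of that generalization, in the same regexp-on-port style as read-nt-string. The name read-bytes-until is hypothetical (it is not a Racket library function), and returning eof when the terminator never appears is an assumption made for this sketch.

```racket
#lang racket

;; read-bytes-until : Bytes [InputPort] -> (or/c Bytes eof-object?)
;; Hypothetical helper, not part of Racket: returns the bytes that precede
;; the first occurrence of `terminator` and consumes the terminator too.
;; Assumption: if the terminator never appears, the result is eof
;; (the failed match reads the port to end-of-file).
(define (read-bytes-until terminator [in (current-input-port)])
  ;; regexp-quote escapes regexp metacharacters in the terminator;
  ;; the non-greedy (.*?) stops at the first terminator, and a byte
  ;; regexp lets . match arbitrary 8-bit content.
  (define pattern
    (byte-regexp (bytes-append #"^(.*?)" (regexp-quote terminator))))
  (define m (regexp-match pattern in))
  (if m (second m) eof))

;; Usage: two NUL-terminated fields followed by a CRLF-terminated one.
(define in (open-input-bytes #"foo\0bar\0baz\r\n"))
(read-bytes-until #"\0" in)    ; the bytes before the first NUL
(read-bytes-until #"\0" in)    ; the bytes before the second NUL
(read-bytes-until #"\r\n" in)  ; the bytes before the CRLF
```

Because the matcher works directly on the port, successive calls pick up where the previous match ended, so a loop over read-bytes-until can walk a whole stream of terminated records.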





On Jan 2, 2014, at 1:22 PM, David Richards <contactguitarist at gmail.com> wrote:

> Hi Matthias,
> 
> Pardon my coding style:
> 
> (define Input (open-input-bytes (list->bytes '(102 111 111 0 98 97 114))))
> 
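> (define (seek-byte Byte Port)
>   ;; offset of the first Byte in Port, or of end-of-file if Byte never occurs
>   (define (_seek-byte Byte Port Pos)
>     (define Next (peek-byte Port Pos))
>     (if (or (eof-object? Next) (equal? Byte Next))
>         Pos
>         (_seek-byte Byte Port (+ 1 Pos))))
>   (_seek-byte Byte Port 0))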
> 
> (define (read-nt-string Port) ; read a null-terminated string
>   (define Length (seek-byte 0 Port))
>   (define Value (read-bytes Length Port))
>   (read-bytes 1 Port) ; consume terminator
>   Value)
>   
> (read-nt-string Input) ; => #"foo"
> 
> So, what built-in procedure is equivalent to “read-nt-string”?
> 
> “read-bytes-line” only permits (or/c 'linefeed 'return 'return-linefeed 'any 'any-one), not #"\0". It’s only useful for generic 7-bit ASCII text with standard line endings. Not useful at all for general byte streams with 8-bit content.
> 
> Why not just add the ability to terminate with an arbitrary byte, or even an arbitrary byte-string?
> 
> Admittedly a “line” typically ends with (or/c 'linefeed 'return 'return-linefeed 'any 'any-one), so is there another library procedure that addresses this basic operation?
> 
> Obviously I can solve the problem as above with some ‘cobble code’, but there’s no way I’m going to address buffering, efficiently-sized block reads, vector scans, and all the other ‘inside stuff’ that is likely being done by the library procedures to optimize IO speed. Luckily I didn’t have a large data-set to process. Only about 500 MB, with no real-time demands. Otherwise all those calls to peek-byte would surely have killed me. I’d strongly prefer to use a library procedure, if it exists. And I’d love to know why it doesn’t exist, if it doesn’t exist.
> 
> Thanks.
> 
> dr
> 
> 
> 
> On Jan 2, 2014, at 9:42 AM, Matthias Felleisen <matthias at ccs.neu.edu> wrote:
> 
>> 
>> There are many ways to read bytes (and I assume you mean 'byte' not 'char' or 'string'). Here is how to read a complete line: 
>> 
>> Welcome to Racket v6.0.0.1.
>> > (read-bytes-line)
>> #""
>> > (read-bytes-line)"hello world, how is david"
>> #"\"hello world, how is david\""
>> 
>> 
>> The definitive reference is at http://docs.racket-lang.org/reference/Byte_and_String_Input.html
>> 
>> If this is not helpful, try to ask the question again. -- Matthias
>> 
>> 
>> 
>> 
>> 
>> On Jan 1, 2014, at 12:31 PM, David Richards <contactguitarist at gmail.com> wrote:
>> 
>>> 
>>> How do I read a value-terminated byte-string from an input port (i.e. a null-terminated string)?
>>> 
>>> dr
>>> ____________________
>>> Racket Users list:
>>> http://lists.racket-lang.org/users
>> 
> 



Posted on the users mailing list.