[plt-scheme] reencode-input-port and get-bindings/post

From: Neil Van Dyke (neil at neilvandyke.org)
Date: Wed Dec 24 12:15:17 EST 2008

Any ideas on this confusing behavior with "reencode-input-port" and 
"get-bindings/post", before everyone heads off to Christmas?

For this exercise, I want to make PLT 4.1.2's "get-bindings/post" 
interpret HTTP "POST" data in "iso-8859-1" instead of "utf-8".

  (let ((port (reencode-input-port (current-input-port) ; in
                                   "iso-8859-1"         ; encoding
                                   #f                   ; error-bytes
                                   #f                   ; close?
                                   "http-post-reencode" ; name
                                   #f                   ; convert-newlines?
                                   )))                  ; enc-error omitted
    (dynamic-wind
      (lambda () #f)
      (lambda ()
        ;; TODO: !!! This re-encoding works outside of CGI, but when inside,
        ;; "get-bindings/post" seems to see the original bytes...
        (parameterize ((current-input-port port))
          (get-bindings/post)))
      (lambda ()
        (close-input-port port))))

This code works correctly when run on Debian GNU/Linux *outside* of 
Apache CGI.  When the same code is dropped into Apache CGI, however, it 
generates an error as if the re-encoding weren't happening:

bytes->string/utf-8: string is not a well-formed UTF-8 encoding: #"\251 
[unknown source]: (call-with-continuation-prompt (lambda () 
(with-continuation-mark break-enabled-key bpz (with-continuation-mark 
....))) handler-prompt-key (lambda (thunk) (thunk)))
 === context ===
/usr/local/plt-4.1.2/lib/plt/collects/net/uri-codec-unit.ss:168:0: decode
/usr/local/plt-4.1.2/lib/plt/collects/net/cgi-unit.ss:116:0: read-name+value
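
For reference, the first byte shown in the error, \251 (octal, 0xA9), is the 
copyright sign in ISO-8859-1 but is not well-formed UTF-8 on its own.  The 
following is a minimal sketch, not the CGI setup itself: it assumes 
"scheme/base" and "scheme/port", uses an in-memory byte port as a stand-in 
for the request body, and the names "raw", "as-latin-1", "as-utf-8", and 
"first-char-reencoded" are made up for illustration.

```scheme
#lang scheme/base
;; Sketch only: an in-memory port stands in for the CGI request body.
(require scheme/port) ; provides reencode-input-port

;; One byte, octal 251 (0xA9): the copyright sign in ISO-8859-1,
;; but not a well-formed UTF-8 sequence by itself.
(define raw (bytes #o251))

;; Decoding the raw byte directly, two ways.
(define as-latin-1 (bytes->string/latin-1 raw))
(define as-utf-8
  (with-handlers ((exn:fail:contract? (lambda (e) 'not-utf-8)))
    (bytes->string/utf-8 raw)))

;; Reading the same byte through a re-encoding port, which converts
;; ISO-8859-1 input to UTF-8 before the reader sees it.
(define (first-char-reencoded bs)
  (let ((in (reencode-input-port (open-input-bytes bs) "iso-8859-1")))
    (read-char in)))
```

If the re-encoding port were the one being read, the decoder would be handed 
the two-byte UTF-8 form of the character rather than the lone \251 byte, so 
the error above suggests that the raw bytes are reaching "decode" some other 
way.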

I suspect I'm making a silly mistake or that there is some 
locale-related difference between the two situations.  I cannot 
reproduce the error outside of CGI by adjusting the "LANG" environment 


Posted on the users mailing list.