shibboleth-dev - Re: Solution to UTF-8 problem
- From: Derek Atkins <>
- To: Scott Cantor <>
- Subject: Re: Solution to UTF-8 problem
- Date: 09 May 2003 10:53:45 -0400
I was afraid this was part of the issue.. :(
-derek
Scott Cantor <> writes:
> After some experimenting, I confirmed that the current code base has a
> major systemic bug in it, in that it exchanges data in XML as UTF-8 but
> then transcodes everything into the "local code page", typically an
> ASCII derivative, when it passes the data up to Apache for comparisons
> and export.
>
> There's not a perfect solution to this really, since most C code pages
> don't really work with Unicode very well (too many untranslatable
> characters), and Apache of course only knows about C strings.
>
> The best compromise is probably to standardize more or less on UTF-8
> and simply disguise that inside C strings, since ASCII strings are
> directly equatable with it, and null is null in both. I don't have a
> robust implementation of this yet, but a prototype mod_shibrm is
> working that transcodes attributes into UTF-8, and I've verified an
> end-to-end attribute with an accented character is displaying correctly
> from Perl once I tell the browser it's a UTF-8 document.
>
> I'm inclined to expose the character encoding to use as a system
> setting that can be configured, in theory permitting a Japanese user to
> select Kanji perhaps, and if Xerces supports the encoding it might
> work, sort of.
>
> This may delay us a tiny bit, since the transcoding interface is pretty
> primitive and I have to be careful with it, but since the Swiss need
> this, I guess we owe it to them to try.
>
> -- Scott
>
>
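For illustration only, here is a minimal C sketch of the two behaviours
discussed in the quoted message: UTF-8 carried inside an ordinary
NUL-terminated C string, and the loss that occurs when the same data is
forced into an ASCII-only code page. This is not the Shibboleth or
mod_shibrm code; the function names (is_ascii, utf8_codepoints,
utf8_to_ascii_lossy) and the sample string are hypothetical, and the
sketch assumes well-formed UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if every byte of the NUL-terminated string is 7-bit ASCII.
 * Such a string is byte-for-byte identical in ASCII and in UTF-8. */
int is_ascii(const char *s) {
    assert(s != NULL);
    for (const unsigned char *p = (const unsigned char *)s; *p; ++p) {
        if (*p > 0x7F) return 0;
    }
    return 1;
}

/* Counts code points in a UTF-8 C string by skipping continuation
 * bytes (bit pattern 10xxxxxx). Assumes the input is well-formed. */
size_t utf8_codepoints(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p; ++p) {
        if ((*p & 0xC0) != 0x80) ++n;
    }
    return n;
}

/* Hypothetical stand-in for "transcode to an ASCII code page": each
 * non-ASCII code point is replaced by '?'. Writes a NUL-terminated
 * result into out (at most out_size bytes) and returns the number of
 * code points that could not be represented. */
size_t utf8_to_ascii_lossy(const char *in, char *out, size_t out_size) {
    assert(in != NULL && out != NULL && out_size > 0);
    size_t replaced = 0, o = 0;
    for (const unsigned char *p = (const unsigned char *)in;
         *p && o + 1 < out_size; ++p) {
        if (*p < 0x80) {
            out[o++] = (char)*p;           /* ASCII byte: copied as-is */
        } else if ((*p & 0xC0) != 0x80) {  /* lead byte of a sequence */
            out[o++] = '?';
            ++replaced;
        }
        /* continuation bytes are dropped */
    }
    out[o] = '\0';
    return replaced;
}
```

With the sample string "Z\xC3\xBCrich" (the UTF-8 bytes for "Zürich"),
strlen() reports 7 bytes while utf8_codepoints() reports 6 characters,
and the string passes through strlen/strcpy/strcmp intact because no
UTF-8 sequence for a non-NUL character contains a zero byte. Running it
through utf8_to_ascii_lossy() yields "Z?rich", which is the kind of
damage the local-code-page transcoding causes.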
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
PGP key available
- Solution to UTF-8 problem, Scott Cantor, 05/08/2003
- Re: Solution to UTF-8 problem, Derek Atkins, 05/09/2003