shibboleth-dev - Solution to UTF-8 problem
- From: Scott Cantor
- Subject: Solution to UTF-8 problem
- Date: Thu, 08 May 2003 02:35:31 -0400
- Importance: Normal
- Organization: The Ohio State University
After some experimenting, I confirmed that the current code base has a major
systemic bug: it exchanges data in XML as UTF-8, but then transcodes
everything into the "local code page", typically an ASCII derivative, when it
passes the data up to Apache for comparisons and export.
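To make the failure concrete, here is a minimal sketch in Python (not the actual C++ code path). The helper name `to_local_code_page` is hypothetical, and the use of "?" as a substitution character is an assumption; a real transcoder may substitute differently or fail outright. It shows how two distinct UTF-8 values can become indistinguishable once forced into an ASCII code page, which is what breaks comparisons.

```python
def to_local_code_page(utf8_bytes: bytes, code_page: str = "ascii") -> bytes:
    """Mimic the lossy step: decode UTF-8, then re-encode in a narrower code page.

    Hypothetical helper for illustration only; characters the code page
    cannot represent are replaced with "?".
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode(code_page, errors="replace")


# Two different attribute values as they would arrive in the XML (UTF-8).
first = "Müller".encode("utf-8")
second = "Möller".encode("utf-8")

print(first == second)  # False: distinct on the wire
print(to_local_code_page(first))   # b'M?ller'
print(to_local_code_page(second))  # b'M?ller'
print(to_local_code_page(first) == to_local_code_page(second))  # True: now equal
```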
There's no perfect solution to this, since most C code pages don't work well
with Unicode (too many untranslatable characters), and Apache of course only
knows about C strings. The best compromise is probably to standardize more or
less on UTF-8 and simply carry it inside C strings, since ASCII strings are
byte-for-byte identical in UTF-8, and null is null in both. I don't have a
robust implementation of this yet, but a prototype mod_shibrm is working that
transcodes attributes into UTF-8, and I've verified end to end that an
attribute with an accented character displays correctly from Perl once I tell
the browser it's a UTF-8 document.
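The two properties relied on above can be checked with a short sketch. It uses Python's ctypes to stand in for a NUL-terminated C buffer; the helper name `as_c_string` and the sample values are assumptions for illustration, not anything from mod_shibrm. UTF-16 is included only as a contrast, to show an encoding that does not survive a C string.

```python
import ctypes


def as_c_string(value: bytes) -> bytes:
    """Round-trip bytes through a NUL-terminated C buffer (hypothetical helper)."""
    buf = ctypes.create_string_buffer(value)  # copies the bytes, appends a NUL
    return buf.value  # reads back only up to the first NUL byte


ascii_name = "Cantor"
accented = "Zürich"

# Pure ASCII text is byte-for-byte identical when encoded as UTF-8.
print(ascii_name.encode("utf-8") == ascii_name.encode("ascii"))  # True

# UTF-8 never emits a zero byte for a non-NUL character, so the C string
# terminator stays unambiguous and the value survives intact.
utf8 = accented.encode("utf-8")
print(0 in utf8)  # False
print(as_c_string(utf8) == utf8)  # True

# Contrast: UTF-16 embeds zero bytes, so a C string cuts it short.
utf16 = accented.encode("utf-16-le")
print(as_c_string(utf16))  # b'Z'
```

Note that byte-oriented functions still count bytes rather than characters: the UTF-8 form of the accented value is one byte longer than its character count.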
I'm inclined to expose the character encoding as a configurable system
setting, in theory permitting a Japanese user to select a Kanji-capable
encoding perhaps; if Xerces supports the encoding, it might work, sort of.

This may delay us a tiny bit, since the transcoding interface is pretty
primitive and I have to be careful with it, but since the Swiss need this, I
guess we owe it to them to try.
-- Scott