shibboleth-dev - Solution to UTF-8 problem

Subject: Shibboleth Developers

  • From: Scott Cantor
  • Subject: Solution to UTF-8 problem
  • Date: Thu, 08 May 2003 02:35:31 -0400
  • Importance: Normal
  • Organization: The Ohio State University

After some experimenting, I confirmed that the current code base has a major
systemic bug: it exchanges data in XML as UTF-8, but then transcodes
everything into the "local code page", typically an ASCII derivative, when it
passes the data up to Apache for comparisons and export.

There isn't really a perfect solution to this, since most C code pages don't
work very well with Unicode (too many untranslatable characters), and Apache
of course only knows about C strings.
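
To illustrate why transcoding into a local code page is a problem for
comparisons, here is a minimal sketch in Python. It is not the Shibboleth
code; the attribute values and the helper name to_code_page are made up for
illustration, and Python's "replace" error handling only approximates what a
C transcoder does with untranslatable characters.

```python
def to_code_page(text, code_page):
    """Transcode text into a code page, substituting '?' for every
    character the code page cannot represent (hypothetical helper)."""
    return text.encode(code_page, errors="replace")


# Hypothetical attribute value as it would arrive in a UTF-8 XML message.
wire_bytes = "Zürich".encode("utf-8")
value = wire_bytes.decode("utf-8")

# An ASCII code page cannot represent the accented character at all.
print(to_code_page(value, "ascii"))       # b'Z?rich'

# Two different values collapse to the same bytes, so a comparison
# made after transcoding can no longer tell them apart.
print(to_code_page("Zürich", "ascii") == to_code_page("Zärich", "ascii"))

# Latin-1 copes with this value, but not with text outside Western Europe.
print(to_code_page(value, "latin-1"))     # b'Z\xfcrich'
print(to_code_page("Ζυρίχη", "latin-1"))  # b'??????'
```

The point of the sketch is only that the loss happens silently: the bytes
handed to the server still look like an ordinary C string.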

The best compromise is probably to standardize more or less on UTF-8 and
simply disguise that inside C strings, since ASCII strings are already valid
UTF-8 and the null terminator is the same in both. I don't have a robust
implementation of this yet, but a prototype mod_shibrm is working that
transcodes attributes into UTF-8, and I've verified end to end that an
attribute with an accented character displays correctly from Perl once I tell
the browser it's a UTF-8 document.
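
The properties that make UTF-8 safe to carry inside C strings can be checked
with a short Python sketch. The helpers as_c_string and c_strlen are
hypothetical stand-ins for what C code would see; they are not part of
mod_shibrm or Apache.

```python
def as_c_string(text):
    """Encode text as UTF-8 and append the NUL terminator of a C string."""
    return text.encode("utf-8") + b"\x00"


def c_strlen(buf):
    """Mimic C strlen(): count the bytes before the first NUL."""
    return buf.index(b"\x00")


plain = as_c_string("cantor")
accented = as_c_string("Gödel")

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
print(plain[:-1] == "cantor".encode("ascii"))

# Multi-byte UTF-8 sequences use only bytes >= 0x80, so the only NUL in
# the buffer is the terminator and strlen() stops in the right place.
print(c_strlen(accented) == len(accented) - 1)

# The length is a byte count, not a character count: the accented
# letter takes two bytes, so five characters occupy six bytes.
print(c_strlen(accented))

# Whoever receives the bytes can decode them back without loss, as
# long as they know the data is UTF-8.
print(accented[:c_strlen(accented)].decode("utf-8"))
```

One consequence worth keeping in mind is that byte-oriented code still
compares and copies such strings correctly, but anything that counts or
truncates characters is working with bytes.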

I'm inclined to expose the character encoding as a configurable system
setting, in theory permitting a Japanese user to select a Kanji-capable
encoding perhaps; if Xerces supports the encoding, it might work, sort of.
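
A configurable encoding along these lines might look like the following
Python sketch. Everything here is an assumption for illustration: the
function names are invented, Python's codec registry stands in for whatever
set of encodings Xerces supports, and falling back to UTF-8 for an unknown
name is just one possible policy.

```python
import codecs


def resolve_encoding(configured, default="utf-8"):
    """Return the configured encoding if the runtime supports it,
    otherwise fall back to the default (hypothetical policy)."""
    try:
        return codecs.lookup(configured).name
    except LookupError:
        return default


def export_attribute(value, configured):
    """Transcode an attribute value into the configured encoding,
    substituting '?' for characters it cannot represent."""
    encoding = resolve_encoding(configured)
    return value.encode(encoding, errors="replace")


# A site that leaves the setting at UTF-8 gets the value unchanged.
print(export_attribute("Gödel", "utf-8"))

# A site that selects a Japanese encoding gets bytes in that encoding.
print(export_attribute("日本", "shift_jis"))

# An unsupported name falls back to UTF-8 rather than failing.
print(resolve_encoding("no-such-encoding"))
```

The "sort of" caveat still applies: a value containing characters outside
the selected encoding is degraded by the substitution, so UTF-8 remains the
only setting that is lossless for every attribute.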

This may delay us a tiny bit, since the transcoding interface is pretty
primitive and I have to be careful with it, but since the Swiss need this, I
guess we owe it to them to try.

-- Scott




