grouper-users - Re: [grouper-users] xml-export: Inconsistent enconding in export file
Subject: Grouper Users - Open Discussion List
- From: "GW Brown, Information Systems and Computing"
- To: Loris Bennett, Grouper Users Mailing List
- Subject: Re: [grouper-users] xml-export: Inconsistent enconding in export file
- Date: Fri, 10 Oct 2008 08:01:03 +0100
Hi Loris,
That would explain it - my sources.xml has:
<?xml version="1.0" encoding="utf-8"?>
and so SourceManager tried unsuccessfully to read the character as UTF-8.
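To illustrate why a strict UTF-8 read fails here, the sketch below assumes the 'ä' was saved as a single byte (as it would be under ISO-8859-1 or Cp1252) in a file declared as UTF-8. The class and method names are made up for the example and are not part of Grouper; it uses the modern java.nio API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Grouper.
class StrictUtf8Check {

    /** Returns true if the bytes decode cleanly as UTF-8, false otherwise. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "Universit\u00e4t"; // \u00e4 is 'ä'

        // One byte (0xE4) for the umlaut: looks like the start of a
        // 3-byte UTF-8 sequence, but the next byte ('t') is not a
        // continuation byte, so strict decoding rejects it.
        byte[] singleByte = name.getBytes(StandardCharsets.ISO_8859_1);

        // Two bytes (0xC3 0xA4) for the umlaut: valid UTF-8.
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 bytes valid as UTF-8: " + isValidUtf8(singleByte));
        System.out.println("UTF-8 bytes valid as UTF-8: " + isValidUtf8(utf8));
    }
}
```

A byte 0xE4 followed by an ASCII letter is also consistent with the "Invalid byte 2 of 3-byte UTF-8 sequence" message reported for the export file.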
In the absence of an explicit encoding the system default would be used - the System property 'file.encoding' shows your default and can vary depending on the OS.
On my system I get:
file.encoding=Cp1252
An easy way to check is to add the following target to a build.xml:
<target name="enc">
<echo message="file.encoding=${file.encoding}"/>
</target>
Should you want to, you can override the default by setting the system property, i.e.
-Dfile.encoding=<encoding>
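If Ant is not to hand, the same check can be made from plain Java. This is only a minimal sketch; the class name is invented for the example. Note that on recent JDKs (18 and later) the default charset is UTF-8 unless overridden, so the result may differ from what a 2008-era JVM would report.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of Grouper.
class ShowDefaultEncoding {

    /** Builds a one-line summary of the JVM's default encoding settings. */
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run with e.g. -Dfile.encoding=UTF-8 to see the effect of the override.
        System.out.println(describe());
    }
}
```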
Gary
--On 10 October 2008 08:40 +0200 Loris Bennett wrote:
Hi Gary,
In my sources.xml I have 'ä'. In hex it is C3A4, so the file is UTF-8. It
is maybe somewhat surprising that this works, since there is no
<?xml ...?> declaration specifying the encoding.
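The hex check can be reproduced in a few lines of Java. This is a sketch with an invented class name; it compares the UTF-8 bytes of 'ä' with the single byte a Latin-1 style encoding would produce (Cp1252 uses the same byte, E4, for this character).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Grouper.
class UmlautBytes {

    /** Encodes s in the given charset and returns the bytes as upper-case hex. */
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4"; // 'ä', written as an escape to avoid source-encoding issues
        System.out.println("UTF-8:      " + hex(umlaut, StandardCharsets.UTF_8));      // C3A4
        System.out.println("ISO-8859-1: " + hex(umlaut, StandardCharsets.ISO_8859_1)); // E4
    }
}
```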
Are there any external dependencies which play a role in XML parsing
which could vary from platform to platform? I am running 64bit Debian
etch.
Anyway, thanks for correcting the problem.
Loris
On Thu, 2008-10-09 at 16:17 +0100, GW Brown, Information Systems and
Computing wrote:
Hi Loris,
I think I've fixed this now - in CVS. The code which exports group and
stem attributes was 'escaping' the output, but the source name was not
being escaped.
I modified:
private void _writeSubjectSourceMetaData(Source sa)
in XmlExporter so that name is escaped - id was escaped already.
this.xml.internal_puts("name=" + Quote.single(XML.escape(sa.getName())));
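For readers unfamiliar with what escaping buys here, the following is a rough sketch of the general technique, not Grouper's actual XML.escape (whose implementation is not shown in this thread). It assumes non-ASCII characters are written as numeric character references, which keeps the output pure ASCII and therefore independent of the file encoding. The class name is invented.

```java
// Hypothetical illustration of XML escaping; not Grouper's XML.escape.
class XmlEscapeSketch {

    /** Escapes XML markup characters and writes non-ASCII as numeric references. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '\'': out.append("&apos;"); break;
                case '"':  out.append("&quot;"); break;
                default:
                    if (cp >= 0x80) {
                        // Non-ASCII: emit a decimal character reference.
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.append((char) cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Freie Universit\u00e4t Berlin"));
    }
}
```

With the name escaped this way, the attribute no longer contains any byte whose meaning depends on the encoding used to write or read the file.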
Out of interest, what did you have in sources.xml - an escaped character
reference or a literal ä? When I tried a literal ä, SourceManager would
not parse sources.xml.
Gary
--On 11 September 2008 12:24 +0200 Loris Bennett wrote:
> Hi,
>
> An import of an export from Grouper gave me the following error:
>
> [java] [Fatal Error] export-cats-and-dogs.xml:145:30: Invalid byte
> 2 of 3-byte UTF-8 sequence.
>
> Looking at the XML, I see that different encoding is used for the
> umlaut in the 'name' attribute of the 'source' tag to that used for the
> contents of the 'description' tag.
>
> <source id='fub'
> name='Freie Universität Berlin'
>
> class='edu.internet2.middleware.subject.provider.JDBCSourceAdapter'
> >
> <subjectType name='person'/>
> </source>
> </subjectSourceMetaData>
> </metadata>
> <data>
>
> <!-- 'fub' -->
> <stem extension='fub'
> displayExtension='FU Berlin'
> name='fub'
> displayName='FU Berlin'
> id='148c85b9-9ed8-4721-b33b-65c71025f938'
> >
> <description>Freie Universität Berlin</description>
>
> Replacing the literal 'ä' in the 'name' attribute with the escaped form
> used in the description allows the import to succeed.
>
> Is this a known issue?
>
> Loris
>
> --
> Dr. Loris Bennett
> Computer Centre
> Freie Universität Berlin
> Berlin, Germany
>
----------------------
GW Brown, Information Systems and Computing