
grouper-users - Re: [grouper-users] Fetching Stems with Umlauts

Subject: Grouper Users - Open Discussion List

  • From: Mirko Tasler <>
  • To: Chris Hyzer <>,
  • Subject: Re: [grouper-users] Fetching Stems with Umlauts
  • Date: Mon, 14 Sep 2009 12:33:37 +0200


First of all, do you have a grouper client build after March 31st?

I tried my test case with the prebuilt grouperClient 1.4.2, same result. (Do I need to update the WS too?)

Btw., the UI displays the umlaut correctly...
<span class="browseStemsLocationHere">[...]umlautä</span>
...but it would be better to escape it (in this case to umlaut&auml; or umlaut&#228;).
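
A minimal sketch of that kind of escaping, assuming numeric character references are acceptable. The class and method names are hypothetical and not part of the Grouper UI; the sketch only handles non-ASCII characters and leaves markup characters such as < and & untouched.

```java
// Hypothetical helper, not Grouper code: replaces every non-ASCII
// character with a decimal numeric character reference.
final class HtmlEscapeSketch {

    static String escapeNonAscii(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int codePoint = input.codePointAt(i);
            if (codePoint > 127) {
                // e.g. U+00E4 becomes &#228;
                out.append("&#").append(codePoint).append(';');
            } else {
                out.appendCodePoint(codePoint);
            }
            // Advance by one or two chars so surrogate pairs stay intact.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e4 is written as an escape so the source file encoding
        // does not matter when compiling this sketch.
        System.out.println(escapeNonAscii("umlaut\u00e4"));
    }
}
```

For the stem extension above this yields umlaut&#228;, the second of the two forms mentioned.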

https://bugs.internet2.edu/jira/browse/GRP-263

Quite interesting since I had some trouble with Grouper WS, addMember and blanks, but the umlauts are probably not related to this (also I'm using gcStemSave and gcFindStems, not *Member).

I have a sample test java program to print out an umlaut: Test.java

Try this instead (save the file as UTF-8, you may also need to use "javac -encoding UTF-8"):

public class Test {
    public static void main(String[] args) {
        System.out.println((char) 228);
    }
}

Most Linux distros out there now use UTF-8, so they should be able to handle that snippet correctly. For me, the setting is (using Fedora 11, btw.):

[tasler@pc~]$ env | grep LANG
LANG=de_DE.UTF-8
GDM_LANG=de_DE.UTF-8

But I guess you would prefer:

LANG=en_US.UTF-8
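
To see what the JVM itself makes of such a setting, a small check like the following can help. This is only a sketch with a hypothetical class name: it prints the JVM default charset and the file.encoding system property, then prints the same character through a PrintStream that encodes as UTF-8 explicitly rather than relying on the default.

```java
import java.io.PrintStream;
import java.nio.charset.Charset;

// Hypothetical diagnostic, not part of the Grouper client.
final class DefaultCharsetSketch {

    // Reports which charset this JVM uses by default.
    static String describe() {
        return "default charset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe());
        // Encode the output explicitly as UTF-8 instead of using the default.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println((char) 228);
    }
}
```

If the terminal expects UTF-8, the second line of output should show the umlaut even when the default charset reported on the first line is something else.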

[...]

# this should probably be changed to UTF-8 for international charsets... for US it can be: ISO-8859-1
grouperClient.default.fileEncoding = UTF-8

While ISO-8859-1 contains the ä, I still tried UTF-8, no luck :(
(Shouldn't this default to UTF-8?)

Thanks,

Mirko


-----Original Message-----
From: Mirko Tasler [mailto:]
Sent: Wednesday, September 02, 2009 4:29 AM
To:
Subject: [grouper-users] Fetching Stems with Umlauts

Dear Grouper-Users,

here at Freie Universität Berlin, we are creating Stems with
umlauts using GrouperClient as in *1, with "test:umlautä". This
works fine, and a quick "SELECT display_name FROM grouper_stems
WHERE display_name LIKE 'test:%umlau%'" also returns the original
stemName. However, when I try to look up this Stem with
GrouperClient using *2, the Stem name found in the WsStem[] array
(fetched with found[0].getName()) has no umlaut but instead a char
with the value 65533, i.e. U+FFFD, the Unicode replacement
character ("0xefbfbd" in UTF-8). The response from the Web Service
at *3 seems correct. How do I fetch the correct name? Any help
would be greatly appreciated!

Thanks,

Mirko Tasler


*1 Code snippet saving Stem

    String stemName = "test:umlautä";
    WsStem wsStem = new WsStem();
    wsStem.setName(stemName);
    WsStemToSave wsStemToSave = new WsStemToSave();
    wsStemToSave.setWsStemLookup(new WsStemLookup(stemName, null));
    wsStemToSave.setWsStem(wsStem);
    gcStemSave.addStemToSave(wsStemToSave);
    gcStemSave.assignTxType(GcTransactionType.NONE).execute();

*2 Code snippet looking up Stem

    String parentName = "test";
    WsStemQueryFilter wsStemQueryFilter = new WsStemQueryFilter();
    wsStemQueryFilter.setParentStemName(parentName);
    wsStemQueryFilter.setStemQueryFilterType("FIND_BY_PARENT_STEM_NAME");
    wsStemQueryFilter.setParentStemNameScope("ALL_IN_SUBTREE");
    GcFindStems gcFindStems = new GcFindStems();
    WsFindStemsResults result =
        gcFindStems.assignStemQueryFilter(wsStemQueryFilter).execute();
    WsStem[] found = result.getStemResults();
    // last character of the first stem name found
    String name = found[0].getName();
    int c = name.charAt(name.length() - 1);
    // leads to c == 65533, should be 228

*3 Response snippet from the WS

    <stemResults>
      <WsStem>
        <extension>umlaut[0xe4]</extension>
        <displayExtension>umlaut[0xe4]</displayExtension>
        <description>test:umlaut[0xe4]</description>
        <displayName>test:umlaut[0xe4]</displayName>
        <name>test:umlaut[0xe4]</name>
        <uuid>2ad6c0e2-8f0b-4cdc-acb7-04e142e2588e</uuid>
      </WsStem>
    </stemResults>
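
One possible explanation, not confirmed anywhere in this thread: the [0xe4] in the response looks like a single byte, which is how ISO-8859-1 encodes ä, whereas UTF-8 would use two bytes (0xc3 0xa4). If the client decodes such a response as UTF-8, the lone 0xe4 byte is malformed and gets replaced by U+FFFD (65533). The sketch below reproduces only that decoding effect with plain JDK calls; the class name is hypothetical and no Grouper code is involved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration, not Grouper code.
final class ReplacementCharSketch {

    // Decodes the bytes with the given charset and returns the numeric
    // value of the last character of the resulting string.
    static int lastCharDecodedAs(byte[] bytes, Charset charset) {
        String decoded = new String(bytes, charset);
        return decoded.charAt(decoded.length() - 1);
    }

    public static void main(String[] args) {
        // "umlaut" followed by the single byte 0xE4 (ä in ISO-8859-1).
        byte[] raw = {'u', 'm', 'l', 'a', 'u', 't', (byte) 0xE4};

        // Decoded with the matching charset, the last char is 228.
        System.out.println(lastCharDecodedAs(raw, StandardCharsets.ISO_8859_1));

        // Decoded as UTF-8, the lone 0xE4 is malformed and becomes 65533.
        System.out.println(lastCharDecodedAs(raw, StandardCharsets.UTF_8));
    }
}
```

If this is indeed what happens, the mismatch would be between the encoding the WS uses for the response and the encoding the client assumes when reading it, rather than in the stored stem name.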

--
Mirko Tasler
ZE für Datenverarbeitung, Compute- & Medien Service
Freie Universität Berlin


