[Ace-users] [tao-users] client configuration: ensuring UTF8 encoding
mesnier_p at ociweb.com
Wed Nov 7 11:44:00 CST 2007
Thanks for the PRF.
It turns out that due to an amazing coincidence, just yesterday I found
out that TAO versions other than OCI TAO 1.4a do not process the UTF8
The problem is that TAO ships with a file,
ACE_wrappers/ace/Codeset_Registry_db.cpp, which carries the definitions of
recognized codeset values but happens to lack the definition of UTF8.
Since you are using an unsupported version of TAO, you need to generate
the update yourself: in ACE_wrappers/apps/mkcsregdb, create a text file
similar to the existing cs_test.txt file that contains all the codeset
definitions you need, including Latin1, UTF-8, UTF-16, Unicode, etc. You
can find all these in the large code_set_registry1.2g.txt file. You can
actually generate a new Codeset Registry db file using
code_set_registry1.2g.txt, but that will give you all known codesets,
which you probably don't need.
Anyway, run mkcsregdb, and rebuild ACE.
Vance Maverick wrote:
> I'm using TAO 1.5.4 as the client ORB, and JacORB 2.3.0 as the server, on an
> FC6 Linux box. I'd like to make sure UTF8 encoding is used for string
> transmission. This is the default for JacORB (now that I've upgraded) --
> executing the debugging entry point org.jacorb.orb.giop.CodeSet, I see
> System file encoding property: UTF-8
> Cannonical encoding: UTF8
> Default WChar encoding: UTF16
> (among other outputs, below). I can safely send and receive a test string
> with Japanese characters from my Java client (also using JacORB).
> My question is, how do I configure TAO to make sure it handles its end
"Handles" is a vague term. Since you are passing Japanese characters, I
suppose you want to use UTF-8 natively. There are two ways to do that, once
you've fixed ACE as I indicate above. You can set an option in
svc.conf, or you can hardwire it in the codeset manager class.
See http://ociweb.com/cnb/CORBANewsBrief-200209.html for more information.
> Right now, I'm passing "-ORBNegotiateCodesets 1" to CORBA::ORB_init. (And
> this did force me to link to libTAO_Codeset.)
This is redundant. Unless specially built, TAO defaults to negotiating
> However, this is not giving
> the desired result -- when I send a string with Japanese characters, my Java
> code on the server side doesn't receive the right decoded (UTF16) value.
UTF16 is a wchar codeset. Is your interface using strings or wstrings as
the argument type? If you are using string, then you will want to set TAO to
use UTF-8 natively as I mentioned above.
Finally, I expect to patch the Codeset Registry db file in the DOC
group's TAO repository. Once that is done, you should be able to obtain
a copy of that file, which will be compatible with your version of ACE,
if you don't want to mess with generating it yourself.
Principal Software Engineer, http://www.ociweb.com
Object Computing, Inc. +01.314.579.0066