[OSM-dev] osmosis utf-8

Martijn van Oosterhout kleptog at gmail.com
Tue Nov 6 10:14:12 GMT 2007


On 11/6/07, Tom Hughes <tom at compton.nu> wrote:
> > AIUI the data is simply doubly encoded in the DB. The JDBC driver is
> > doing the right thing by giving exactly what's in the database. I
> > don't think you're going to "trick" JDBC into working around a problem
> > like that.
>
> Just setting the connection character set to Latin-1 explicitly
> should work (it's what I do with mysqldump which defaults to using
> UTF-8) but it will only work for our broken server config and not
> for any sensibly set up databases.

My theory is that the JDBC driver sees that the connection is Latin-1
and converts the incoming stream back to Unicode. In Java all strings
are Unicode, so the JDBC drivers I know about automatically convert any
incoming stream as appropriate. They even go so far as to detect
whether the user is trying to change the encoding.

The end result is that you always get exactly what's in the DB, no
matter what the config is. This is usually what you want; it just
isn't here...
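
To make the double encoding concrete, here is a minimal sketch of what
such a string looks like once it reaches Java, and how it could be
undone after the fact. The class and method names are made up for
illustration, and it assumes the server's "Latin-1" behaves like plain
ISO-8859-1 (MySQL's latin1 is closer to cp1252, so some bytes may
differ in practice). It is not what osmosis does.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from osmosis or JDBC.
public class DoubleEncodingDemo {

    // Simulates the broken setup: the UTF-8 bytes of a string are
    // misread as if each byte were one Latin-1 character.
    static String doubleEncode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undoes the damage: turn each character back into a single
    // Latin-1 byte, then decode the resulting bytes as UTF-8.
    // Assumption: every character in s is in the range U+0000..U+00FF.
    static String repair(String s) {
        byte[] raw = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Z\u00fcrich";          // "Zürich", 6 characters
        String fromDb = doubleEncode(original);   // what the driver hands back
        System.out.println(fromDb.length());      // prints 7
        System.out.println(repair(fromDb).equals(original)); // prints true
    }
}
```

The point is that the driver is not at fault: it faithfully returns the
seven characters that are stored, so any repair has to happen as an
explicit step like repair() above.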

Have a nice day,
-- 
Martijn van Oosterhout <kleptog at gmail.com> http://svana.org/kleptog/



