[OSM-dev] Osmosis and non latin characters, please add a force option

Brett Henderson brett at bretth.com
Thu Sep 18 05:21:10 BST 2008


Joachim Zobel wrote:
> Hi Brett.
>
> Could you please make the parameters
> useUnicode=true&characterEncoding=UTF-8
> switchable with a --force-utf8 option? Not everybody can, or wants to,
> change the database charsets.
>
This shouldn't be too hard to add.

Before I do it though, have you tried this to make sure it does what you 
expect?
The com.bretth.osmosis.core.mysql.common.DatabaseContext class has those 
options commented out currently.  If you haven't already, can you please 
compile osmosis yourself with these options enabled to make sure it 
exhibits the behaviour you're looking for?
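
As a rough sketch of what such a switch might amount to, the snippet
below appends the two parameters to a MySQL JDBC URL only when a flag is
set. The class JdbcUrlSketch, the method buildUrl and the forceUtf8 flag
are hypothetical names chosen for illustration; this is not the actual
DatabaseContext code, and the host and database values are placeholders.

```java
// Hypothetical illustration only: not part of Osmosis.
final class JdbcUrlSketch {

    // Builds a MySQL JDBC URL. When forceUtf8 is true, the two
    // connection parameters from the request are appended.
    static String buildUrl(String host, String database, boolean forceUtf8) {
        StringBuilder url = new StringBuilder("jdbc:mysql://")
                .append(host)
                .append("/")
                .append(database);
        if (forceUtf8) {
            url.append("?useUnicode=true&characterEncoding=UTF-8");
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Placeholder host and database names.
        System.out.println(buildUrl("localhost", "osm", false));
        System.out.println(buildUrl("localhost", "osm", true));
    }
}
```

Whether the resulting connection actually returns the expected
characters depends on the database and table charsets, which is why
testing against a real database first is still worthwhile.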
> It may also be useful for some people to have a switchable
> useOldUTF8Behavior=true
> option (not sure about the spelling) to mimic the exact behaviour of
> the current production system.
>
I haven't used this option before.  What behaviour do you need here?  
Osmosis isn't using this option when running against the OSM production db.

The production osmosis uses a hack to work around the doubly encoded
UTF-8 data, but this is in the XML writing task, not the MySQL
changeset reading task.  It achieves this with a special Charset
implementation in the class
com.bretth.osmosis.core.xml.common.ProductionDbCharset.
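
To illustrate what "doubly encoded" means here, the sketch below shows
one common way such data arises and how it can be undone. It assumes the
UTF-8 bytes were misread as ISO-8859-1 somewhere along the way; the real
production data may involve a different single-byte charset. The class
DoubleEncodingSketch and its methods are hypothetical, and this is a
simplified round trip, not how ProductionDbCharset is implemented.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not the ProductionDbCharset code.
final class DoubleEncodingSketch {

    // Simulates the damage: UTF-8 bytes are misread as ISO-8859-1
    // (an assumption), so each byte becomes its own character.
    static String doubleEncode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the damage: map each character back to a single byte,
    // then decode those bytes as the UTF-8 they originally were.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Z\u00fcrich";          // "Zürich"
        String garbled = doubleEncode(original);  // two characters replace the u-umlaut
        System.out.println(garbled.length() + " chars after double encoding");
        System.out.println(repair(garbled).equals(original));
    }
}
```

The repair only works if every garbled character fits in one byte of
the assumed charset, so already-correct non-Latin text would be
destroyed by it rather than fixed.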

Brett




