[OSM-dev] osmosis utf-8

Brett Henderson brett at bretth.com
Thu Nov 8 12:24:03 GMT 2007


Martijn van Oosterhout wrote:
> On Nov 8, 2007 12:59 PM, Brett Henderson <brett at bretth.com> wrote:
>   
>> That lines up with what Tom was saying about MySQL using a
>> windows-1252-like encoding.  I'm feeling a little silly: I tried to find
>> the name of the 1252 encoding yesterday to try it out and came to the
>> conclusion that Java didn't support it.  I was wrong (not sure why I
>> didn't see it ...).  I might have fixed this sooner.
>>
>> Check out:
>> http://planet.openstreetmap.org/daily/test-cp1252.osc
>>     
>
> Looks good when I load it into my utf-8 editor. I'd say let it go over
> the entire planet dump and see how it goes. There are thousands of
> possible character sets, I just looked through the likely ones.
>
> Glad to help.
>   

Gah, I just dumped the entire day of the 7th.  The output is broken for 
this node:
http://www.openstreetmap.org/api/0.5/node/21683296/history

The dump is here:
http://planet.openstreetmap.org/daily/test-07-08.osc.gz

I guess Cp1252 isn't quite what MySQL uses after all, although it seems
like we're on the right track.  Perhaps I need to write my own encoding
...  I guess I need to find out what MySQL truly does use for latin1.
Anyway, bed time for now.
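
A rough sketch of where a stock Cp1252 decoder can fall short, in Python
rather than osmosis's Java.  The standard cp1252 codec leaves five byte
values undefined; as far as the MySQL documentation describes it, MySQL's
latin1 is cp1252 with those bytes assigned to the same-numbered code points.
Whether that is what breaks the node above is an assumption, and
decode_mysql_latin1 is a hypothetical helper name, not part of osmosis.

```python
def undefined_in_cp1252():
    """Return the byte values that Python's cp1252 codec refuses to decode."""
    bad = []
    for b in range(256):
        try:
            bytes([b]).decode("cp1252")
        except UnicodeDecodeError:
            bad.append(b)
    return bad


def decode_mysql_latin1(data):
    """Hypothetical helper: decode bytes as cp1252, but map any byte that
    cp1252 leaves undefined to the code point with the same number
    (assumed here to match MySQL's latin1 behaviour)."""
    out = []
    for b in data:
        try:
            out.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            out.append(chr(b))  # e.g. 0x81 -> U+0081
    return "".join(out)
```

For example, the UTF-8 bytes C3 81 cannot be decoded by plain cp1252
because 0x81 is undefined there, while decode_mysql_latin1(b"\xc3\x81")
yields the two characters U+00C3 and U+0081 without raising.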

Hmm, I'd better back out my change or the next daily dump will be broken ...




