[OSM-talk] Re: [OSM-dev] planet.osm - fix

Joerg Ostertag (OSM Munich/Germany) openstreetmap at ostertag.name
Wed Aug 16 00:48:16 BST 2006


> Are you sure the fault is actually in 
> the database 

yes

> and not only in the planet dump? 

Not only in the dump; it is in the database as well.

> To me it seems like there are two characters (ρ and Ν) in that
> string that are broken, the rest looks like a perfect Greek alphabet

Yes, but this node is not only grammatically wrong, it is semantically wrong 
too. An OSM node is, in my view, the wrong place for a complete Greek 
alphabet. name=... is meant to hold object names such as street names, place 
names, ...

> as far as I can see. And thus only the broken character should be
> fixed. However I do not know if the problem is in some editing
> tool, the database or in the export script. 

I assume all of those.

> Then there are 4 or 500 more such examples in the planet dump.

And they were introduced within the last month, since the previous planet.osm 
dump was readable by all OSM tools. So we urgently need to fix all of these 
broken names, so that every tool/renderer/... which builds upon planet.osm 
will work again. I would really like to see the next planet.osm (probably this 
Thursday) work again. I think this is really important, since the planet.osm 
file is currently not usable by any of the tools we have for OSM. 
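To find the broken names in a dump, something like the following might be a 
starting point. It is only a minimal sketch in Python, assuming the dump is 
read as raw bytes, one line at a time; the function name and the sample lines 
are made up for illustration and are not taken from planet.osm.

```python
def find_invalid_utf8_lines(lines):
    """Return (line_number, raw_bytes) for every line that is not valid UTF-8.

    `lines` is any iterable of bytes objects, e.g. a file opened in "rb" mode.
    Splitting on newlines is safe because no multi-byte UTF-8 sequence
    contains the byte 0x0A.
    """
    bad = []
    for number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")  # strict decoding raises on malformed bytes
        except UnicodeDecodeError:
            bad.append((number, raw))
    return bad


# Made-up sample lines, for illustration only.
sample = [
    b'<tag k="name" v="Stra\xc3\x9fe"/>\n',  # valid UTF-8 encoding of "ß"
    b'<tag k="name" v="Stra\xdfe"/>\n',      # Latin-1 byte, malformed as UTF-8
]

for number, raw in find_invalid_utf8_lines(sample):
    print(number, raw)
```

For a real dump one would pass the file object itself, e.g. 
open("planet.osm", "rb"), instead of the sample list.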

To really fix it for the future we'll have to do more. 
For the following I'll just presume that we will use UTF-8 in all future 
OSM releases.
If so, I think we need to provide/work out the following patches for OSM:
 Server: 
      - change all database tables to UTF-8 encoding
      - only allow inserting new nodes if their text is valid UTF-8
 API:
      - read UTF-8 characters and store them as such in the db
      - write UTF-8 characters to clients
 Apache:
      - make sure Apache does not change the encoding
 Clients:
      - encode everything as UTF-8
      - display UTF-8 characters
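
The "only allow valid UTF-8" and "store and write unchanged" points could look 
roughly like this. Again only a sketch in Python under my own assumptions: 
both function names are hypothetical and this is not the actual OSM server or 
API code.

```python
def accept_tag_value(raw: bytes) -> str:
    """Decode an incoming tag value, rejecting anything that is not valid UTF-8."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as err:
        # Refuse the insert instead of storing bytes nobody can read later.
        raise ValueError(f"tag value is not valid UTF-8: {err}") from err


def emit_tag_value(text: str) -> bytes:
    """Encode a stored value as UTF-8 for the client, without any other change."""
    return text.encode("utf-8")


# A valid value survives the round trip byte for byte.
incoming = "Jörg ρΝ".encode("utf-8")
stored = accept_tag_value(incoming)
outgoing = emit_tag_value(stored)
```

The point of the strict decode is that a broken value is refused at insert 
time, so it can never reach the database or the next planet dump.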

-- 
Jörg (Germany, Munich)

http://www.ostertag.name/
TeamSpeak2: ts2.ostertag.name, user: tweety, Channel: "GPS Drive"
irc://irc.oftc.net/#osm



