[OSM-talk] [OSM-dev] Latest planet.osm contains incorrect data

Christopher Schmidt crschmidt at metacarta.com
Thu Dec 7 00:12:42 GMT 2006


On Thu, Dec 07, 2006 at 12:04:14AM +0100, Raphaël Jacquot wrote:
> Ralf Zimmermann wrote:
> > Hi Guys,
> > 
> > I was away a few days. I now checked the latest planet.osm from 05-Dec. 
> > It seems to have encoding issues again. For example "München" (Munich) 
> > is not correct anymore. In the planet.osm from 28-Nov, this entry was 
> > still ok. The modification date of the node is the same, so in terms of 
> > encoding I guess the export script is broken again!
> > 
> > I checked a few streets with German Umlauts in the street name and they 
> > all are corrupted as well!
> > 
> > Were there any changes to the export script for planet.osm? If so, 
> > please check it for encoding issues.
> > 
> > Best regards,
> > 
> > Ralf
> > Munich/Germany
> 
> as I've been saying for a while, it appears the DB contains UTF8 and 
> Latin1 data mixed together in the tags fields.

Right. The data which is now 'wrong' in the dump is actually stored
correctly (as utf-8) in the db. Data which is right in the dump is stored
in the db as iso-8859-1, but the current planet exporter converts
*everything* from latin-1 to utf-8... even if it was already utf-8. (oops.)
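
To illustrate the failure mode, here is a minimal Python sketch. It is
not the actual planet export script; the function names naive_export and
to_utf8 are hypothetical, and it assumes tag values are available as raw
bytes. It shows how converting already-valid utf-8 from latin-1 mangles
it, and one possible guard: convert only values that fail to decode as
utf-8.

```python
def naive_export(raw: bytes) -> bytes:
    """Treat every value as latin-1 and re-encode it as utf-8.

    Mirrors the behaviour described above (hypothetical name).
    """
    return raw.decode("iso-8859-1").encode("utf-8")


def to_utf8(raw: bytes) -> bytes:
    """Return raw unchanged if it is valid utf-8, else convert from latin-1.

    Heuristic only: a latin-1 byte sequence that happens to be valid
    utf-8 would be passed through unconverted.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1").encode("utf-8")


name = "M\u00fcnchen"                      # "München"
utf8_value = name.encode("utf-8")          # stored correctly in the db
latin1_value = name.encode("iso-8859-1")   # stored as latin-1 in the db

# The naive exporter double-encodes the value that was already utf-8...
broken = naive_export(utf8_value).decode("utf-8")
# ...but produces the right result for the latin-1 value.
fixed_by_luck = naive_export(latin1_value).decode("utf-8")

# The guarded version yields the same utf-8 bytes for both inputs.
safe_a = to_utf8(utf8_value)
safe_b = to_utf8(latin1_value)
```

The guard works for names like this one because a lone 0xFC byte is
never valid utf-8, so latin-1 values with umlauts are detected reliably.
It is still a guess for mixed data in general, so cleaning up the db
itself would be the more robust fix.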

Regards,
-- 
Christopher Schmidt
MetaCarta



