[OSM-dev] Latest planet.osm contains incorrect data

SteveC steve at asklater.com
Thu Dec 7 11:08:12 GMT 2006


* @ 07/12/06 09:34:29 AM pere at hungry.com wrote:
> 
> [Christopher Schmidt]
> > Right. The data which is now 'wrong' in the dump is actually correct in
> > the db. Data which is right in the dump is in the db as iso-8859-1, but
> > the current planet exporter converts *everything* from latin-1 to
> > utf-8... even if it was already right. (oops.)
> 
> I guess the intended behavior is to store UTF-8 in the database and
> export UTF-8 in the dump.  For that to work properly all the database
> entries with ISO-8859-1 will need to be converted.  It should be

Actually no, because the API serves the data correctly. MySQL can and
does convert between character sets on the fly and returns correct UTF-8
through the API. The puzzle is why (using exactly the same code on a
different machine) it doesn't in the planet dump. I have yet to check
that the mysql client versions are the same etc.; I will now. (This was
before the explicit set utf8 call patch to the planet dumper.)
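
To illustrate the exporter problem described in the quoted text, here is
a minimal Python sketch, not the actual planet dumper code. The function
names and the sample place name are made up for the example. It shows
what happens when a value that is already UTF-8 is converted from
latin-1 again, and one possible guard, assuming that any value which
decodes cleanly as UTF-8 really is UTF-8 (a heuristic, since a few
latin-1 byte sequences are also valid UTF-8).

```python
def to_utf8_blindly(raw: bytes) -> bytes:
    """Treat every value as latin-1 and re-encode it as UTF-8.

    Hypothetical stand-in for the 'convert everything' behaviour
    described in the thread.
    """
    return raw.decode("latin-1").encode("utf-8")


def to_utf8_guarded(raw: bytes) -> bytes:
    """Leave valid UTF-8 untouched; convert anything else from latin-1.

    Assumption: a value that decodes as UTF-8 is already UTF-8.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode("latin-1").encode("utf-8")


name = "Tromsø"                            # sample value, not from the db
stored_as_utf8 = name.encode("utf-8")      # row that is already correct
stored_as_latin1 = name.encode("latin-1")  # row still in iso-8859-1

# Blind conversion repairs the latin-1 row but double-encodes the other.
print(to_utf8_blindly(stored_as_latin1).decode("utf-8"))
print(to_utf8_blindly(stored_as_utf8).decode("utf-8"))

# Guarded conversion yields the same UTF-8 bytes for both rows.
print(to_utf8_guarded(stored_as_latin1) == to_utf8_guarded(stored_as_utf8))
```

The guard is only a stopgap for a mixed table; it does not explain why
the same code behaves differently on the two machines.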

I haven't checked yet whether an ALTER TABLE statement changing a column
from one charset to another will magically do the right thing. If it
does then that would be the easiest solution.
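
As a rough model of what is at stake with the ALTER TABLE idea, the
Python sketch below simulates two things a charset change could do to
the stored bytes: convert them, or only relabel the column. As far as I
know, MySQL converts the stored values when a text column's character
set is changed directly, and relabelling without conversion requires
going through a binary column type first, but that should be checked
against the MySQL version in use. The helper names are hypothetical and
no database is involved.

```python
def convert_column(stored: bytes, old: str = "latin-1", new: str = "utf-8") -> bytes:
    """Model a charset change that converts the stored value."""
    return stored.decode(old).encode(new)


def relabel_column(stored: bytes) -> bytes:
    """Model a charset change that keeps the bytes and only changes the label."""
    return stored


# Two rows in a column declared as latin-1 (sample value, not from the db).
genuine_latin1 = "Århus".encode("latin-1")   # bytes match the declared charset
mislabelled_utf8 = "Århus".encode("utf-8")   # UTF-8 bytes in a latin-1 column

# Converting is right for the genuine row and wrong for the mislabelled one.
print(convert_column(genuine_latin1).decode("utf-8"))
print(convert_column(mislabelled_utf8).decode("utf-8") == "Århus")

# Relabelling is right for the mislabelled row only.
print(relabel_column(mislabelled_utf8).decode("utf-8"))
```

If the column holds a mix of both kinds of rows, neither operation alone
fixes every row in this model.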

have fun,

SteveC steve at asklater.com http://www.asklater.com/steve/



