[OSM-dev] Mixed character encoding in planet.osm - plan for fixing it

Jonas Svensson jonas at mozoft.com
Wed Nov 8 11:34:53 GMT 2006

On Wed, 8 Nov 2006, raphael Jacquot wrote:

> Well... if you have mostly correct UTF-8 in the planet dump, except for a few 
> hundred entries, as the utf8sanitizer shows, then the script generating the 
> file is probably working correctly and some entries in the database must 
> be wrong. It's as simple as that...
> These, for instance, are good examples:
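
The reasoning quoted above (a dump that is mostly valid UTF-8 with a few
hundred bad entries) can be checked line by line. The following is only a
minimal sketch, not the utf8sanitizer mentioned in the quote: the function
name and the two sample node lines are hypothetical, made up for
illustration, and it assumes one entry per line.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines that are not valid UTF-8."""
    bad = []
    # Splitting on b"\n" is safe: byte 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for lineno, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")  # strict decoding raises on invalid bytes
        except UnicodeDecodeError:
            bad.append((lineno, raw))
    return bad


# Hypothetical sample data, not taken from a real planet dump.
sample = (
    b'<node id="1"><tag k="name" v="Malm\xc3\xb6"/></node>\n'  # UTF-8 encoded o-umlaut
    b'<node id="2"><tag k="name" v="Malm\xf6"/></node>\n'      # Latin-1 o-umlaut, invalid as UTF-8
)

print([lineno for lineno, _ in find_invalid_utf8_lines(sample)])  # prints [2]
```

If only a small fraction of lines is reported, that would fit the
explanation that the dump script is fine and individual database entries
were stored in another encoding.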

Yes, you are right. Thank you for pointing them out. So now we know that 
some names are good UTF-8 in the planet dump and some are broken. However, 
I have still not seen any broken UTF-8 when retrieving nodes via the REST 


