[OSM-dev] UTF-8 errors in our DB, or elsewhere?

Grant Slater openstreetmap at firefishy.com
Tue Feb 12 02:27:01 GMT 2008

Frederik Ramm wrote:
> in the course of producing shapefiles, I applied the libxml2 built-in
> character set conversion from UTF-8 to Latin-1 to our tag values, and
> found a lot of problems (about 20k nodes/ways) where it complained.

Converting UTF-8 to Latin-1 is lossy: any character outside Latin-1 has
no representation there. That is likely what caused most of the errors.
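
To illustrate the lossy conversion, here is a minimal Python sketch (not
the libxml2 code path Frederik used). The helper name latin1_convertible
and the sample values are made up for illustration. It shows that
perfectly valid Unicode text can still fail a conversion to Latin-1:

```python
def latin1_convertible(text):
    """Return True if every character of text exists in Latin-1 (ISO-8859-1)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        # The text is valid Unicode, but Latin-1 has no code point for
        # at least one of its characters.
        return False


# Hypothetical tag values, not taken from the OSM database.
samples = ["Café", "Łódź", "Москва"]
for value in samples:
    print(value, latin1_convertible(value))
```

Only the first value converts; the other two are valid UTF-8 data that a
Latin-1 converter has to reject or mangle.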

> Here's a list of objects that libxml2 complained about (not complete
> as I didn't process a full planet):
> http://www.remote.org/frederik/tmp/utf8.txt

On a quick look at this document in Firefox, with the character encoding
forced to UTF-8 (View->Character Encoding->UTF 8), most of the characters
seem to display as expected.
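
A less eyeball-based check would be to decode the raw bytes strictly as
UTF-8 and see which lines fail. A rough Python sketch, assuming the data
is available as bytes; the function name invalid_utf8_lines and the
sample data are hypothetical:

```python
def invalid_utf8_lines(raw):
    """Return 1-based numbers of the lines in raw bytes that are not valid UTF-8."""
    bad = []
    # Splitting on b"\n" is safe: byte 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(raw.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Line 1 is proper UTF-8; line 2 contains a bare Latin-1 byte (0xE9).
data = "name=Łódź\n".encode("utf-8") + b"name=Caf\xe9\n"
print(invalid_utf8_lines(data))
```

Lines that decode cleanly here but still upset a Latin-1 converter are
not broken UTF-8, just not representable in Latin-1.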

/ Grant
