[OSM-talk] invalid UTF8 characters redux

David Earl david at frankieandshadow.com
Fri Jul 6 15:49:53 BST 2007


Some months ago I corrected all the tag values which had invalid UTF8 
characters in them. I'm pleased to see that in processing the planet 
file every week since then, no more have appeared.

This may just be luck. On the other hand, the Rails port happened at about 
the same time, and I'm wondering whether the API now rejects invalid UTF8 
in uploaded XML - most XML parsers seem to.
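
For anyone who wants to see what "rejecting" means here, the following is a minimal sketch in Python, not the code used by the API or by the 'sanitize' program. It assumes Python's standard expat-based parser is representative of "most XML parsers"; the function names and the sample tag values are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def parser_accepts(xml_bytes: bytes) -> bool:
    """Return True if Python's standard XML parser accepts the document."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError:
        return False
    return True


# Hypothetical tag values, not taken from the planet file.
good = '<tag k="name" v="café"/>'.encode("utf-8")
bad = b'<tag k="name" v="caf\xe9"/>'  # lone Latin-1 byte, not valid UTF-8

for label, doc in (("good", good), ("bad", bad)):
    print(label, "valid UTF8:", is_valid_utf8(doc),
          "parser accepts:", parser_accepts(doc))
```

If the parser behind the API behaves the same way, a document like the second one would fail at upload and never reach the database.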

If this is the case, we can dispense with the 'sanitize' program for 
removing bad UTF8.

Can anyone confirm this, or say otherwise?

David



