[OSM-talk] invalid UTF8 characters redux

Tom Hughes tom at compton.nu
Fri Jul 6 16:17:09 BST 2007

In message <468E5691.8080309 at frankieandshadow.com>
          David Earl <david at frankieandshadow.com> wrote:

> Some months ago I corrected all the tag values which had invalid UTF8
> characters in them. I'm pleased to see that in processing the planet
> file every week since then, no more have appeared.
> This may just be luck. On the other hand the Rails port happened about
> the same time, and I'm wondering if the API would actually reject
> invalid UTF8 in the uploaded XML - most XML parsers seem to.
> If this is the case, we can dispense with the 'sanitize' program for
> removing bad UTF8.
> Can anyone confirm or otherwise?

No idea, but it is quite possible.

Do you have an example of something that should fail that I can test
against the API?
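
For illustration only, here is a minimal sketch of the kind of test case being asked for, written in Python. The payload, the tag key and value, and the helper names (is_valid_utf8, parser_accepts) are all made up for this example and are not taken from the API or the planet file. It only shows that a strict UTF-8 decoder and an expat-based XML parser reject a stray Latin-1 byte; it says nothing about how the API itself behaves.

```python
import xml.etree.ElementTree as ET

# Hypothetical OSM-style payload; element and tag values are assumptions.
# 0xE9 is "e acute" in Latin-1, but in UTF-8 it opens a three-byte sequence,
# so following it directly with an ASCII quote makes the stream invalid UTF-8.
BAD_XML = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<osm><node id="1" lat="0" lon="0">'
    b'<tag k="name" v="Caf\xe9"/>'
    b'</node></osm>'
)

# Same document with the character correctly encoded as UTF-8 (0xC3 0xA9).
GOOD_XML = BAD_XML.replace(b"\xe9", "\u00e9".encode("utf-8"))


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def parser_accepts(data: bytes) -> bool:
    """Return True if ElementTree (expat) parses the document without error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for label, doc in (("bad", BAD_XML), ("good", GOOD_XML)):
        print(label, "utf8 ok:", is_valid_utf8(doc),
              "parser ok:", parser_accepts(doc))
```

Uploading something like BAD_XML and checking the response would show whether the server rejects it or stores the bad byte.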


Tom Hughes (tom at compton.nu)
