[OSM-talk] Planet Dump

Jonas Svensson jonass at lysator.liu.se
Fri Mar 9 11:40:39 GMT 2007

On Fri, 9 Mar 2007, Keith Sharp wrote:

> What impact do error reports from UTF8Sanitize have on the output XML
> file?  With the latest planet.osm I am getting:

> Is it safe to continue or do I need to investigate these errors further?
> Keith.

Broken UTF-8 characters in the input are replaced by a "_" in the output,
so the output should be safe to feed to other tools that are stricter
about XML. However, do not forget that the content has been changed.
Usually the effect is that a street name or two somewhere in the world
is altered but is still recognizable to those who know the area.
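
To illustrate the kind of replacement described above, here is a minimal
Python sketch. It is not the UTF8Sanitize code itself; the handler name
"underscore" and the function sanitize() are made up for this example,
and the real tool may decide differently how many bytes each "_" covers.

```python
import codecs


def _underscore_handler(exc):
    # Hypothetical error handler: only decoding errors are handled,
    # anything else is re-raised unchanged.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    # Emit one "_" for the invalid byte range and resume right after it.
    return ("_", exc.end)


# Register the handler under an illustrative name (an assumption, not
# something the real tool defines).
codecs.register_error("underscore", _underscore_handler)


def sanitize(raw: bytes) -> str:
    """Decode raw as UTF-8, substituting "_" for invalid byte sequences."""
    return raw.decode("utf-8", errors="underscore")
```

For example, sanitize(b"Stra\xdfe") (a Latin-1 encoded "Straße") gives
"Stra_e": the name is changed, but someone who knows the area can still
guess what it was. Valid UTF-8 input passes through untouched.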

I try to upload the report to
<http://wiki.openstreetmap.org/index.php/Utf8_errors>, where you will
find links to tools for fixing the errors (if you know the correct name).


