[OSM-dev] Problem with 2010/03/10 planet...

Frederik Ramm frederik at remote.org
Fri Mar 12 09:55:12 GMT 2010


Hi,

Jon Burgess wrote:
> It looks like this was caused by a change made by Frederick back in
> r19176. The planet dump code used to turn all characters less than 32
> into '?' instead of creating these character sequences. I guess he
> didn't read the bit of the XML spec which says that all characters <32
> are invalid except for tab / newline / carriage return[1]

Probably right, I didn't read *any* of the XML spec ;-) I cannot 
remember why I made that change, I guess there must have been some 
reason but maybe it was a mistake altogether. I am sorry for the 
inconvenience.

If you have an uncompressed version of the planet file, the XML bugs can 
be fixed using the following three incantations:

echo -n '          v="'|dd bs=10 seek=6126462692 conv=notrunc of=planet.osm
echo -n '     v="'|dd bs=10 seek=6533550293 conv=notrunc of=planet.osm
echo -n '          "backw' | dd bs=10 seek=13047657759 conv=notrunc of=planet.osm

Make sure to place the name of your planet file in the of= parameter and 
make sure that the number of spaces is exactly as written above.
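
To see what these commands do before running them on the real file, here is a small sketch on a scratch file. The file contents and the replacement text 'XYZ' are made up for illustration; only the dd options match the commands above. With bs=10, the seek value counts 10-byte blocks, so seek=6126462692 starts writing at byte offset 61264626920, and conv=notrunc leaves the rest of the file untouched.

```shell
# Scratch file standing in for planet.osm (30 bytes, no trailing newline).
scratch=$(mktemp)
printf '0123456789abcdefghijKLMNOPQRST' > "$scratch"

# bs=10 seek=1 skips one 10-byte block, so writing starts at byte offset 10.
# conv=notrunc keeps everything after the written bytes in place.
printf 'XYZ' | dd bs=10 seek=1 conv=notrunc of="$scratch" 2>/dev/null

result=$(cat "$scratch")
rm -f "$scratch"
echo "$result"   # 0123456789XYZdefghijKLMNOPQRST
```

Only the three bytes at offsets 10 to 12 change and the file length stays the same, which is why the replacement strings above have to be exactly as long as the text they overwrite.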

If you are streaming the file, then you could use sed to remove any 
occurrence of the pattern "&#2.;" (a regular expression in which the dot 
matches any single character), or use grep -v to remove all lines containing

Meycauayan City Northbound Entry Point

and

<member type="node" ref="494163268" role="backward_stop"/>
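
A minimal sketch of the sed variant, in case it helps. The function name strip_bad_refs and the sample line are made up for illustration, and how you feed the planet into it (bzcat or otherwise) depends on your setup.

```shell
# Hypothetical helper: remove every match of the pattern &#2.; from stdin,
# i.e. character references such as &#20; or &#25;, and pass the rest through.
strip_bad_refs() {
    sed 's/&#2.;//g'
}

# Made-up sample line containing one such reference.
printf '%s\n' '<tag k="name" v="Entry&#25; Point"/>' | strip_bad_refs
# prints: <tag k="name" v="Entry Point"/>
```

In a pipeline this would sit between the decompressor and whatever consumes the XML. Unlike grep -v, it keeps the affected lines and drops only the invalid references.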

Bye
Frederik



