[OSM-dev] UTF8 problem with last night's daily .osc

Frederik Ramm frederik at remote.org
Fri Aug 29 16:52:21 BST 2008


Frederik Ramm wrote:
> Closer inspection reveals that this is a tag value that has been 
> truncated at character #255, which happens to be in the MIDST of an 
> UTF-8 sequence. Ouch! Who truncates tags to 255 characters?

It's a bit embarrassing to keep talking to myself here, but in case anyone 
else is interested:

The culprit is way #26604650, which was newly created with Potlatch 
0.10b. Its tag value was apparently truncated in the middle of a UTF-8 
sequence, which makes any Osmosis processing of the resulting diff files 
(and probably also planet files?) impossible, because the parser aborts.
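
For anyone who wants to see the effect in isolation, here is a minimal 
Python sketch. It assumes the truncation was done on bytes rather than on 
characters, which I have not confirmed; the function names and the sample 
tag value are made up for illustration and are not taken from Potlatch or 
the API code.

```python
def truncate_bytes_naive(value: str, limit: int = 255) -> bytes:
    """Cut the UTF-8 encoding at a fixed byte count, ignoring character boundaries."""
    return value.encode("utf-8")[:limit]


def truncate_bytes_safe(value: str, limit: int = 255) -> bytes:
    """Cut at or before `limit` bytes without splitting a multi-byte character."""
    encoded = value.encode("utf-8")
    if len(encoded) <= limit:
        return encoded
    # The input was valid UTF-8, so the only invalid part after slicing is an
    # incomplete sequence at the very end; errors="ignore" drops exactly that.
    return encoded[:limit].decode("utf-8", errors="ignore").encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical tag value: 254 ASCII bytes followed by a two-byte character,
# so the cut at byte 255 lands between the two bytes of "ü".
tag_value = "a" * 254 + "ü"
```

With this sample value, the naive cut leaves a dangling lead byte that a 
strict parser will reject, while the safe variant gives up the whole 
character and stays valid.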

It also seems to create intermittent problems with the API. Just now 
(check email headers for exact time) about one in five requests of

$ wget -O- http://www.openstreetmap.org/api/0.5/way/26604650/history

fails with an "internal server error", while the same number of requests 
against randomly selected other ways work fine all the time. Is this 
strange or what?
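
To put a number on "one in five", the request can be repeated in a loop 
and the server errors counted. The following is only a sketch of how that 
might be scripted in Python; the helper names are hypothetical, and the 
fetch function is passed in as a parameter so the counting logic can be 
tried without touching the server.

```python
import urllib.error
import urllib.request
from typing import Callable

URL = "http://www.openstreetmap.org/api/0.5/way/26604650/history"


def fetch_status(url: str) -> int:
    """Return the HTTP status code for a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return err.code


def count_failures(
    url: str, attempts: int, fetch: Callable[[str], int] = fetch_status
) -> int:
    """Request `url` `attempts` times and count responses with a 5xx status."""
    return sum(1 for _ in range(attempts) if 500 <= fetch(url) < 600)
```

Calling count_failures(URL, 20) and then doing the same for a handful of 
other way IDs would give a rough comparison. Connection-level failures 
(urllib.error.URLError) are not caught here and would abort the run.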


