[OSM-dev] UTF8 problem with last night's daily .osc
frederik at remote.org
Fri Aug 29 16:52:21 BST 2008
Frederik Ramm wrote:
> Closer inspection reveals that this is a tag value that has been
> truncated at character #255, which happens to be in the MIDST of an
> UTF-8 sequence. Ouch! Who truncates tags to 255 characters?
It's a bit embarrassing to keep talking to myself here, but in case anyone
else is interested:
The culprit is way #26604650 which was newly created with Potlatch
0.10b, apparently with the tag value being truncated in the middle of an
UTF-8 sequence, which makes any Osmosis processing of the resulting diff
files (and probably also planet files?) impossible (the parser aborts).
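For illustration, here is a minimal Python sketch of a truncation that cannot end inside a multi-byte sequence. It assumes the 255 limit is applied to the UTF-8 encoded bytes and that the input is valid UTF-8 to begin with. The function name truncate_utf8 is made up for this example; this is not taken from Potlatch or the API code.

```python
def truncate_utf8(data: bytes, limit: int = 255) -> bytes:
    """Cut data to at most `limit` bytes without splitting a UTF-8 sequence.

    Hypothetical helper; assumes `data` is valid UTF-8.
    """
    if len(data) <= limit:
        return data
    end = limit
    # data[end] is the first byte that would be dropped. Continuation bytes
    # have the form 0b10xxxxxx; if the first dropped byte is one of those,
    # the cut falls inside a character, so back up to that character's start.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


if __name__ == "__main__":
    # 254 ASCII bytes followed by a two-byte character: a naive cut at
    # byte 255 would keep only the first half of the last character.
    value = ("a" * 254 + "\u00e9").encode("utf-8")
    print(len(truncate_utf8(value)))
```

A plain value[:255] on the same input leaves a dangling lead byte at the end, which is exactly the kind of string a strict UTF-8 parser refuses to read.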
It also seems to create intermittent problems with the API. Just now
(check email headers for exact time) about one in five requests of
$ wget -O- http://www.openstreetmap.org/api/0.5/way/26604650/history
fails with an "internal server error", while the same number of requests
against randomly selected other ways work fine all the time. Is this
strange or what?