[OSM-dev] UTF8 problem with last night's daily .osc

Brett Henderson brett at bretth.com
Sat Aug 30 08:23:43 BST 2008


Frederik, I just saw your emails, but I'm about to head out and can't 
look at it now. I'll try to take a look tomorrow, but it sounds like 
you've already diagnosed the problem.

If you can organise for the problematic way to be deleted or fixed in 
the database, I'll re-create all changesets since that point (I just have 
to modify the timestamp file and kick off the extract again).  A DB 
update will presumably require TomH; I don't have write access (and 
would rather not touch the database even if I did).
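
For anyone following the thread, here is a minimal Python sketch of the 
failure Frederik describes below. It assumes the 255 limit is applied to 
the UTF-8 bytes rather than to characters; I have not confirmed how 
Potlatch or the API actually truncates. The function names and the sample 
tag value are made up for illustration.

```python
def truncate_utf8(value: str, max_bytes: int = 255) -> str:
    """Shorten value so its UTF-8 encoding fits in max_bytes
    without splitting a multi-byte character."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        return value
    # The input was valid UTF-8, so the only invalid part after the byte
    # cut is an incomplete trailing sequence; errors="ignore" drops it.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical tag value: 254 ASCII bytes followed by a two-byte character.
tag_value = "a" * 254 + "\u00e9"

# Naive byte truncation cuts the final character (0xC3 0xA9) in half.
naive = tag_value.encode("utf-8")[:255]

# Boundary-aware truncation drops the whole character instead.
safe = truncate_utf8(tag_value, 255)
```

Here `is_valid_utf8(naive)` is false, which is the kind of input a strict 
parser will refuse, while `safe` comes out one character shorter but 
still decodes cleanly.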

Frederik Ramm wrote:
> Hi,
>
> Frederik Ramm wrote:
>   
>> Closer inspection reveals that this is a tag value that has been 
>> truncated at character #255, which happens to be in the MIDST of a 
>> UTF-8 sequence. Ouch! Who truncates tags to 255 characters?
>>     
>
> It's a bit embarrassing to keep talking to myself here, but in case anyone 
> else is interested:
>
> The culprit is way #26604650, which was newly created with Potlatch 
> 0.10b, apparently with the tag value being truncated in the middle of a 
> UTF-8 sequence. This makes any Osmosis processing of the resulting diff 
> files (and probably also planet files?) impossible (the parser aborts).
>
> It also seems to create intermittent problems with the API. Just now 
> (check email headers for exact time) about one in five requests of
>
> $ wget -O- http://www.openstreetmap.org/api/0.5/way/26604650/history
>
> fails with an "internal server error", while the same number of requests 
> against randomly selected other ways work fine all the time. Is this 
> strange or what?
>
> Bye
> Frederik