[OSM-dev] Empty Keys, possible planet.c problem?

Jon Burgess jburgess777 at googlemail.com
Sun Sep 30 16:34:25 BST 2007


On Sun, 2007-09-30 at 16:26 +0100, Jon Burgess wrote:
> On Sun, 2007-09-30 at 16:53 +0200, Frederik Ramm wrote:
> > Hi,
> > 
> > > > If it's not too much work cleaning up always sounds best.
> > > 
> > > Looks like this affects 431 entries:
> > > 
> > > $ bzip2 -dc planet-070919-070928.diff.xml.bz2 | grep -c k=\"\"
> > > 431
> > 
> > Funny that I only found that little then, maybe something was wrong
> > with my approach. 
> > 
> > > A few more than I'd be willing to do by hand but it should be possible
> > > to extract the IDs, download the objects and upload the fixes.
> > 
> > What I did to dump the broken objects was I used a "grep" like you did
> > but added something like "-A20 -B20" to get 40 lines of context, then
> > ran the following script on the results:
> 
> That script seems to work fine. I ran it on the planetdiff output and it
> generated the list of 431 objects. Interestingly they are all nodes. 
> 
> I've put a copy of the 20kB compressed output at
> http://www.jburgess.uklinux.net/empty-keys.txt.gz
> 


369 of them reference 'hostel' which looks like a bulk import.

I think what happened with these is that one of the values contained
';' characters, which causes problems because the API converts the
key/value pairs into a single string using ';' as a separator:
"key=value;key=value;...". When it comes to splitting the string again,
any ';' inside a value looks like the start of a new key.

Hence you get strange looking things like:

    <node id="23655680" lat="51.1835100" lon="14.4241600" timestamp="2007-01-01T13:39:11+00:00">
      <tag k="" v="" />
      <tag k=" " v="" />
      <tag k="name" v="JGH Bautzen" />
      <tag k="created_by" v="JOSM" />
      <tag k=" 0049-(0)3591-40347" v="" />
      <tag k=" Am Zwinger 1" v="" />
      <tag k="tourism" v="hostel" />
      <tag k=" Ganzjährig." v="" />
      <tag k=" Bautzen" v="" />
      <tag k=" 02625" v="" />
    </node>
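
The round trip described above can be sketched roughly as follows. This is
only an illustration of the failure mode, not the actual API code: the
function names encode_tags and decode_tags are made up, the "note" key is a
hypothetical stand-in for whatever key the import used, and the assumption
that a fragment without '=' becomes a key with an empty value is inferred
from the output shown above.

```python
def encode_tags(tags):
    """Join (key, value) pairs into 'key=value;key=value' with no escaping."""
    return ";".join(f"{k}={v}" for k, v in tags)


def decode_tags(encoded):
    """Split on ';', then on the first '=' of each fragment.

    A fragment without '=' ends up as a key with an empty value, which is
    an assumption made here to mirror the broken tags seen in the dump.
    """
    result = []
    for part in encoded.split(";"):
        key, _, value = part.partition("=")
        result.append((key, value))
    return result


# Hypothetical input: one value contains ';' characters.
original = [
    ("name", "JGH Bautzen"),
    ("tourism", "hostel"),
    ("note", "02625; Bautzen; Am Zwinger 1"),
]

round_tripped = decode_tags(encode_tags(original))
for key, value in round_tripped:
    # Fragments after each ';' come back as keys with a leading space.
    print(repr(key), repr(value))
```

With this naive scheme, a value that ends in ';' (or contains ';;') would
also produce a completely empty key, like the k="" v="" tag above.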

The problem only occurs for nodes and segments since the ways use a
different table to store key/value pairs directly.

  Jon





