[OSM-dev] Suggestion to replace created_by tags

Richard Fairhurst richard at systemeD.net
Sat Apr 28 13:37:19 BST 2007


I'm about to make a post about the data model. Please shoot me now.


At present, some editors tag everything they do with the created_by  
tag. So if you draw and upload a way using JOSM, the way will be  
tagged as created_by=JOSM, as will its constituent segments, as will  
its nodes.

There are a handful of issues with this.

1. Not all clients do it, and because it's a tag, it's not enforced.  
So you might have something tagged 'created_by=JOSM' which has  
subsequently been modified by another editor or a script. This  
defeats the point of the tag...

2. ...and means there's no effective versioning. You can't strip out  
edits by a dodgy client if the data is still tagged as  
'created_by=JOSM'.

3. It makes "untagged" data appear tagged. It would be really handy  
to be able to do SELECT * FROM current_nodes WHERE (latitude BETWEEN  
a AND b) AND (longitude BETWEEN c AND d) AND tags IS NOT NULL - in  
other words, find all the POIs within a bounding box in one easy  
query. created_by prevents this.

4. It's an extra burden on the database.
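
To illustrate point 3, here is a minimal sketch using an in-memory SQLite database. The schema is an assumption made for illustration (a single nullable `tags` text column, as the query above implies), and the sample rows and the helper name `tagged_nodes_in_bbox` are hypothetical. It shows a node whose only tag is created_by slipping through the "tags IS NOT NULL" filter.

```python
import sqlite3

# Hypothetical, simplified schema: the real current_nodes table may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE current_nodes ("
    " id INTEGER PRIMARY KEY, latitude REAL, longitude REAL, tags TEXT)"
)
conn.executemany(
    "INSERT INTO current_nodes VALUES (?, ?, ?, ?)",
    [
        (1, 51.50, -0.10, "amenity=pub"),      # a genuine POI
        (2, 51.51, -0.11, None),               # plain way node, no tags at all
        (3, 51.52, -0.12, "created_by=JOSM"),  # way node "tagged" only by the editor
    ],
)


def tagged_nodes_in_bbox(min_lat, max_lat, min_lon, max_lon):
    """Return ids of nodes inside the bounding box that carry any tags."""
    rows = conn.execute(
        "SELECT id FROM current_nodes"
        " WHERE (latitude BETWEEN ? AND ?)"
        " AND (longitude BETWEEN ? AND ?)"
        " AND tags IS NOT NULL"
        " ORDER BY id",
        (min_lat, max_lat, min_lon, max_lon),
    ).fetchall()
    return [row[0] for row in rows]


poi_ids = tagged_nodes_in_bbox(51.0, 52.0, -1.0, 0.0)
print(poi_ids)  # [1, 3]: node 3 is not a POI, but created_by makes it look tagged
```

Without the created_by tag, node 3 would have NULL tags and the query would return only the genuine POI.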

I'd suggest that we get rid of the created_by tag and, instead,  
introduce a new 'client id' in the XML message body spoken by the API.

This would be a unique identifier for the client (JOSM, coastline  
script, Potlatch, whatever) making this particular edit. For  
efficiency, I'd suggest it could be numeric (with a dictionary on the  
wiki), and could potentially also include a version number. So 2.80  
might indicate the node was created/modified by version 80 of the  
coastline script. That way, if a bug appears in version 81 (and that  
alone) which generates corrupt data, the affected edits can easily be  
removed.
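
As a rough sketch of how such an identifier might be decoded, assuming it travels as a string of the form "client.version": the dictionary entries below are hypothetical (only the pairing of 2 with the coastline script comes from the example above), and `ClientId`, `parse_client_id` and `describe` are names made up for illustration.

```python
from typing import NamedTuple, Optional

# Hypothetical dictionary; the real one would live on the wiki.
CLIENT_NAMES = {1: "JOSM", 2: "coastline script", 3: "Potlatch"}


class ClientId(NamedTuple):
    client: int
    version: Optional[int]  # None when no version number was supplied


def parse_client_id(raw: str) -> ClientId:
    """Split 'client.version' into its two integer parts."""
    client_part, _, version_part = raw.partition(".")
    version = int(version_part) if version_part else None
    return ClientId(int(client_part), version)


def describe(raw: str) -> str:
    """Turn a raw client id into a human-readable description."""
    cid = parse_client_id(raw)
    name = CLIENT_NAMES.get(cid.client, "unknown client")
    if cid.version is None:
        return name
    return f"{name}, version {cid.version}"


print(describe("2.80"))  # coastline script, version 80
```

The sketch keeps the id as a string rather than a floating-point number, because as a number 2.80 and 2.8 would be indistinguishable even though they would name versions 80 and 8.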

On the database, this would then be stored in a new column in the  
nodes, segments and ways tables. (There'd be no need to include it in  
current_nodes, current_segments and current_ways, which are the most  
frequently read.) Because a new row is created for each edit, we  
would then have full versioning.
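
A minimal sketch of what that could look like, again using an in-memory SQLite database. The column names `client` and `client_version`, the simplified `nodes` layout, the sample rows and both helper functions are assumptions for illustration, not the actual schema. It finds every row written by a given client version and, for a given node, the most recent row that was not.

```python
import sqlite3

# Hypothetical, simplified history table: one row per edit of a node.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes ("
    " id INTEGER, timestamp TEXT, latitude REAL, longitude REAL,"
    " client INTEGER, client_version INTEGER)"
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2007-04-01T10:00:00", 51.50, -0.10, 1, 5),   # good edit
        (1, "2007-04-20T09:00:00", 0.0, 0.0, 2, 81),      # corrupt edit
        (2, "2007-04-02T10:00:00", 51.60, -0.20, 2, 80),  # earlier, good version
        (3, "2007-04-21T12:00:00", 0.0, 0.0, 2, 81),      # node created by bad version
    ],
)


def rows_written_by(client, version):
    """List (node id, timestamp) for every row written by this client version."""
    return conn.execute(
        "SELECT id, timestamp FROM nodes"
        " WHERE client = ? AND client_version = ?"
        " ORDER BY id, timestamp",
        (client, version),
    ).fetchall()


def last_good_row(node_id, client, version):
    """Most recent row for the node not written by this client version, or None."""
    return conn.execute(
        "SELECT timestamp, latitude, longitude FROM nodes"
        " WHERE id = ? AND NOT (client = ? AND client_version = ?)"
        " ORDER BY timestamp DESC LIMIT 1",
        (node_id, client, version),
    ).fetchone()


suspect = rows_written_by(2, 81)
print(suspect)                   # rows for nodes 1 and 3
print(last_good_row(1, 2, 81))   # the 2007-04-01 row, a candidate to restore
print(last_good_row(3, 2, 81))   # None: nothing to restore, the node was created by the bad version
```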

This should be a pretty easy change to implement, perhaps as part of  
the 0.4 API, and could potentially save us data problems in the future.

cheers
Richard



