[OSM-dev] Suggestion to replace created_by tags
richard at systemeD.net
Sat Apr 28 13:37:19 BST 2007
I'm about to make a post about the data model. Please shoot me now.
At present, some editors tag everything they do with the created_by
tag. So if you draw and upload a way using JOSM, the way will be
tagged as created_by=JOSM, as will its constituent segments, as will
There are a handful of issues with this.
1. Not all clients do it, and because it's a tag, it's not enforced.
So you might have something tagged 'created_by=JOSM' which has
subsequently been modified by another editor or a script. This
defeats the point of the tag...
2. ...and means there's no effective versioning. You can't strip out
edits by a dodgy client if the data is still tagged as
3. It makes "untagged" data appear tagged. It would be really handy
to be able to do SELECT * FROM current_nodes WHERE (latitude BETWEEN
a AND b) AND (longitude BETWEEN c AND d) AND tags IS NOT NULL - in
other words, find all the POIs within a bounding box in one easy
query. created_by prevents this.
4. It's an extra burden on the database.
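To illustrate point 3, here is a rough sketch using Python's built-in sqlite3 module. The schema is a heavily simplified, hypothetical stand-in for current_nodes (the single text `tags` column and the sample rows are assumptions for the demo, not the real table layout). It shows how a node carrying only created_by slips through the "tags IS NOT NULL" filter.

```python
import sqlite3

# Hypothetical, simplified stand-in for current_nodes; the real table differs.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE current_nodes (
           id INTEGER PRIMARY KEY,
           latitude REAL,
           longitude REAL,
           tags TEXT
       )"""
)
conn.executemany(
    "INSERT INTO current_nodes VALUES (?, ?, ?, ?)",
    [
        (1, 51.50, -0.12, "amenity=pub"),      # a genuine POI
        (2, 51.51, -0.13, None),               # untagged node
        (3, 51.52, -0.11, "created_by=JOSM"),  # no real tags, but looks tagged
        (4, 48.85, 2.35, "amenity=cafe"),      # POI outside the bounding box
    ],
)


def tagged_nodes_in_bbox(conn, a, b, c, d):
    """Return ids of nodes in the bounding box whose tags column is not NULL."""
    rows = conn.execute(
        """SELECT id FROM current_nodes
           WHERE (latitude BETWEEN ? AND ?)
             AND (longitude BETWEEN ? AND ?)
             AND tags IS NOT NULL
           ORDER BY id""",
        (a, b, c, d),
    )
    return [row[0] for row in rows]


pois = tagged_nodes_in_bbox(conn, 51.0, 52.0, -1.0, 0.0)
print(pois)  # node 3 comes back alongside the real POI, node 1
```

Without created_by, node 3 would have NULL tags and the query would return only the real POI.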
I'd suggest that we get rid of the created_by tag and, instead,
introduce a new 'client id' into the XML message body spoken by the API.
This would be a unique identifier for the client (JOSM, coastline
script, Potlatch, whatever) making this particular edit. For
efficiency, I'd suggest it could be numeric (with a dictionary on the
wiki), and could potentially also include a version number. So 2.80
might indicate the node was created/modified by version 80 of the
coastline script. That way, if a bug appears in version 81 (and that
alone) which generates corrupt data, it can easily be removed.
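As a sketch of how such an identifier might be decoded, assuming a "client.version" string form: the `CLIENTS` dictionary, the `ClientId` type and the `parse_client_id` helper are all hypothetical names for illustration, and only the pairing of 2 with the coastline script comes from the example above.

```python
from typing import NamedTuple, Optional

# Hypothetical dictionary (the proposal would keep this on the wiki).
# Only "2 = coastline script" is taken from the example in the text.
CLIENTS = {2: "coastline script"}


class ClientId(NamedTuple):
    client: int
    version: Optional[int]  # None when no version number was supplied


def parse_client_id(raw: str) -> ClientId:
    """Split an identifier such as '2.80' into client 2, version 80."""
    client, _, version = raw.partition(".")
    return ClientId(int(client), int(version) if version else None)


parsed = parse_client_id("2.80")
print(CLIENTS[parsed.client], parsed.version)
```

One design note: the identifier is better handled as a string or as two integers than as a decimal number, since a float would treat 2.80 and 2.8 as the same value and lose the distinction between version 80 and version 8.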
On the database, this would then be stored in a new column in the
nodes, segments and ways tables. (There'd be no need to include it in
current_nodes, current_segments and current_ways, which are the most
frequently read.) Because a new row is created for each edit, we
would then have full versioning.
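A minimal sketch of what that versioning could allow, again using sqlite3 with a hypothetical, stripped-down nodes history table. The `client` column name, its text format and the sample rows are assumptions, not the real schema. It finds the rows written by a faulty client version and looks up the last row for a node that was not written by it.

```python
import sqlite3

# Hypothetical, simplified history table: one row per edit of a node.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE nodes (
           id INTEGER,
           latitude REAL,
           longitude REAL,
           timestamp TEXT,
           client TEXT
       )"""
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?, ?)",
    [
        (1, 51.50, -0.12, "2007-04-01T10:00:00", "2.80"),
        (1, 0.0, 0.0, "2007-04-20T10:00:00", "2.81"),  # corrupt edit
        (2, 51.60, -0.20, "2007-04-02T10:00:00", "1.5"),
    ],
)


def edits_by_client(conn, client):
    """List (node id, timestamp) for every row written by the given client."""
    rows = conn.execute(
        "SELECT id, timestamp FROM nodes WHERE client = ? ORDER BY timestamp",
        (client,),
    )
    return list(rows)


def last_good_position(conn, node_id, bad_client):
    """Most recent (latitude, longitude) of a node not written by bad_client."""
    return conn.execute(
        """SELECT latitude, longitude FROM nodes
           WHERE id = ? AND client <> ?
           ORDER BY timestamp DESC LIMIT 1""",
        (node_id, bad_client),
    ).fetchone()


bad = edits_by_client(conn, "2.81")
restored = last_good_position(conn, 1, "2.81")
print(bad, restored)
```

A real clean-up would then write the restored values back as a new edit rather than deleting history rows.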
This should be a pretty easy change to implement, perhaps as part of
the 0.4 API, and could potentially save us data problems in the future.