[OSM-dev] Suggestion to replace created_by tags

Ed eAi at opencoding.net
Sat Apr 28 13:50:49 BST 2007


I'm very new here, but here are my thoughts:

I agree with you in general, but I'd suggest that using a client id is a 
Bad Thing. I'd suggest using a User Agent string, as HTTP (and many other 
protocols) do. This is clearly marginally harder for the database to 
cope with, but it means that anyone can write their own client 
without having to assign themselves an id. Equally, clients can then 
display information about the user agent that created the segment/node 
etc, without having to know what all the IDs refer to.

So, I'd suggest having User Agents like:
JOSM 20070428
or
Coastline v80

Perhaps clients like JOSM could even add information about the loaded 
plugins to the User Agent string, e.g. 'JOSM 20070428 (osmarender, 
validator)'.
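
As a rough sketch of how a reader of such strings might pull them apart: 
the Python below assumes the layout "name version (plugin, plugin)" used in 
my examples above. The grammar, the UserAgent type and the 
parse_user_agent function are all made up for illustration, nothing in 
the API defines them.

```python
import re
from typing import NamedTuple, Tuple


class UserAgent(NamedTuple):
    """Hypothetical parsed form of a 'name version (plugins)' string."""
    name: str
    version: str
    plugins: Tuple[str, ...]


# Assumed layout: one word for the client name, one word for the version,
# then an optional comma-separated plugin list in parentheses.
_UA_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<version>[^\s(]+)(?:\s*\((?P<plugins>[^)]*)\))?\s*$"
)


def parse_user_agent(text: str) -> UserAgent:
    """Split a user agent string into name, version and plugin names."""
    match = _UA_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised user agent: {text!r}")
    raw_plugins = match.group("plugins") or ""
    plugins = tuple(p.strip() for p in raw_plugins.split(",") if p.strip())
    return UserAgent(match.group("name"), match.group("version"), plugins)


if __name__ == "__main__":
    for example in ("JOSM 20070428",
                    "Coastline v80",
                    "JOSM 20070428 (osmarender, validator)"):
        print(parse_user_agent(example))
```

A client that doesn't recognise the format can still just show the raw 
string to the user, which is rather the point.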

So, in summary - I'd advise against using pure IDs, as they lack 
flexibility and are hard for humans to read (and hard for computers to 
make human readable too).
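
To illustrate the readability point, here is a small Python sketch. The 
CLIENT_NAMES table is a hypothetical stand-in for the wiki dictionary; its 
only entry is the "2 = coastline script" example from the proposal quoted 
below, and the function name is mine.

```python
# Hypothetical copy of the wiki dictionary. Only the "2" entry is taken
# from the quoted proposal; a real table would need constant upkeep.
CLIENT_NAMES = {2: "coastline script"}


def describe_client_id(client_id: str, names=CLIENT_NAMES) -> str:
    """Turn an id like '2.80' into text, if the dictionary knows the client."""
    client, _, version = client_id.partition(".")
    name = names.get(int(client))
    if name is None:
        # Without an up-to-date dictionary the number means nothing.
        name = f"unknown client {client}"
    return f"{name} version {version}" if version else name


if __name__ == "__main__":
    print(describe_client_id("2.80"))  # needs the dictionary
    print(describe_client_id("7.3"))   # not in the dictionary
    print("Coastline v80")             # a user agent string needs no lookup
```

Every program that wants to show this to a human needs its own copy of the 
dictionary, whereas the user agent string can be displayed as-is.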

Ed

Richard Fairhurst wrote:
> I'm about to make a post about the data model. Please shoot me now.
>
>
> At present, some editors tag everything they do with the created_by  
> tag. So if you draw and upload a way using JOSM, the way will be  
> tagged as created_by=JOSM, as will its constituent segments, as will  
> its nodes.
>
> There are a handful of issues with this.
>
> 1. Not all clients do it, and because it's a tag, it's not enforced.  
> So you might have something tagged 'created_by=JOSM' which has  
> subsequently been modified by another editor or a script. This  
> defeats the point of the tag...
>
> 2. ...and means there's no effective versioning. You can't strip out  
> edits by a dodgy client if the data is still tagged as  
> 'created_by=JOSM'.
>
> 3. It makes "untagged" data appear tagged. It would be really handy  
> to be able to do SELECT * FROM current_nodes WHERE (latitude BETWEEN  
> a AND b) AND (longitude BETWEEN c AND d) AND tags IS NOT NULL - in  
> other words, find all the POIs within a bounding box in one easy  
> query. created_by prevents this.
>
> 4. It's an extra burden on the database.
>
> I'd suggest that we get rid of the created_by tag: and, instead,  
> introduce a new 'client id' into the XML message body spoken by the API.
>
> This would be a unique identifier for the client (JOSM, coastline  
> script, Potlatch, whatever) making this particular edit. For  
> efficiency, I'd suggest it could be numeric (with a dictionary on the  
> wiki), and could potentially also include a version number. So 2.80  
> might indicate the node was created/modified by version 80 of the  
> coastline script. That way, if a bug appears in version 81 (and that  
> alone) which generates corrupt data, it can easily be removed.
>
> On the database, this would then be stored in a new column in the  
> nodes, segments and ways tables. (There'd be no need to include it in  
> current_nodes, current_segments and current_ways, which are the most  
> frequently read.) Because a new row is created for each edit, we  
> would then have full versioning.
>
> This should be a pretty easy change to implement, perhaps as part of  
> the 0.4 API, and could potentially save us data problems in the future.
>
> cheers
> Richard