[Talk-ca] which tags in canvec? was: canvec2osm update (an easy question this time)

Richard Weait richard at weait.com
Thu Jun 25 21:19:41 BST 2009


On Wed, 2009-06-24 at 22:56 -0700, Sam Vekemans wrote:

[ ... a lot of stuff ... ]

> Anyway, i uploaded a couple sample features to Port Renfrew, BC
> 
> http://www.openstreetmap.org/browse/relation/163132
> 
> 
> What tags should be removed??... but more importantly, WHY.. 

Hi Sam (and list),

Others have suggested that you are including too much from canvec in the
import.  I agree.  Much of what you are importing can be dropped without
hurting OSM.  

These should stay.  

created_by = canvec2osm 
landuse = residential 
source = CanVec_Import_2009 
attribution = Natural Resources Canada
canvec:UUID = 11CF43A8C213E5F4E0409C8467120387

Sam, you say on the canvec2osm page[1] that v0.74 is the latest
canvec2osm zip file.  Some older versions are found at your site[2] but
not v0.74 or several others.  It also looks like you have started
uploading sample area .osm files with similar names to the script.zip
files.  Confusing!  

I think most of what you are putting into the sample[3] should be
removed and can be safely removed.  Here's what I've done:  

I've run the canvec2osm V0.22 script for a large portion of southern
Ontario.  This created over 1,300 files.  

Then I looked for unique data in each of the tags.  For example, I found
that in over 1,300 files "canvec:PROVIDER" had only five values, and two
of those duplicate others apart from capitalization or punctuation.

canvec:PROVIDER = Federal
canvec:PROVIDER = federal
canvec:PROVIDER = municipal
canvec:PROVIDER = Provincial/territorial
canvec:PROVIDER = provincial_territorial

This adds almost zero value to OpenStreetMap and it would be damaging to
OSM to include this data in every item imported from CanVec.  I can't
imagine that a large number of OSM users would care if data came from
the town, province or federal government for each node and way.  
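For anyone who wants to repeat that tally, here is a minimal sketch of one way to do it.  It is not the script I used; it assumes the canvec2osm output is ordinary OSM XML with <tag k="..." v="..."/> children, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def tally_tag_values(osm_paths):
    """Count how often each value occurs for every tag key in the given .osm files."""
    counts = defaultdict(Counter)
    for path in osm_paths:
        # iterparse streams the file, so a large extract need not fit in memory
        for _event, elem in ET.iterparse(path, events=("end",)):
            if elem.tag == "tag":
                counts[elem.get("k")][elem.get("v")] += 1
            elif elem.tag in ("node", "way", "relation"):
                elem.clear()  # release the children of a finished element
    return counts


def report(counts, key):
    """Print each distinct value of one key with its frequency, most common first."""
    for value, n in counts[key].most_common():
        print(f"{key} = {value}  ({n})")
```

Something like report(tally_tag_values(paths), "canvec:PROVIDER") would then list the distinct values for that key, where paths is whatever list of .osm files the script produced.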

These should stay.  They are appropriate and useful to OSM users and
tools.  

created_by = canvec2osm 
landuse = residential 
source = CanVec_Import_2009 
attribution = Natural Resources Canada
type = multipolygon

These should be removed.  The tags above tell those interested that the
data came from CanVec.  If they need to know more, they can find their
way through the wiki and svn.  Lots of duplication here.  

canvec:CODE = 1370012 
canvec:datasetName = 092C09  
canvec:generic_code = 1370009 
canvec:min_size:CODE = 1370009 
canvec:source = CanVec_Feature_Catalogue_Edition_1_0_2.pdf 
canvec:entity = Residential area - ( Zone résidentielle )
canvec:value = Residential area - ( Zone résidentielle )
canvec:Theme = BS Buildings and structures

No canvec:source, just "no".  This tag appears over 385,000 times in my
sample area.  The value is always
"CanVec_Feature_Catalogue_Edition_1_0_2.pdf".  No way.  Put it in the
wiki.  The only folks likely to care are the ones who are working on the
import.  
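To make the removal concrete, here is a rough sketch of a filter over one object's tags.  It assumes the tags are held as a plain dict of key to value; DROP_KEYS and strip_tags are names I made up, not anything in canvec2osm.

```python
# Keys listed in this message as adding no value to OSM.
DROP_KEYS = {
    "canvec:CODE",
    "canvec:datasetName",
    "canvec:generic_code",
    "canvec:min_size:CODE",
    "canvec:source",
    "canvec:entity",
    "canvec:value",
    "canvec:Theme",
}


def strip_tags(tags, drop_keys=DROP_KEYS):
    """Return a copy of a tag dict without the keys listed in drop_keys."""
    return {k: v for k, v in tags.items() if k not in drop_keys}
```

An explicit set keeps the decision reviewable: anyone who disagrees about a key can argue for taking it out of the set.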

Next was
canvec:Planimetric Accuracy (CMAS) 

First, "canvec:Planimetric Accuracy (CMAS)" as a key is broken.  Keys
must not include spaces.  Second, I think it should be dropped from the
import even if the key is fixed.  

In over 1300 files the only values for this key were: 

-1,0,3,5,10,21 and 30.

Not much to choose here.  And not much to learn from adding this tag to
every node and way.  I say drop it entirely OR use k=canvec:accuracy,
v=value and only include it for the worst of the data, like values >=21
meters.  That would add value for OSM users by making it obvious that an
item could possibly be improved by a consumer-grade GPS with a good fix.
From my sample only 45 objects out of ~400,000 have these poor accuracy
values.  

Or, alternatively, drop any data with accuracy >=21 meters and don't
include it in OSM.    
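The first option could look roughly like the sketch below.  It assumes the tags are a plain dict and that the accuracy value is a number stored as a string; the function name and constants are mine, for illustration only.

```python
OLD_KEY = "canvec:Planimetric Accuracy (CMAS)"
NEW_KEY = "canvec:accuracy"
POOR_ACCURACY_M = 21  # threshold suggested in this message


def rewrite_accuracy(tags, threshold=POOR_ACCURACY_M):
    """Drop the CMAS key; keep it as canvec:accuracy only when the value is poor."""
    out = {k: v for k, v in tags.items() if k != OLD_KEY}
    raw = tags.get(OLD_KEY)
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return out  # missing or non-numeric value: nothing to record
    if value >= threshold:
        out[NEW_KEY] = raw
    return out
```

With the values seen in my sample, only 21 and 30 would survive as canvec:accuracy; -1, 0, 3, 5 and 10 would simply disappear.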

canvec:VALDATE is similar.  Best would be to include VALDATE only when
it is more than ten years old, as something that an OSM mapper could
reasonably bring up-to-date.  Or just don't import anything more than
ten years old.  I say drop VALDATE entirely, but I'm willing to be
convinced otherwise.  
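If the "only when older than ten years" option were chosen, it might be sketched like this.  I am assuming here that a VALDATE value begins with a four-digit year, which should be checked against the CanVec documentation; anything that does not fit that shape is treated as not stale.  The function names are hypothetical.

```python
from datetime import date


def valdate_is_stale(valdate, today=None, max_age_years=10):
    """Return True when the VALDATE year is more than max_age_years before today.

    Assumes the value starts with a four-digit year; anything else returns False.
    """
    today = today or date.today()
    year_text = (valdate or "")[:4]
    if len(year_text) != 4 or not (year_text.isascii() and year_text.isdigit()):
        return False
    return today.year - int(year_text) > max_age_years


def keep_stale_valdate(tags, today=None):
    """Return a copy of the tags, keeping canvec:VALDATE only when it is stale."""
    out = dict(tags)
    if not valdate_is_stale(tags.get("canvec:VALDATE"), today):
        out.pop("canvec:VALDATE", None)
    return out
```

Comparing years only is deliberately coarse; a mapper deciding what to resurvey does not need day-level precision.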

And what is this stuff?  Details on how they classified the data when
they collected it?  And did CanVec really misspell "tolerance" twice?
This is not adding value in the OSM database.  Leave it in the wiki or
let people track it down in the canvec documents if it is important to
them.  

Drop all of these:
canvec:min_size:area_sq_meter = 1000 
canvec:min_size:lat_distance_meter = 1.5 
canvec:min_size:length_meter = --- 
canvec:min_size:long_distance_meter = 3
canvec:min_size:right_angle_tollerance_degree = ---
canvec:min_size:spike_angle_tollerance_degree = 10

Best regards,
Richard

[1] http://wiki.openstreetmap.org/wiki/Canvec2osm
[2] http://www.acrosscanadatrails.com/Home/
[3] http://www.openstreetmap.org/browse/relation/163132










