[Talk-ca] OSM data quality in Canada

Paul Norman penorman at mac.com
Thu Jun 18 21:14:56 UTC 2015


On 6/17/2015 1:12 PM, Martijn van Exel wrote:
> * What is the imports history, particularly in relation to road network, POIs and addresses? (Beyond what’s in the import catalogue page on the wiki, if anything)
The main imports have been CanVec, the National Hydro Network (NHN), and 
the National Road Network (NRN), all from Natural Resources Canada (NRCan).

CanVec is a product supplied in .osm format composed of multiple 
government datasources, including the NHN and NRN. The sources used vary 
by region, so what is true somewhere may not be true elsewhere.

> * What external (government and otherwise) open geospatial data sources are out there that have been or may be considered for improving OSM?
There is probably an equivalent to TIGER address ranges that should be 
used by a geocoder as a fallback in the same manner.

I'm not aware of anything really under consideration. Data released by 
the federal government under their OGL variant is okay license-wise, but 
the same is not always true for the provincial and municipal data.
> * Are there any Canada-specific mapping and tagging conventions?
Because roads are largely the responsibility of provinces, road 
classification varies province by province.
> * Are there any known big (national) issues in the Canadian OSM data? (misguided imports / bots, major tagging disputes, that kind of thing)
CanVec has left parts of the country a colossal mess. I would say the 
forest/water data is the worst, often coming from different sources from 
the 70s, and these sources often do not agree with each other. When 
faced with 40 year old imported landcover data that doesn't resemble 
reality, the best option is often to just delete it.

There are some regional quirks with CanVec. These include:

- Poor alignment of water or trees with each other
- Forests on what are now residential areas
- Incorrect surface or lanes values
- Invalid housenumbers (-1)
- Interpolation used for what should be a single number
- Interpolation where there aren't roads in the data
- Extra spaces in some road names
- Unclassified roads tagged as residential

NRN and NHN were less widely imported. Not having landcover, they don't 
have those problems, but do have some of their own:

- Incorrect surface or lanes values (NRN)
- Lots of tag cruft (Both)
- Badly overnoded streams (NHN)
- Streams with oneway (NHN)
- Non-standard tagging (NHN)

> * Which (other) companies / organizations / government agencies use OSM data for Canada?
NRCan used to use CanVec and OSM matching to find locations missing in 
their dataset, but I'm not sure if they do this anymore.
> * Any suggestions for QA tools that would help the community, either existing or new?
Beyond the standard international ones, I'm not sure. The incorrect 
surface, lanes, housenumbers, and extra spaces are probably all amenable 
to a mechanical edit rather than a QA tool. Some headway has been made 
with mechanical edits. The tag cruft will remove itself over time as 
people edit the objects.
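
As a rough illustration of the kind of mechanical edit meant here, the 
sketch below proposes fixes for extra spaces in names and for placeholder 
housenumbers such as -1. It assumes each object's tags are available as a 
plain Python dict; the function names are hypothetical and not part of any 
existing tool, and any real mechanical edit would still need discussion 
and review before upload.

```python
import re


def clean_name(name):
    """Collapse runs of whitespace and trim both ends of a road name."""
    return re.sub(r"\s+", " ", name).strip()


def is_invalid_housenumber(value):
    """Flag negative integers such as -1, which are placeholders, not addresses."""
    return re.fullmatch(r"-\d+", value.strip()) is not None


def propose_tag_fixes(tags):
    """Return proposed tag changes for one object.

    Assumption: `tags` is a dict of OSM key/value strings.
    A value of None in the result means "delete this tag".
    """
    changes = {}

    name = tags.get("name")
    if name is not None and clean_name(name) != name:
        changes["name"] = clean_name(name)

    housenumber = tags.get("addr:housenumber")
    if housenumber is not None and is_invalid_housenumber(housenumber):
        changes["addr:housenumber"] = None

    return changes
```

Returning proposed changes rather than modifying the tags in place keeps 
the result reviewable before anything is uploaded.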

Overlapping water/trees from CanVec are easy to find, and I'm not 
sure a QA tool is the best choice where the time to fix hugely outweighs 
the time to find.

Address interpolation indicating roads where there are no roads is an 
interesting one, and might be suitable to a QA tool.
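
A minimal sketch of such a check, under stated assumptions: interpolation 
ways and roads have already been extracted as lists of (lat, lon) tuples, 
and an interpolation way counts as suspicious when none of its nodes lies 
within a chosen distance of any road segment. The 100 m threshold and all 
function names are illustrative guesses, not taken from an existing tool.

```python
import math

EARTH_RADIUS_M = 6371000.0


def _to_metres(lat, lon, lat0):
    """Equirectangular projection around latitude lat0; fine over short distances."""
    x = math.radians(lon) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat) * EARTH_RADIUS_M
    return x, y


def _point_segment_distance(p, a, b):
    """Distance in metres from point p to the segment a-b (all projected)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection of p onto the line so it stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def has_nearby_road(interp_nodes, roads, max_dist_m=100.0):
    """True if any interpolation node is within max_dist_m of any road segment."""
    lat0 = interp_nodes[0][0]
    points = [_to_metres(lat, lon, lat0) for lat, lon in interp_nodes]
    for road in roads:
        line = [_to_metres(lat, lon, lat0) for lat, lon in road]
        for a, b in zip(line, line[1:]):
            if any(_point_segment_distance(p, a, b) <= max_dist_m for p in points):
                return True
    return False


def orphan_interpolations(interps, roads, max_dist_m=100.0):
    """Return ids of interpolation ways with no road nearby.

    Assumption: `interps` maps a way id to its list of (lat, lon) nodes,
    and `roads` is a list of such node lists.
    """
    return [way_id for way_id, nodes in interps.items()
            if not has_nearby_road(nodes, roads, max_dist_m)]
```

This compares every interpolation way against every road, which is only 
practical for small extracts; a real tool would want a spatial index.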


