[Talk-ca] Internal CanVec conflicts

Dan Charrois dan at syz.com
Sun Nov 11 06:16:36 GMT 2012


> "Is it the community's view that it is okay to import CanVec without
> reconciling the internal differences between the layers?"
> 
> I believe it is.  The great thing about OSM data is it is not written in
> stone.  An import or edit can be changed in the future.  The data is
> inserted for use by anyone.  Just because I upload the CANVEC does not mean
> it is required to stay there as is.    I don't believe an import has to be
> perfect, especially in massively expansive natural areas which remain
> in a constant state of flux.   This cannot be accurately determined via Bing
> or ANY source we have.  Rivers change course each year, often several times.
> They flood timber, often for short periods of time. Which raises another
> question...how do we determine what timber is?  Is it trees?  Brush?  Mixed
> wood?  A forester would say all of the above.  Muskeg?  Swamp? Bog?  Anyone
> here qualified to make that decision?   That island in your example?  It has
> brush on it.  But it might not in the spring.  The whole island might not be
> there in the spring.  But it's there right now.  


As a fellow Canvec importer, I wanted to weigh in on this discussion with my opinion as well.

I agree with Bryan's viewpoint.  In an ideal world, it would be great if we could process an area of Canvec data and be able to say that it absolutely and accurately reflects current reality.  Perhaps if we were in a (much) smaller country, or if we had a (much) larger community of OSM mappers here, getting closer to this ideal would be easier.  But the truth of the matter is that with ten million square kilometres of land to map, and only a small handful of people doing it, the question naturally arises as to whether it is better to have a very small area of Canada mapped extremely well, or a larger area merely adequately.

This isn't as much an issue in the larger cities as it is in rural or remote areas.  In larger cities, there is a larger community of OSM mappers out there who keep them up to date, consistent, and accurate, and I think in general those who have worked in those areas have done a wonderful job.  That's a great thing - our best maps in Canada are in locations where there are the most people to take advantage of them.

I started contributing to OSM data via Canvec imports based on need - areas I was interested in had a rather outdated road network, a very minimal hydrological network, and no information on forested or wooded areas at all.  Canvec data, though not perfect or always internally consistent, at least was much better than what was already there.

My first imports were sloppy, as first attempts usually are.  I didn't know about joining features at the edges of tiles, and in general gave Canvec data more "authority" than it sometimes deserved.  I even discovered a bug in JOSM that caused me to accidentally delete some roads that shouldn't have been deleted (which was fortunately pointed out to me fairly early, so I didn't continue wrecking things as I went along trying to improve them).  But I learned over time, and hopefully got better at recognizing where Canvec's strengths and weaknesses lie.

Over time, I've come to realize that certain assumptions can be made about Canvec data.  If roads for a new subdivision appear to be placed in a wooded area, there is a pretty good chance that the wooded area is no longer there.  Similarly, for a road going through a small pond - the pond is likely based on older data than the road and likely disappeared when the road was constructed.  I usually assume that if a road already exists in OSM for an area where Canvec has a road, the existing road could very well be based on better data than Canvec (and on the other hand, if Canvec has data for a road which doesn't exist in OSM, I usually add it under the assumption that it had just not yet been mapped in OSM).  If Bing data exists to verify this, great... but at least in the parts of the country I'm interested in, it very rarely does.  And do I miss things and make mistakes?  Absolutely!  But I strive to add more value to OSM than I take away by failing to fix inconsistencies like this.
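
As a rough illustration, the rules of thumb above could be written down as a small Python sketch.  Nothing here comes from JOSM or any Canvec tooling: the Feature class, its "kind" and "source" labels, and both function names are made up for this example, and it assumes that whoever calls it has already worked out which features overlap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Feature:
    """A simplified map feature. All fields are hypothetical labels."""
    name: str
    kind: str    # assumed values: "road", "wood", "water"
    source: str  # assumed values: "canvec", "osm"


def review_overlap(road: Feature, other: Feature) -> str:
    """Suggest an action when a road overlaps another Canvec feature.

    The caller is assumed to have already established that the two
    features overlap; no geometry is checked here.
    """
    if road.kind != "road":
        raise ValueError("first argument must be a road")
    if other.kind == "wood":
        # A road through a wooded area: the wood is probably gone.
        return "trim wood: likely cleared when the road was built"
    if other.kind == "water":
        # A road through a small pond: the pond is probably older data.
        return "check water: a small pond is likely older than the road"
    return "no conflict"


def choose_road(canvec_road: Feature, osm_road: Optional[Feature]) -> Feature:
    """Keep an existing OSM road; use the Canvec road only if OSM has none."""
    if osm_road is not None:
        return osm_road
    return canvec_road
```

These are only defaults for the case where there is no imagery or local knowledge to check against; anything that can be verified should override them.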

Ultimately, as Bryan said, OSM data isn't written in stone, and anyone finding an error in a Canvec import is welcome to change it, and is every bit as capable of changing it as the original importer.  Some Canvec data may have trees in waterways - if the original importer finds and fixes this, great.  If not, the next person to come along could do so just as easily.  But starting out with at least some data in OSM in the area is better than nothing at all.  Take, for example, the case of a stream running through an area that becomes a new subdivision.  If Bing imagery exists, great - but it often doesn't.  Without it, and without knowledge "on the ground", there is no way to tell whether the stream still exists after the subdivision is complete - larger streams may still exist, whereas smaller ones often just "fade away".  If someone comes along after the fact, it is immensely easier for them to just delete a stream that no longer exists than to plot out the course of a stream that was never entered in the first place because the Canvec importer wasn't sure it still existed.  Maps, whether they be OSM, Google, or printed on paper, are ALWAYS out of date and contain errors.  Being able to fix these errors is what makes OSM so great.

The areas I'm interested in are often fairly remote, and I may be the only person who even sees or cares about the specific data in those areas.  Of course, this means that any errors I introduce by importing Canvec data relatively blindly may stay there for quite some time.  But I take the strong viewpoint that something is better than nothing.  If I'm hiking along and hoping to come across a stream entered into OSM, I understand that in reality it may or may not be there.  But at least that's better than having no idea whatsoever whether there is even the potential for a stream anywhere in the area.

So my very strong viewpoint on this is that YES, it is absolutely okay to import Canvec data without reconciling the internal differences between the layers.  If you can, of course, any reconciliations or verifications that can be made make the import even BETTER (and as you do more of this, you'll find time-saving shortcuts to make the job easier).  But if you don't have the time or resources to do better than what Canvec provides, everything helps, and you're still improving the map.

Of course, there will always be differing opinions (and that's a good thing).  We all want the same thing in the end, though there may be different ideas on how best to get there.  But I want to encourage people (especially new mappers) to not be afraid to contribute what they can.  Nobody expects things to be perfect.  Over time, the data (and people's skillsets) continue to improve.  But I don't want to see anyone dissuaded from contributing for fear that they may not be doing a "good enough" job.

Dan
--
Syzygy Research & Technology
Box 83, Legal, AB  T0G 1L0 Canada
Phone: 780-961-2213



