[Imports] Ongoing Canadian building import needs to be stopped, possibly reverted

John Whelan jwhelan0112 at gmail.com
Fri Jan 18 00:55:23 UTC 2019

The import was discussed on talk-ca and in my opinion there was a 
consensus that it should go ahead. The data comes from the 
municipalities, of which there are some 37,000 in Canada.  
The idea of a single import plan was suggested on talk-ca by someone not 
involved rather than have 37,000 different import plans.  Many 
municipalities are very small.

If you look at Canada there are locations with poor imagery and lots of 
buildings and my suspicion is some were being imported without the 
licenses really meeting the OSM licensing requirements.

Many locations in Canada do not have a group of OpenStreetMap local 
mappers.  Many locations do. Locating local groups is difficult.  My 
understanding is that it has always been the local community who make 
the decision to do an import. We did use talk-ca and if you wish to 
continue the conversation then I would suggest that is the place to hold 
the conversation.  I think the problem here is defining the ideal local 
community.  It is probably somewhere between the whole of Canada and 
each individual municipality and contains groups of local mappers.

The intent is certainly to involve local mappers. Building outlines by 
themselves are not especially interesting, but building outlines with 
added tags are much more interesting.  We need boots on the ground; we 
need local mappers onside.

Cheerio John

Nate Wessel wrote on 2019-01-17 7:08 PM:
> Hi everyone,
> I've had a bit of time today to gather my thoughts on this import and 
> I hope I can offer something more productive to the discussion now. 
> First, I want to apologize to the importers for the panicked tone of 
> my initial email and private communications. I saw after a long day 
> that the buildings were literally one task away from completely 
> swamping my own neighborhood, and I hope it's understandable that I 
> felt pretty defensive about it, having put so much time into my own 
> little corner of the city over the years. So, I want to thank you all 
> for taking that in stride, and especially for agreeing to stop the 
> import while we discuss the issues I raised. If I came off as harsh or 
> unappreciative, please be sure that I didn't mean to. We're all 
> volunteers here and I know first-hand how much work goes into doing 
> something like this. I'm actually one of the lead mappers for a 
> building import in my hometown at the moment - I'm not opposed in any 
> way to imports of buildings if they're done right.
> But I've also spent way too much time cleaning up bad import data - 
> whether it's TIGER imports from way back when or more recently the 
> disturbingly sloppy address ranges that showed up last year in 
> Toronto. In my experience, it takes so much less time to get this 
> right in the first pass than it does to clean up the damage months or 
> years later when we realize some mistakes were made or the data could 
> have been handled better.
> There have been a lot of responses to some of the specific things I 
> said, so instead of replying inline, let me try to rephrase the big 
> issues as I see them with some of the new perspective and information 
> in mind.
> A ) This import, essentially, did not get approval from the imports 
> list. While an email was sent, I think that it was so vague and 
> misdirected (surely with no nefarious intent) that it would be hard or 
> impossible for a casual subscriber to the list to understand the scope 
> of the project. Without having understood the scope of the project, 
> which is utterly huge, the import plan was not given adequate 
> scrutiny. This is evidenced by the relative lack of discussion.
> B ) I didn't know this was going on until I saw it happening. While my 
> personal knowledge is obviously not a necessary precondition for 
> successful imports, I do feel it may be a sign that the scale of this 
> effort is wrong for the task at hand.
> While the technical details and any processing of the data are 
> probably best handled at the national level, since it all comes from 
> the same source and presumably has the same technical hurdles to 
> overcome, I can't imagine that the whole country can be asked whether 
> it wants buildings to be imported or not, or what concerns and 
> requirements would come attached to such an import. There will be so 
> much local variation and I think that just has to happen at a more 
> local level. If that local effort had been made, I'd be surprised if I 
> never heard about it. Rather than attempt to notify all Canadian 
> mappers, would it be too much to ask that this might go province by 
> province or city by city? If I had seen 'Toronto' or 'Ontario' 
> anywhere on this mailing list, you can be sure my ears would have 
> pricked up right quick.
> C ) This import is going way too fast - there is simply no way three 
> people could have carefully imported as much data as has been imported 
> in the time since this started. Like I said, I'm working on an import 
> myself and it's long, tedious, and strangely satisfying work when 
> you're doing it carefully. In my opinion, these task squares are 
> simply ten times too large at least. When I said above that my 
> neighborhood would be swamped by the next task, I really mean swamped. 
> 90% of the places I go in Toronto fit inside a single task. The 
> tasking manager we're using for the building import in Hamilton County 
> allows one to upload custom task geometries. I got a bit silly with 
> the task shapes perhaps (https://tasks.openstreetmap.us/project/107) 
> but I think the size is about right - importing 500-1000 building 
> footprints should take ~10-30 minutes, with a careful check of the 
> imagery, a check with JOSM's validation tool, a second validation 
> after native OSM data has been merged with the import data... I would 
> never attempt a task as large as the smallest task here, and I do not 
> think that reflects poorly on my abilities or experience. If the 
> tasking manager doesn't allow smaller tasks then it is the wrong tool 
> for the job.
> I have several specific technical issues with / questions about the 
> data that are probably best addressed in some other forum, like on the 
> wiki. If I may, I'd like to save those for the moment, because I think 
> I see a productive way to keep moving forward with things while we 
> discuss.
> The data needs to be carefully and thoroughly validated at some point, 
> right? May I suggest that everyone stop importing new data and engage 
> themselves in cleaning and validating the data that has already been 
> brought in, neighborhood by neighborhood? There is plenty to keep us 
> all busy for weeks. While doing that, let's make a list of issues that 
> we come across and discuss ways that they can be addressed before any 
> new buildings are brought in. We can take this as a learning 
> experience and make the rest of this import process better.
> I have the feeling that some will feel this is redundant - wasn't the 
> Ottawa import the test run? My response has to be that the data and 
> the process are not yet as good as they can and should be, so another 
> round of trials and iterative improvement is needed before this rolls 
> out a mari usque ad mare.
> With all due respect, patience, and humility,
> Nate Wessel
> Jack of all trades, Master of Geography, PhD candidate in Urban Planning
> NateWessel.com <http://natewessel.com>
> On 1/17/19 3:13 PM, OSM Volunteer stevea wrote:
>> Thank you, John.
>> On Jan 17, 2019, at 11:22 AM, john whelan<jwhelan0112 at gmail.com>  wrote:
>>> First if you look at the 2020 wiki page history you'll see there is a lot of input from Steve.  My concern with this very detailed input is it made it hard for a new person to quickly locate relevant information, an overview if you like.
>> I encourage an "Overview" section or what some call a "Quick Start."  For some (experienced OSM mappers), this could suffice for "jumping in right now."  However, there is no shortcut for anybody involved in the importation of these data to read every single word of the wiki.  If wiki words aren't relevant, they either weren't in the right wiki or they could have and should have been deleted.  As I wasn't sure of the actual direction of the project, I added what I thought would help.  I would much rather have there be more (extraneous, even) guidance and instruction which later got deleted as superfluous than not enough and leave volunteers with more questions than answers.  Call this a failure to edit the wiki properly, though not on my part.
>>> I will confess that there have been small groups in face to face meetings in small cafes where you need a password to logon to the internet.  He was not specifically invited to them all.
>>> I confess we have used conference calls and other methods of communication without notifying hundreds of people first.  There have even been meetings that I was unaware of.  For example I haven't even communicated directly with the mappers who are doing most of the import at the moment.
>>> There has even been at least one mapathon that Stats Canada only found out about after the event.
>> I believe what is being said or conveyed here is that decentralized discussion preceding data input "happens."  Sure, it does, that is part of a planning process and not all of these are "widely open to all of OSM," nor should they be, nor must they be.  So, largely, "we agree" though I'm puzzled at your use of the verb "confess."  Largely speaking, it is the degree to which openness happens in OSM (or the spirit of moving it in that direction, especially when identified as "we need more here") which is important, not specific cases where openness didn't happen.
>>> Personally I'm not convinced that OpenStreetMap really needs every building in the planet mapped in detail.
>> I don't wish to change your mind, but as you point out later, others seem to disagree with you, seeing the urgency with which these data enter OSM.
>>> The history was I was after the bus stops in Ottawa which meant I needed them with an open data license we could use.  I used to work at Stats Canada and the corporate culture is very different to OSM.
>> Understandable and nothing wrong with that, especially as OSM does not seek to house our data with Stats Canada.  However, the reverse...we know the story.
>>> In Canada we have fewer mappers on the ground and more places to map than in many parts of Europe.  We have a history of importing CANVEC data which comes from a number of sources including Municipalities.  So I acted in a coordinating role.  We managed to persuade the City of Ottawa to change its open data license to align with the federal one.  I got my bus stops.  The local mappers were very much involved and there were at least half a dozen face to face meetings that took place.  I drifted down to one of them.
>>> Stats was very pleased with the added tags on the building outlines in Ottawa. This is information they felt could not be easily obtained in any other way.
>> Informative and appreciated.  There are "pockets of uniqueness" all over the world and hence methodologies of "this is a good match here" for data entering OSM which will and do widely differ around the world.  However, I believe all can agree that "quality data are quality data" (as well as the opposite) and for this fundamental reason, OSM has standards to follow.
>>> I am very aware that this data is important to many.  This includes Federal government departments and agencies.  They were very vocal at a meeting at Stats Canada during the HOT summit in Ottawa.  It was open and at least half a dozen OpenStreetMappers were present; three or four were from European or other out of town locations.  Having the building data in one place makes it much easier for the end users than having to handle different formats and open data licenses.  Currently one municipal social agency is very interested in mapping places where fresh food can be obtained.  I forget some of the other interests but they were quite legitimate.  We have seen considerable interest by high schools and students in OpenStreetMap and using StreetComplete with building outlines is one way that they can add value without causing too much havoc.
>> These are precisely the sort of reasons why OSM (with high quality, usable, local data) is so important.  Nobody disagrees with "high value data provide high value solutions" as an equation that many use.  The "front end" of that, how the data enter, is obviously key here.
>>> After we imported Ottawa a group of mappers decided that we needed more buildings.  They organised mapathons with new mappers and mapped buildings with iD.  The results were not good and the data quality side was raised in talk-ca.  I was involved in one where I set up new mappers with JOSM and the buildings_tools plugin and that went much better as far as accuracy was concerned.
>> Indeed, this is a typical "use case" in OSM:  a feedback loop says "not good results," so improvements to process hopefully assure the next iteration yield better data/results.  Congratulations on those successes, they are more of the good stuff of which OSM is made.  "The journey is the reward" is part of what's important in the process.  Although, good data as a result is important, too.
>>> The result of these mapathons and the community reaction was to convince Stats Canada that releasing more building outlines as was done in Ottawa under an Open Data license was a way forward.  Kingston in particular was keen to release its building outlines and get them into OpenStreetMap.  Obtaining them and making them available was a Stats Canada decision and was made in their time frame.
>> But was it made within OSM's OWN tenets and timeframes?  That's a crucial consideration I continue to feel receives short shrift (as you seem in the mood to "confess").
>>> Given that Stats Canada released the data under an acceptable Open Data license I thought and still think the best way forward was to set up a plan and a process to import the data.  The alternative was probably going to be Ad-Hoc importing.
>> I, too, think (and OSM knows) that the best way forward (with importable data) is to set up a plan with process.  I thought we did so with the BC2020 "reboot."  Yet, it isn't working, or is only partially working with limited success (I'll look at that portion in the glass that partially fills it rather than calling it empty when it isn't).  So, yet again, let's do a mid-course (or perhaps early-course) correction and right the ship.  Really, we seem to largely agree!
>>> I suspect that talk-ca is probably the most appropriate mailing list for this sort of discussion which is why I emailed Nate directly.
>> We can move this to talk-ca if you like, I'm OK with that.
>> Thanks for continuing good dialog,
>> SteveA
