[Talk-ca] Importing buildings in Canada

John Whelan jwhelan0112 at gmail.com
Sat Sep 28 18:04:25 UTC 2019


And I totally agree.  Because the Stat Can data has come from many 
sources, the data quality is variable, to put it politely.  The Microsoft 
data has been shown in the US to also be of variable quality.  I'm not 
so sure about the NR Can LiDAR data; hopefully it is at least consistent.

If we look at the history of the project then we can get an idea of how 
we came to be where we are.

First, I wanted to import all the bus stops in Ottawa, because only by 
importing could you ensure you had all the stops with their reference 
numbers, but the City of Ottawa Open Data license did not align with 
OSM.  I was also talking to Treasury Board (TB) and explained to them that 
their Open Data license version 1 didn't align with OSM, so we couldn't 
use their data.  Five years later TB released their Open Data license 
version 2, which they felt did align.

Stat Can has two types of project: pilot ones and ones that earn money. 
The original pilot was based on Ottawa and Gatineau and was for two 
years.  Their original plan was mapathons using iD.  I was impressed 
when a Stat Can employee managed to accurately map a building using iD 
during a presentation.  I hadn't thought it was possible after some of 
the efforts I'd seen in HOT mapping.  Fine, except that it requires a 
lot of mappers, and I think Fredrick has commented that this sort of 
mapping with new mappers needs a lot of clean-up effort, sometimes more 
than an experienced mapper would need to map it right in the first place. 
Montreal has identified that there just aren't enough experienced mappers 
available.

I worked at Stats Canada for a number of years.  The corporate culture 
is very different to OSM.  It makes its money by selling data.  Want to 
open a new coffee bar?  Stats Canada will combine its data to sell you 
the ideal spot based on residents' income etc.  I had a meeting with 
Stats Canada, the City of Ottawa planning department, an Open Data 
specialist from Carleton University, someone from Metrolink who had added 
data to OpenStreetMap to help people find the nearest bus stop, and a 
couple of HOT board members.  We convinced Stats Canada to change the 
direction of the pilot to use Open Data rather than go the mapathon 
route, partly for data quality reasons and partly because I didn't think 
we could find the mappers to map the buildings completely.  The Stat Can 
involvement meant the City of Ottawa was persuaded to change its Open 
Data license to the same as the TB one.  That took time and had to go to 
council for approval.  There was a lot of discussion with the local 
community and it was they who organised and did the import.   The local 
group worked nicely together and had a range of skill sets in the 
group.  I actually played more of a connecting role than anything else.

The import was challenged on the data license, amongst other things, but 
eventually the OSM legal working group was very kind and ruled the 
license was acceptable.  Stats is very interested in adding detail to 
buildings.  I was very interested that we could now import the bus stops.

I think you picked up on the fact that the buildings mapped in a 
mapathon were less than ideal.  I was involved in one in Ottawa and just 
taught the new mappers to use JOSM and the building_tool.  That produced 
more buildings per mapper hour and they were fairly accurate.  I must 
confess not every attached garage was mapped in detail.

I seem to recall Mapbox being involved in the mapathons in some way.

The Stats Can involvement meant we saw some interest from schools.  What 
I was interested in was added detail, so I mapped a couple of thousand 
buildings in Ontario using JOSM and the building_tool so details could 
be added easily.  We got two addresses added.  Apparently in Ontario the 
provincial government has purchased ESRI software for school children to 
learn about GIS.

At the end of the pilot the money had run out.  Stats covered some of 
the costs involved in the HOT summit that was held in Ottawa, and during 
that summit phase two was launched, but without any real funding.

What Stats could do, though, was release data from the municipalities 
under the government Open Data license, and that is what they did.  As 
Jarek has pointed out, following the import process is stressful, so I 
volunteered to do the paperwork and submit the plan.  There was some 
discussion on talk-ca and the idea surfaced to go with one plan rather 
than divide the country up.  So that's what I did.

Today we have three sources of data that could be imported, and I 
suspect the two that are not municipal data are more consistent.  We 
still have the original plan of mapathons with iD floating around.

My personal view is that the imported data quality is better than the 
mapathon approach, but to go forward from here I think it needs to be 
re-planned and new import plan(s) drawn up.

I don't think Stats have any real funding available at the moment.  They 
may find an odd hour in a quiet time, but it's coming up to March 31st and 
deadline time, so I don't expect any major resources to be made available 
from them, certainly not of the data clean-up variety.

Cheerio John

Pierre Béland wrote on 2019-09-28 12:20 PM:
> I understand that it's tomato season. But let's try to use them for 
> our preserves and not as an argument to convince other 
> contributors! ;)
>
> As others have said, it is up to those who propose imports to 
> document the process well, not the other way around. And threats to 
> act imperiously and disregard local communities obviously don't hold 
> water.
>
> To discuss data quality, we need to be able to examine the data 
> easily. And I don't think the data is comparable from one place to 
> another. Image quality and building density in urban areas are among 
> the factors.
>
> The files available from both StatCan and Microsoft are very large. 
> Just to analyze the data for our respective municipalities, we have 
> to process large files and try to extract the data. This is not 
> necessarily easy and will of course limit participation.
>
> To give examples of the limits of what AI techniques can observe in 
> imagery, I posted images in the following 2 tweets showing buildings 
> in downtown Toronto:
>
> https://twitter.com/pierzen/status/1177976517902684160
>
> https://twitter.com/pierzen/status/1177978125377884160
>
> It is clear that checking whether the angles are square is not 
> enough. These examples show how the traced outline can differ 
> significantly from reality on the ground. And just like humans, AI 
> techniques have difficulty identifying individual buildings.
>
> Regards
