[Talk-ca] Some feedback on import quality in Toronto

john whelan jwhelan0112 at gmail.com
Sat Feb 16 16:21:16 UTC 2019


When you look at importing Montreal you might like to look at the following
first.

https://wiki.osmfoundation.org/wiki/OGL_Canada_and_local_variants

Note that if the Montreal data is available through Stats Can and the federal
government open data license, it might be better to use that data source
from a licensing perspective.

Although data can be given to OpenStreetMap, I don't think there is a
foolproof method of recording the fact.  If one person has the paper record,
fine, but if they are no longer part of the community then there may be a
problem if the license is challenged.

Cheerio John

On Sun, 10 Feb 2019 at 00:04, Tim Elrick <osm at elrick.de> wrote:

> Hi all,
>
> After following the building import discussion for a while now, I wanted
> to chime in as well.
>
> After moving to Montréal from Germany recently, I got more engaged with
> the local mappers here in MTL (beforehand, I was more analysing OSM data
> scientifically).
>
> I took part in the initial meeting of the Building Canada 2020 initiative,
> in which great interest in the project was expressed by many institutions,
> organizations and businesses. However, apart from Statistics Canada,
> municipalities and OSMappers no one seemed to be willing to invest into the
> effort to support the initiative with manpower or funding (to my
> knowledge). Therefore, I found it quite impressive what StatCan has
> achieved with the Open Building Database and do not share the view of some
> on this list that the initiative got off on the wrong foot; but that is all
> water under the bridge now.
>
> So, yes, there seems to be some interest in using the data from the Open
> Building Database in OSM easily. However, I also doubt that one
> massive import can be the answer.
>
> I'm generally hesitant about imports as such, maybe because I was
> acculturated in OSM in Germany, where OSMappers value original entries much
> more than secondary data. Further, I'm skeptical that secondary data is
> necessarily better than original data (even from mapathons). I initiated two
> mapathons with university students in the context of Building Canada 2020.
> Both mapathons resulted in mostly nice buildings, I would say - and, when
> there is the odd not-so-nice building, there is still the validation step
> as we always used the tasking manager [1]. By the way, both mapathons used
> the ID editor; and, of course, you can square buildings in ID as well; so,
> I don't really understand the ID editor bashing that appears on this list
> here now and then. That said, of course, I prefer JOSM over ID as it is the
> more versatile tool, but to introduce interested persons to editing in OSM,
> ID is really nice.
>
> I'm even more skeptical about imports after Yaro pointed us to the Texas
> import [2]. I wonder why there was no outcry there (or maybe there was and
> I did not hear about it) - the imported data is terrible: buildings not
> parallel to the street, no right angles, sometimes not even the right size
> of building parts. The fact is that secondary building footprints can come
> from many different data sources - from AutoCAD drawings and outlines
> hand-drawn by municipal GIS experts to photogrammetric and satellite
> machine learning sources; all those sources have their peculiarities,
> which, I think, you cannot satisfy with one import plan that fits all -
> especially as the Open Building Database in Canada is stitched together
> from those very different sources.
>
> In Montreal, e.g., the source for the Open Building Database is the
> données ouvertes des batiments. This is photogrammetric imagery probably
> turned into AutoCAD files, which then were exported to a shapefile and
> geojson. The building outlines are impressively precise; however, the open
> data files contain building blocks, not single buildings [3], and offer the
> building dividers in a separate shapefile (I assume due to the export from
> AutoCAD, see second image in [3]). Unfortunately, the Open Building
> Database only included those building blocks in their data set, making it
> not very easy to import into OSM (as they do not include the building
> dividers). Hence, a bit of non-trivial pre-processing of the original
> données ouvertes des batiments would be necessary to import them into OSM
> (as the building divider file does also include roof extensions and roof
> shapes). The local OSM group has been discussing this pre-processing for a
> while now at their local meetings (we started discussing this even before
> the Building Canada 2020 initiative started). As the City of Montreal has
> granted OSM the explicit use of their open data file, the way forward, we
> think, is to pre-process the original files. Further, there is extensive
> overlap of existing buildings with the open data file. Therefore, the
> imports in Montreal would have to happen in very small batches to not
> destroy the work of other OSMappers.
>
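To make the overlap problem above a bit more concrete, here is a minimal sketch of a pre-filter that separates import candidates from footprints that would collide with buildings already in OSM. Everything in it is an assumption for illustration: the function names are hypothetical, footprints are plain lists of (x, y) pairs, and the test is a bounding-box check only, so it is deliberately conservative and will also flag neighbours that merely touch.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Ring = List[Point]
Box = Tuple[float, float, float, float]


def bbox(ring: Ring) -> Box:
    """Return (min_x, min_y, max_x, max_y) of a footprint."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def bboxes_overlap(a: Box, b: Box) -> bool:
    """True if the two boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def split_by_conflict(candidates: List[Ring], existing: List[Ring]):
    """Split candidates into (safe, conflicts).

    'conflicts' are footprints whose bounding box meets the bounding box
    of any existing building; those need a human decision, not an upload.
    """
    existing_boxes = [bbox(r) for r in existing]
    safe, conflicts = [], []
    for ring in candidates:
        box = bbox(ring)
        if any(bboxes_overlap(box, e) for e in existing_boxes):
            conflicts.append(ring)
        else:
            safe.append(ring)
    return safe, conflicts


# Hypothetical data: one building already mapped, two import candidates.
existing = [[(0, 0), (10, 0), (10, 10), (0, 10)]]
candidates = [
    [(5, 5), (15, 5), (15, 15), (5, 15)],      # collides with the mapped one
    [(20, 20), (30, 20), (30, 30), (20, 30)],  # free ground
]
safe, conflicts = split_by_conflict(candidates, existing)
```

For small batches the quadratic loop is fine; a city-wide run would probably want a spatial index, and the conflicting footprints would still have to be reviewed by hand.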
> I am also pretty skeptical about the simplification of the secondary data
> before importing that was suggested on the list here. As the data sources
> of the Open Building Database are very diverse, one simplification method
> cannot fit all data sources and can lead to harming the ground-truth
> principle. This even happened when Nate tried to simplify buildings by hand
> in Toronto [4], as pointed out by Yaro. There might be the odd case, where
> secondary data has too many nodes in a straight line, but, usually, I would
> assume, that most data sources stem from GIS experts or machine learning
> algorithms; neither would include more nodes than necessary for a building
> outline. And honestly, I don't buy the argument of 'too much data clutters
> our planet dump'. Storage space and processing power are no longer an issue,
> and I would like to see the world as precisely represented as possible in
> OSM; in many parts of the OSM world you now find single trees, mailboxes
> and lamp posts in OSM; isn't that great? As for buildings, I would like to
> see all the bay windows, nooks and crannies - even in Canada.
>
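For the "too many nodes in a straight line" case mentioned above, a cautious simplification could drop only nodes that sit practically on the segment between their neighbours and leave every real corner alone. The following is a rough sketch under stated assumptions, not the method of any existing tool: the function names and the 5 cm tolerance are made up for illustration, and the coordinates are assumed to be already projected to metres.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def dist_to_segment(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the segment a-b (not the infinite line)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp so spikes are not mistaken for collinear
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def drop_collinear_nodes(ring: List[Point], tolerance: float = 0.05) -> List[Point]:
    """Remove nodes lying within `tolerance` metres of the straight segment
    between the previously kept node and the next node.

    `ring` is an outline without the repeated closing node.
    """
    n = len(ring)
    if n < 4:
        return list(ring)
    kept: List[Point] = []
    for i, node in enumerate(ring):
        prev = kept[-1] if kept else ring[-1]
        nxt = ring[(i + 1) % n]
        if dist_to_segment(node, prev, nxt) > tolerance:
            kept.append(node)
    # Never return something that is no longer a polygon.
    return kept if len(kept) >= 3 else list(ring)


# A square with one redundant node in the middle of its bottom edge.
outline = [(0, 0), (5, 0), (10, 0), (10, 10), (0, 10)]
simplified = drop_collinear_nodes(outline)

# A 50 cm nook in the same edge is well above the tolerance and survives.
with_nook = [(0, 0), (5, 0.5), (10, 0), (10, 10), (0, 10)]
unchanged = drop_collinear_nodes(with_nook)
```

With a tolerance this small, bay windows, nooks and crannies stay in; only the nodes that add nothing to the shape go.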
> How to proceed? For Montréal: Once we have looked further into the
> challenges of pre-processing the Montreal open dataset, I guess we will
> propose a separate import plan. If anyone would like to join us in discussing the
> pre-processing, please contact me and we can continue on the Montréal OSM
> list. Oh, and by the way, while we all were discussing the import since
> December almost 3,000 buildings were mapped by hand in the Greater Montreal
> region [5].
>
> That all being said, I do not want to stop any of you from importing
> buildings. I just think that we have to do this more bit by bit to cater
> for all the peculiarities of the heterogeneous data sources of the Open
> Building Database.
>
> Happy mapping to everyone,
> Tim
>
> [1] see e.g. http://tasks.osmcanada.ca/project/91
> [2] https://www.openstreetmap.org/#map=19/32.97102/-96.78231
> [3] https://imgur.com/a/S8Nq5rg
> [4] https://i.imgur.com/H10360K.png
> [5] http://overpass-turbo.eu/s/FWH
>
> On 2019-02-03 18:35, Yaro Shkvorets wrote:
> Having reviewed the changeset, here are my 2 cents. OsmCha link for
> reference: https://osmcha.mapbox.com/changesets/66881357/
>
> 1) IMO squaring is not needed in most of those cases.
> - You can see the difference between square and non-square ONLY at high zoom
> level. And even then, it's not visible to the naked eye. We are talking
> about inches here.
> - Sometimes squaring is plain wrong here. Even though you
> paid very close attention, you managed to square a couple of non-square
> buildings. This facade, for example, is not supposed to be square:
> https://i.imgur.com/H10360K.png I might be OK with squaring almost-square
> angles if there is a simple plugin for that. The way you propose to do it,
> by going building-by-building and pressing Q, is completely unsustainable
> and sometimes makes things worse.
> - Another thing, this particular neighbourhood is pretty dense and mature
> and therefore has mostly square buildings. I can only imagine how bad it
> would become if you ask people to square things in newer developments where
> buildings often come in irregular shapes.
> - As mentioned above, many successful imports didn't require squaring. In
> this Texas one, 100% of buildings are not perfectly square:
> https://www.openstreetmap.org/#map=19/32.97102/-96.78231
>
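The idea of squaring only "almost-square" angles could be approximated with a simple filter that measures every corner and nominates a building for squaring only when all corners are already close to 90 degrees. This is a hypothetical sketch, not an existing JOSM plugin; the function names and the 3-degree threshold are assumptions, and it only selects candidates, it does not move any nodes.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def corner_angles(ring: List[Point]) -> List[float]:
    """Unsigned angle in degrees between the two edges meeting at each node.

    `ring` is an outline without the repeated closing node.
    """
    n = len(ring)
    angles = []
    for i in range(n):
        px, py = ring[i - 1]
        cx, cy = ring[i]
        nx, ny = ring[(i + 1) % n]
        ax, ay = px - cx, py - cy  # vector to the previous node
        bx, by = nx - cx, ny - cy  # vector to the next node
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angles.append(math.degrees(math.atan2(abs(cross), dot)))
    return angles


def squaring_candidate(ring: List[Point], max_dev: float = 3.0) -> bool:
    """True only if every corner is within `max_dev` degrees of 90."""
    return all(abs(a - 90.0) <= max_dev for a in corner_angles(ring))


# Slightly skewed rectangle: a reasonable candidate for squaring.
skewed = [(0, 0), (10, 0.1), (10, 8), (0, 8)]

# Building with a deliberately angled facade: should be left alone.
angled = [(0, 0), (10, 0), (12, 8), (0, 8)]

skewed_ok = squaring_candidate(skewed)
angled_ok = squaring_candidate(angled)
```

A filter like this would leave facades such as the one in the screenshot untouched, because a single corner well away from 90 degrees disqualifies the whole building.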
>
> 2) Simplification is good to have, sure. Obviously standard Shift-Y in
> JOSM is a non-starter. If we can find a good way to simplify ways without
> losing original geometry and causing overlapping issues we should do it.
> But even then, reducing 500MB province extract to 499MB should not be a
> hill to die on.
>
> 3) Manually mapping all the sheds and garages is completely unsustainable.
> Having seen over the last couple of years how much real interest there is
> in doing actual work importing buildings in Canada (almost zero), adding
> this requirement will undoubtedly kill the project. Sure, you will
> meticulously map your own neighbourhood, but who will map thousands of
> other places with the same attention to detail? Also, you did a rather poor
> job of classifying the buildings you added, tagging them all with building=yes.
> Properly classifying secondary buildings like sheds and garages in a
> project like this is pretty important IMO. I agree with John, we should
> leave sheds to local mappers to trace manually.
>
> To sum up, yes, we can do better. But this is a perfect example of when
> "better" is the enemy of "good".
>
> On Sun, Feb 3, 2019 at 12:34 PM Nate Wessel <bike756 at gmail.com> wrote:
>
>> Hi all,
>>
>> I had a chance this morning to work on cleaning up some of the
>> already-imported data in Toronto. I wanted to be a little methodical about
>> this, so I picked a single typical block near where I live. All the
>> building data on this block came from the import and I did everything in
>> one changeset: https://www.openstreetmap.org/changeset/66881357
>>
>> What I found was that:
>>
>> 1) Every single building needed squaring
>>
>> 2) Most buildings needed at least some simplification.
>>
>> 3) 42 buildings were missing.
>>
>> I knew going in that the first two would be an issue, but what really
>> surprised me was just how many sheds had not been imported. There are only
>> 53 houses on the block, but 42 sheds/garages/outbuildings, some of them
>> quite large, and none of which had been mapped.
>>
>> I haven't seen the quality of the outbuildings in the source data, and
>> maybe I would change my mind if I did, but I think if we're going to do
>> this import properly, we're going to have to bring in the other half of the
>> data. I had seen in the original import instructions that small buildings
>> were being excluded - was there a reason for this?
>>
>> I also want to say: given how long it took me to clean up and properly
>> remap this one block, I'll say again that the size of the import tasks is
>> way, way, way too large. There is absolutely no way that someone could have
>> carefully looked at and verified this data as it was going in. I just spent
>> a half hour fixing up probably about one-hundredth of a task square.
>>
>> We can do better than this!
>> --
>> Nate Wessel
>> Jack of all trades, Master of Geography, PhD candidate in Urban Planning
>> NateWessel.com <http://natewessel.com>
>>
>> _______________________________________________
>> Talk-ca mailing list
>> Talk-ca at openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/talk-ca
>>
>
>
> --
> Best Regards,
>           Yaro Shkvorets