[Talk-ca] Some feedback on import quality in Toronto
Tim Elrick
osm at elrick.de
Sun Feb 10 05:02:15 UTC 2019
Hi all,
After following the building import discussion for a while now, I wanted
to chime in as well.
After moving to Montréal from Germany recently, I got more engaged with
the local mappers here in MTL (before that, I mostly analysed OSM data
scientifically).
I took part in the initial meeting of the Building Canada 2020
initiative, in which great interest in the project was expressed by many
institutions, organizations and businesses. However, apart from
Statistics Canada, municipalities and OSMappers no one seemed to be
willing to invest into the effort to support the initiative with
manpower or funding (to my knowledge). Therefore, I found it quite
impressive what StatCan has achieved with the Open Building Database and
do not share the view of some on this list that the initiative got off
on the wrong foot; but that is all water under the bridge now.
So, yes, there seems to be some interest in using the data from the Open
Building Database in OSM easily. However, I also doubt that one
massive import can be the answer.
I'm generally hesitant with imports as such, maybe because I was
acculturated in OSM in Germany where OSMappers value original entries
much more than secondary data. Further, I'm skeptical that secondary
data is necessarily better than original data (even from mapathons). I
initiated two mapathons with university students in the context of
Building Canada 2020. Both mapathons resulted in mostly nice buildings,
I would say - and, when there is the odd not-so-nice building, there is
still the validation step as we always used the tasking manager [1]. By
the way, both mapathons used the ID editor; and, of course, you can
square buildings in ID as well; so, I don't really understand the ID
editor bashing that appears on this list here now and then. That said,
of course, I prefer JOSM over ID as it is the more versatile tool, but
to introduce interested persons to editing in OSM, ID is really nice.
I'm even more skeptical about imports after Yaro pointed us to the Texas
import [2]. I wonder why there was no outcry there (or maybe there was
and I did not hear about it) - the imported data is terrible: buildings
not parallel to the street, no right angles, sometimes not even the
right size of building parts. The fact is that secondary building
footprint data can come from many different sources - from AutoCAD
drawings and footprints hand-drawn by municipal GIS experts to
photogrammetric and satellite machine learning sources; all those sources
have their peculiarities, which, I think, you cannot cater for with a
one-size-fits-all import plan - especially as the Open Building Database
in Canada is stitched together from those very different sources.
In Montreal, e.g., the source for the Open Building Database is the
données ouvertes des batiments. This is photogrammetric imagery probably
turned into AutoCAD files, which then were exported to a shapefile and
geojson. The building outlines are impressively precise; however, the
open data files contain building blocks rather than single buildings [3].
The building dividers are offered in a separate shapefile (I assume due
to the export from AutoCAD, see second image in [3]). Unfortunately, the
Open Building Database only included the building blocks in its data
set, not the building dividers, which makes it hard to import into OSM.
Hence, a bit of non-trivial pre-processing of
the original données ouvertes des batiments would be necessary to import
them into OSM (as the building divider file does also include roof
extensions and roof shapes). The local OSM group has been discussing this
pre-processing for a while now at its local meetings (we started
discussing this even before the Building Canada 2020 initiative
started). As the City of Montreal has granted OSM the explicit use of
their open data file, the way forward, we think, is to pre-process the
original files. Further, there is extensive overlap of existing
buildings with the open data file. Therefore, the imports in Montreal
would have to happen in very small batches to not destroy the work of
other OSMappers.
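
To illustrate the overlap concern, here is a minimal sketch of how one small batch of pre-processed footprints could be split into those that collide with already-mapped buildings and those that do not. It only compares axis-aligned bounding boxes as a cheap first pass. The function names and the data layout (a footprint as a list of (x, y) tuples) are hypothetical, and a real import would still need a proper polygon intersection test and manual review of every conflict.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]                 # (x, y) or (lon, lat)
BBox = Tuple[float, float, float, float]    # (min_x, min_y, max_x, max_y)


def bbox(ring: Sequence[Point]) -> BBox:
    """Axis-aligned bounding box of one building footprint."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def bboxes_overlap(a: BBox, b: BBox) -> bool:
    """True if the two boxes share any area or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def split_batch(
    incoming: Sequence[Sequence[Point]],
    existing: Sequence[Sequence[Point]],
) -> Tuple[List[Sequence[Point]], List[Sequence[Point]]]:
    """Split incoming footprints into (safe, conflicts).

    'conflicts' are footprints whose bounding box hits the bounding box
    of an existing building; they should be reviewed by hand instead of
    being uploaded over someone else's work.
    """
    existing_boxes = [bbox(ring) for ring in existing]
    safe, conflicts = [], []
    for ring in incoming:
        box = bbox(ring)
        if any(bboxes_overlap(box, other) for other in existing_boxes):
            conflicts.append(ring)
        else:
            safe.append(ring)
    return safe, conflicts
```

The bounding-box test is deliberately pessimistic: it may flag neighbours that merely sit close together, which errs on the side of protecting existing work.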
I am also pretty skeptical about the simplification of the secondary
data before importing that was suggested on the list here. As the data
sources of the Open Building Database are very diverse, one
simplification method cannot fit all data sources and can lead to
harming the ground-truth principle. This even happened when Nate tried
to simplify buildings by hand in Toronto [4], as pointed out by Yaro.
There might be the odd case where secondary data has too many nodes in
a straight line, but usually, I would assume, most data sources
stem from GIS experts or machine learning algorithms; neither would
include more nodes than necessary for a building outline. And honestly,
I don't buy the argument of 'too much data clutters our planet dump'.
Storage space and processing power are no longer an issue, and I would
like to see the world as precisely represented as possible in OSM; in
many parts of the OSM world you now find single trees, mailboxes and
lamp posts in OSM; isn't that great? As for buildings, I would like to
see all the bay windows, nooks and crannies - even in Canada.
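
As an illustration of the "too many nodes in a straight line" case, here is a minimal sketch that drops only nodes lying (almost) exactly on the segment between their neighbours, so that bay windows and other real corners stay untouched. The tolerance of 5 cm and the function names are assumptions for illustration, coordinates are assumed to be in a projected system with metres as units, and this is not a description of what JOSM's Shift-Y does.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Projection of p onto the line, clamped to the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def drop_collinear_nodes(ring: Sequence[Point], tolerance: float = 0.05) -> List[Point]:
    """Remove nodes that add nothing to the outline.

    'ring' is a closed outline given WITHOUT repeating the first node at
    the end. A node is dropped only if it lies within 'tolerance' of the
    segment joining the previously kept node and the next node.
    """
    n = len(ring)
    if n <= 3:
        return list(ring)
    kept: List[Point] = []
    for i, node in enumerate(ring):
        prev_node = kept[-1] if kept else ring[-1]
        if i + 1 < n:
            next_node = ring[i + 1]
        else:
            next_node = kept[0] if kept else ring[0]
        if distance_to_segment(node, prev_node, next_node) > tolerance:
            kept.append(node)
    # Never return a degenerate outline; fall back to the original.
    return kept if len(kept) >= 3 else list(ring)
```

With a tolerance this small, a 50 cm bay window survives while a redundant node in the middle of a straight wall is removed.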
How to proceed? For Montréal: Once we have looked more into the challenges
of pre-processing the Montreal open dataset, I guess we will propose a
separate import plan. If anyone would like to join us in discussing the
pre-processing, please contact me and we can continue on the Montréal
OSM list. Oh, and by the way, while we have all been discussing the import
since December, almost 3,000 buildings were mapped by hand in the Greater
Montreal region [5].
All that being said, I do not want to stop any of you from importing
buildings. I just think that we have to do this more bit by bit to
cater for all the peculiarities of the heterogeneous data sources of the
Open Building Database.
Happy mapping to everyone,
Tim
[1] see e.g. http://tasks.osmcanada.ca/project/91
[2] https://www.openstreetmap.org/#map=19/32.97102/-96.78231
[3] https://imgur.com/a/S8Nq5rg
[4] https://i.imgur.com/H10360K.png
[5] http://overpass-turbo.eu/s/FWH
On 2019-02-03 18:35, Yaro Shkvorets wrote:
Having reviewed the changeset, here are my 2 cents. OsmCha link for
reference: https://osmcha.mapbox.com/changesets/66881357/
1) IMO squaring is not needed in most of those cases.
- You can see the difference between square and non-square ONLY at high zoom
level. And even then, it's not visible to the naked eye. We are talking
about inches here.
- Sometimes squaring is plain wrong to be applied here. Even though you
paid very close attention you managed to square a couple of non-square
buildings. Like this facade is not supposed to be square for example:
https://i.imgur.com/H10360K.png I might be OK with squaring
almost-square angles if there is a simple plugin for that. The way you
propose to do it, by going building-by-building and pressing Q is
completely unsustainable and sometimes makes things bad.
- Another thing, this particular neighbourhood is pretty dense and
mature and therefore has mostly square buildings. I can only imagine how
bad it would become if you ask people to square things in newer
developments where buildings often come in irregular shapes.
- As mentioned above, many successful imports didn't require squaring.
In this Texas one, 100% of buildings are not perfectly square:
https://www.openstreetmap.org/#map=19/32.97102/-96.78231
2) Simplification is good to have, sure. Obviously standard Shift-Y in
JOSM is a non-starter. If we can find a good way to simplify ways without
losing original geometry and causing overlapping issues we should do it.
But even then, reducing 500MB province extract to 499MB should not be a
hill to die on.
3) Manually mapping all the sheds and garages is completely
unsustainable. Having seen over the last couple of years how much real
interest there is in doing actual work importing buildings in Canada
(almost zero) adding this requirement will undoubtedly kill the project.
Sure you will meticulously map your own neighbourhood, but who will map
thousands of other places with the same attention to detail? Also, you
did a rather poor job of classifying the buildings you added, tagging them all
with building=yes. Properly classifying secondary buildings like sheds
and garages in a project like this is pretty important IMO. I agree with
John, we should leave sheds to local mappers to trace manually.
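
The idea in point 1 above, squaring only "almost-square" angles, could be sketched as a check that reports how far each corner deviates from a right angle, so that corners off by a fraction of a degree can be told apart from deliberately skewed facades. The 2-degree threshold, the labels and the function names are hypothetical, coordinates are assumed to be planar (projected), and this does not describe how the Q shortcut in JOSM works internally.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def corner_deviations(ring: Sequence[Point]) -> List[float]:
    """Absolute deviation from 90 degrees at every corner of the outline.

    'ring' is a closed outline given without repeating the first node.
    The angle is unsigned, so inner corners of an L-shaped building
    count as right angles too.
    """
    n = len(ring)
    deviations = []
    for i in range(n):
        ax, ay = ring[i - 1]
        bx, by = ring[i]
        cx, cy = ring[(i + 1) % n]
        v1x, v1y = ax - bx, ay - by
        v2x, v2y = cx - bx, cy - by
        dot = v1x * v2x + v1y * v2y
        cross = v1x * v2y - v1y * v2x
        angle = math.degrees(math.atan2(abs(cross), dot))  # 0..180
        deviations.append(abs(angle - 90.0))
    return deviations


def classify(ring: Sequence[Point], threshold_deg: float = 2.0) -> str:
    """Label an outline 'square', 'almost-square' or 'irregular'."""
    worst = max(corner_deviations(ring))
    if worst <= 1e-9:
        return "square"
    if worst <= threshold_deg:
        return "almost-square"   # candidate for automatic squaring
    return "irregular"           # leave alone, probably intentional
```

Only outlines labelled "almost-square" would be candidates for automatic squaring; anything labelled "irregular" would be left exactly as it is in the source data.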
To sum up, yes we can do better. But this is the perfect example when
"better" is the enemy of "good".
On Sun, Feb 3, 2019 at 12:34 PM Nate Wessel <bike756 at gmail.com> wrote:
Hi all,
I had a chance this morning to work on cleaning up some of the
already-imported data in Toronto. I wanted to be a little methodical
about this, so I picked a single typical block near where I live.
All the building data on this block came from the import and I did
everything in one changeset:
https://www.openstreetmap.org/changeset/66881357
What I found was that:
1) Every single building needed squaring
2) Most buildings needed at least some simplification.
3) 42 buildings were missing.
I knew going in that the first two would be an issue, but what
really surprised me was just how many sheds had not been imported.
There are only 53 houses on the block, but 42
sheds/garages/outbuildings, some of them quite large, and none of
which had been mapped.
I haven't seen the quality of the outbuildings in the source data,
and maybe I would change my mind if I did, but I think if we're
going to do this import properly, we're going to have to bring in
the other half of the data. I had seen in the original import
instructions that small buildings were being excluded - was there a
reason for this?
I also want to say: given how long it took me to clean up and
properly remap this one block, I'll say again that the size of the
import tasks is way, way, way too large. There is absolutely no way
that someone could have carefully looked at and verified this data
as it was going in. I just spent a half hour fixing up probably
about one-hundredth of a task square.
We can do better than this!
--
Nate Wessel
Jack of all trades, Master of Geography, PhD candidate in Urban Planning
NateWessel.com <http://natewessel.com>
_______________________________________________
Talk-ca mailing list
Talk-ca at openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk-ca
--
Best Regards,
Yaro Shkvorets