[Talk-ca] Some feedback on import quality in Toronto
Tim Elrick
osm at elrick.de
Sat Feb 16 17:22:02 UTC 2019
Hi John,
Thanks for pointing me to the license website. The open data of the City
of Montreal is licensed CC-BY 4.0 and the City has explicitly granted
OSM the right to use the data on top of that. See:
http://donnees.ville.montreal.qc.ca/portail/licence/
StatsCan's Open Building Database uses exactly the same data source,
however, as I pointed out in my last e-mail, it did not split the
building blocks into actual buildings. The open data of the City of
Montreal, furthermore, includes building heights which are lost in the
OBD. These are the reasons why we would like to import the original open
data.
Cheers,
Tim
On 2019-02-16 11:21, john whelan wrote:
When you look at importing Montreal you might like to look at the
following first.
https://wiki.osmfoundation.org/wiki/OGL_Canada_and_local_variants
Note if the Montreal data is available through Stats Can and the federal
government open data license it might be better to use that data source
from a licensing perspective.
Although data can be given to OpenStreetMap I don't think there is a
foolproof method of recording the fact. If one person has the paper
record, fine, but if they are no longer part of the community then there
may be a problem if the license is challenged.
Cheerio John
On Sun, 10 Feb 2019 at 00:04, Tim Elrick <osm at elrick.de> wrote:
Hi all,
After following the building import discussion for a while now, I
wanted to chime in as well.
After moving to Montréal from Germany recently, I got more engaged
with the local mappers here in MTL (beforehand, I was more analysing
OSM data scientifically).
I took part in the initial meeting of the Building Canada 2020
initiative, in which great interest in the project was expressed by
many institutions, organizations and businesses. However, apart from
Statistics Canada, municipalities and OSMappers no one seemed to be
willing to invest into the effort to support the initiative with
manpower or funding (to my knowledge). Therefore, I found it quite
impressive what StatCan has achieved with the Open Building Database
and do not share the view of some on this list that the initiative
got off on the wrong foot; but that is all water under the bridge now.
So, yes, there seems to be some interest in making the data from the
Open Building Database easy to use in OSM. However, I also doubt
that one massive import can be the answer.
I'm generally hesitant with imports as such, maybe because I was
acculturated in OSM in Germany where OSMappers value original
entries much more than secondary data. Further, I'm skeptical that
secondary data is necessarily better than original data (even from
mapathons). I initiated two mapathons with university students in
the context of Building Canada 2020. Both mapathons resulted in
mostly nice buildings, I would say - and, when there is the odd
not-so-nice building, there is still the validation step as we
always used the tasking manager [1]. By the way, both mapathons used
the ID editor; and, of course, you can square buildings in ID as
well; so, I don't really understand the ID editor bashing that
appears on this list here now and then. That said, of course, I
prefer JOSM over ID as it is the more versatile tool, but to
introduce interested persons to editing in OSM, ID is really nice.
I'm even more skeptical about imports after Yaro pointed us to the
Texas import [2]. I wonder why there was no outcry there (or maybe
there was and I did not hear about it) - the imported data is
terrible: buildings not parallel to the street, no right angles,
sometimes not even the right size of building parts. The fact is that
secondary-data building footprints can come from many different data
sources - from AutoCAD, hand-drawn by municipal GIS experts, to
photogrammetric and satellite machine learning sources; all those
sources have their peculiarities, which, I think, you cannot accommodate
in a one-size-fits-all import plan - especially as the Open Building
Database in Canada is stitched together from those very different
sources.
In Montreal, e.g., the source for the Open Building Database is the
données ouvertes des batiments. This is photogrammetric imagery
probably turned into AutoCAD files, which then were exported to a
shapefile and GeoJSON. The building outlines are impressively
precise; however, the open data files contain building blocks, not
single buildings [3]. They do offer building dividers in a separate
shapefile (I assume due to the export from AutoCAD, see second image
in [3]). Unfortunately, the Open Building Database included only
the building blocks in its data set and not the building dividers,
which makes it hard to import into OSM.
Hence, a bit of non-trivial pre-processing of the original données
ouvertes des batiments would be necessary to import them into OSM
(as the building divider file also includes roof extensions and
roof shapes). The local OSM group has been discussing this pre-processing
for a while now at its local meetings (we started discussing this
even before the Building Canada 2020 initiative started). As the
City of Montreal has granted OSM the explicit use of their open data
file, the way forward, we think, is to pre-process the original
files. Further, there is extensive overlap of existing buildings
with the open data file. Therefore, the imports in Montreal would
have to happen in very small batches to not destroy the work of
other OSMappers.
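As a rough illustration of the small-batch idea (not part of any agreed
import plan), the Python sketch below separates candidate footprints into
ones that could go into a batch and ones that touch existing OSM buildings
and would need manual review. The function names, the bounding-box test
and the coordinate tuples are all assumptions made for this example; a
real workflow would use proper polygon intersection in JOSM or a GIS
library.

```python
def bbox(ring):
    """Bounding box (min_x, min_y, max_x, max_y) of a ring of (x, y) tuples."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def bboxes_overlap(a, b):
    """True if two bounding boxes share some interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def split_by_conflict(candidates, existing):
    """Split candidate footprints into (clean, conflicting) lists.

    A candidate is 'conflicting' when its bounding box overlaps the
    bounding box of any existing building. This is a coarse stand-in
    for a real geometric intersection test (assumption of this sketch).
    """
    existing_boxes = [bbox(ring) for ring in existing]
    clean, conflicting = [], []
    for ring in candidates:
        box = bbox(ring)
        if any(bboxes_overlap(box, other) for other in existing_boxes):
            conflicting.append(ring)  # leave for a human to merge or skip
        else:
            clean.append(ring)        # no existing work nearby
    return clean, conflicting
```

Only the "clean" list would be a candidate for upload in a batch; anything
in "conflicting" stays with the mapper who knows the existing work.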
I am also pretty skeptical about the simplification of the secondary
data before importing that was suggested on the list here. As the
data sources of the Open Building Database are very diverse, one
simplification method cannot fit all data sources and can lead to
harming the ground-truth principle. This even happened when Nate
tried to simplify buildings by hand in Toronto [4], as pointed out
by Yaro. There might be the odd case, where secondary data has too
many nodes in a straight line, but, usually, I would assume, that
most data sources stem from GIS experts or machine learning
algorithms; neither would include more nodes than necessary for a
building outline. And honestly, I don't buy the argument of 'too
much data clutters our planet dump'. Storage space and processing
power are no longer an issue, and I would like to see the world as
precisely represented as possible in OSM; in many parts of the OSM
world you now find single trees, mailboxes and lamp posts in OSM;
isn't that great? As for buildings, I would like to see all the bay
windows, nooks and crannies - even in Canada.
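To make the "too many nodes in a straight line" case concrete, here is a
minimal Python sketch, assuming footprints are closed rings of (x, y)
tuples in a projected, metre-based coordinate system. The function name
and the 5 cm tolerance are hypothetical. It only drops nodes that sit on
the straight line between their two neighbours, so bay windows and nooks
survive untouched.

```python
from math import hypot


def drop_collinear_nodes(ring, tolerance=0.05):
    """Remove nodes lying within `tolerance` of the segment joining
    their neighbours. `ring` is a closed list of (x, y) tuples whose
    first and last entries are identical; a new closed ring is returned.
    """
    pts = list(ring[:-1])  # work on the open ring
    changed = True
    while changed and len(pts) > 3:
        changed = False
        for i in range(len(pts)):
            ax, ay = pts[i - 1]
            bx, by = pts[i]
            cx, cy = pts[(i + 1) % len(pts)]
            base = hypot(cx - ax, cy - ay)
            if base == 0:
                continue  # neighbours coincide; leave for manual review
            # Perpendicular distance of b from the line through a and c.
            dist = abs((cx - ax) * (by - ay) - (cy - ay) * (bx - ax)) / base
            # Position of b along a->c; it must lie between a and c.
            t = ((bx - ax) * (cx - ax) + (by - ay) * (cy - ay)) / base ** 2
            if dist <= tolerance and 0.0 <= t <= 1.0:
                del pts[i]
                changed = True
                break
    return pts + [pts[0]]
```

A rectangle with an extra node in the middle of a wall loses that node,
while a footprint with a 50 cm bay window keeps every corner.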
How to proceed? For Montréal: once we have looked further into the
challenges of pre-processing the Montreal open dataset, I guess we
will propose a separate import plan. If anyone would like to join us
in discussing the pre-processing, please contact me and we can
continue on the Montréal OSM list. Oh, and by the way, while we all
were discussing the import since December almost 3,000 buildings
were mapped by hand in the Greater Montreal region [5].
That all being said, I do not want to stop any of you from
importing buildings. I just think that we have to do this bit
by bit to cater for all the peculiarities of the heterogeneous data
sources of the Open Building Database.
Happy mapping to everyone,
Tim
[1] see e.g. http://tasks.osmcanada.ca/project/91
[2] https://www.openstreetmap.org/#map=19/32.97102/-96.78231
[3] https://imgur.com/a/S8Nq5rg
[4] https://i.imgur.com/H10360K.png
[5] http://overpass-turbo.eu/s/FWH
On 2019-02-03 18:35, Yaro Shkvorets wrote:
Having reviewed the changeset, here are my 2 cents. OsmCha link for
reference: https://osmcha.mapbox.com/changesets/66881357/
1) IMO squaring is not needed in most of those cases.
- You can see difference between square and non-square ONLY at high
zoom level. And even then, it's not visible to the naked eye. We are
talking about inches here.
- Sometimes applying squaring here is plain wrong. Even though
you paid very close attention you managed to square a couple of
non-square buildings. This facade, for example, is not supposed to be
square: https://i.imgur.com/H10360K.png I might be OK with
squaring almost-square angles if there is a simple plugin for that.
The way you propose to do it, by going building-by-building and
pressing Q, is completely unsustainable and sometimes makes things worse.
- Another thing, this particular neighbourhood is pretty dense and
mature and therefore has mostly square buildings. I can only imagine
how bad it would become if you ask people to square things in newer
developments where buildings often come in irregular shapes.
- As mentioned above, many successful imports didn't require
squaring. In this Texas one, 100% of buildings are not perfectly
square: https://www.openstreetmap.org/#map=19/32.97102/-96.78231
2) Simplification is good to have, sure. Obviously standard Shift-Y
in JOSM is a non-starter. If we can find a good way to simplify ways
without losing original geometry and causing overlapping issues we
should do it. But even then, reducing 500MB province extract to
499MB should not be a hill to die on.
3) Manually mapping all the sheds and garages is completely
unsustainable. Having seen over the last couple of years how much
real interest there is in doing actual work importing buildings in
Canada (almost zero) adding this requirement will undoubtedly kill
the project. Sure you will meticulously map your own neighbourhood,
but who will map thousands of other places with the same attention
to details? Also, you did a rather poor job of classifying the buildings
you added, tagging them all with building=yes. Properly classifying
secondary buildings like sheds and garages in a project like this is
pretty important IMO. I agree with John, we should leave sheds to
local mappers to trace manually.
To sum up, yes we can do better. But this is the perfect example
when "better" is the enemy of "good".
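For the "almost-square angles" idea in point 1, a check along these lines
might be a starting point. This is a minimal Python sketch and not an
existing JOSM plugin: the function names and the 3-degree threshold are
hypothetical, and it assumes closed rings of (x, y) tuples in a projected
coordinate system (angles measured on raw lat/lon would be distorted).
It flags a building as a squaring candidate only if every corner is
already within the threshold of a multiple of 90 degrees, so a
deliberately angled facade is left alone.

```python
from math import atan2, degrees


def corner_deviations(ring):
    """For each corner, return how many degrees its turn angle is away
    from the nearest multiple of 90 degrees."""
    pts = ring[:-1]  # drop the repeated closing node
    n = len(pts)
    deviations = []
    for i in range(n):
        ax, ay = pts[i - 1]
        bx, by = pts[i]
        cx, cy = pts[(i + 1) % n]
        heading_in = atan2(by - ay, bx - ax)
        heading_out = atan2(cy - by, cx - bx)
        turn = degrees(heading_out - heading_in) % 90.0
        deviations.append(min(turn, 90.0 - turn))
    return deviations


def squaring_candidate(ring, max_fix_deg=3.0):
    """True if every corner is within `max_fix_deg` of square, i.e. the
    outline looks like a slightly sloppy rectangle-based shape."""
    return all(d <= max_fix_deg for d in corner_deviations(ring))
```

A footprint that fails the check would simply be skipped rather than
squared, which avoids the non-square facade problem shown in the image
above.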
On Sun, Feb 3, 2019 at 12:34 PM Nate Wessel <bike756 at gmail.com> wrote:
Hi all,
I had a chance this morning to work on cleaning up some of the
already-imported data in Toronto. I wanted to be a little
methodical about this, so I picked a single typical block near
where I live. All the building data on this block came from the
import and I did everything in one changeset:
https://www.openstreetmap.org/changeset/66881357
What I found was that:
1) Every single building needed squaring
2) Most buildings needed at least some simplification.
3) 42 buildings were missing.
I knew going in that the first two would be an issue, but what
really surprised me was just how many sheds had not been
imported. There are only 53 houses on the block, but 42
sheds/garages/outbuildings, some of them quite large, and none
of which had been mapped.
I haven't seen the quality of the outbuildings in the source
data, and maybe I would change my mind if I did, but I think if
we're going to do this import properly, we're going to have to
bring in the other half of the data. I had seen in the original
import instructions that small buildings were being excluded -
was there a reason for this?
I also want to say: given how long it took me to clean up and
properly remap this one block, I'll say again that the size of
the import tasks is way, way, way too large. There is absolutely
no way that someone could have carefully looked at and verified
this data as it was going in. I just spent a half hour fixing up
probably about one-hundredth of a task square.
We can do better than this!
--
Nate Wessel
Jack of all trades, Master of Geography, PhD candidate in Urban
Planning
NateWessel.com <http://natewessel.com>
_______________________________________________
Talk-ca mailing list
Talk-ca at openstreetmap.org <mailto:Talk-ca at openstreetmap.org>
https://lists.openstreetmap.org/listinfo/talk-ca
--
Best Regards,
Yaro Shkvorets