[Imports] Bing Building Import
jubalh at microsoft.com
Tue Jul 3 17:42:45 UTC 2018
The data are free of 3rd party rights. We are working through the internal process of releasing the labeled training data & the models. The original hand verified & digitized training data was released in April 2017 and is available here (https://wiki.openstreetmap.org/wiki/Microsoft_Building_Footprint_Data).
From: Christoph Hormann <osm at imagico.de>
Sent: Tuesday, July 3, 2018 2:38 AM
To: imports at openstreetmap.org
Subject: Re: [Imports] Bing Building Import
On Monday 02 July 2018, Greg Morgan wrote:
> I have started work on the Bing building import for Arizona us. I
> have started this page here
> 0 for the import. This wiki page may be used by other mappers in
> different states.
First, thanks for bringing this up early in the process. Although it is obviously too early for an import review, it is good to have a broad discussion early.
A few points i would like to comment on:
* legal aspects: Microsoft released the data under the ODbL but does not specify what data sources go into producing it (in particular training data!) and does not make any claims that the data is free of third party rights. I would not be fine with importing data of unknown provenance and without a meaningful guarantee that it is free of third party rights.
* quality aspects: In contrast to almost all other data sets, where there is some quantitative specification of quality (either explicitly or implicitly due to the purpose the data set is created for), there is no indication of quality in what Microsoft has released beyond the vague and meaningless 'awesome quality' claims. IMO this means that a proper import review would only be possible based on a thorough analysis of the quality of Microsoft's product that holds up to scientific scrutiny.
Regarding quality in general - you should not make the mistake of trying to assess quality by picking a few places and manually reviewing the data based on gut feeling - possibly with the same imagery used as reference as Microsoft used in data set generation. What i called "analysis of the quality that holds up to scientific scrutiny" means picking a sufficiently large number of sample locations representative of the diverse geography of the US and doing a quantitative analysis based on reference data of known and high quality.
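To make that kind of analysis a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not an established review procedure: the stratum labels, the function names and the per-location counts (`tp`, `fp`, `fn`) are all hypothetical, and it assumes that each sample location has already been compared against trusted reference data to count correctly detected buildings, false detections and missed buildings. How that matching is done is left out here.

```python
import random
from collections import defaultdict


def stratified_sample(locations, per_stratum, seed=0):
    """Draw up to `per_stratum` locations from each geographic stratum.

    `locations` is a list of dicts with a "stratum" key (hypothetical
    labels such as "urban" or "desert"). A fixed seed keeps the draw
    reproducible so that others can repeat the analysis.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for loc in locations:
        by_stratum[loc["stratum"]].append(loc)
    sample = []
    for stratum in sorted(by_stratum):
        group = by_stratum[stratum]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


def precision_recall(true_pos, false_pos, false_neg):
    """Precision: share of detected buildings that are real.
    Recall: share of real buildings that were detected."""
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


def summarize_by_stratum(sample):
    """Aggregate counts per stratum and return {stratum: (precision, recall)}."""
    totals = defaultdict(lambda: [0, 0, 0])
    for loc in sample:
        counts = totals[loc["stratum"]]
        counts[0] += loc["tp"]  # buildings matched in the reference data
        counts[1] += loc["fp"]  # detections with no reference building
        counts[2] += loc["fn"]  # reference buildings that were missed
    return {stratum: precision_recall(*c) for stratum, c in totals.items()}
```

Reporting the figures per stratum rather than as one overall number is the point of the exercise: it shows how the error rates vary between different geographic settings instead of averaging that variation away.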
Microsoft's process documentation contains a number of hints that indicate things can go wrong in the process in ways that are likely to produce significant errors of kinds that are very unlikely to happen in manual mapping. Without having reliable data on how often these things do happen (and how this varies between different geographic settings) you would essentially be doing a blind import.