[Imports] Questions about DC import plan

Wed Aug 6 20:44:32 UTC 2014

Posting this to the OSM Imports and the MappingDC lists:

I've been reviewing the work by OCTO on DC building/address imports
currently being conducted via the tasking manager[1] and I've noticed a few
things that raised some questions about this plan.  Excuse me if I don't
understand this or if there are obvious answers to these questions.  I'm
new here.

From my understanding, the current plan is to import all DC buildings and
addresses in areas that were not imported in 2008-10.  Tags were converted,
building shapes simplified, and then the tags from MAR points[2] were
joined to building footprints[3] when a building was found to contain only
one address point, and the point was deleted.  The data was then split into
chunks by block group and converted to .osm files.  The scripts are in the
dcbuildings repo[4].
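
To make sure I have the join step right, here is a rough Python sketch of how
I understand it.  This is not the code from the dcbuildings repo: the data
layout (plain dicts with a "ring" of x/y vertices) and the function names are
made up for illustration, and the point-in-polygon test is a simple ray cast
that ignores holes and points exactly on an edge.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test. ring is a list of (x, y) vertices, not closed."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def join_addresses(buildings, addresses):
    """Merge address tags into buildings that contain exactly one point.

    buildings: list of {"ring": [(x, y), ...], "tags": {...}}
    addresses: list of {"xy": (x, y), "tags": {...}}
    Returns (new_buildings, leftover_addresses).
    """
    # Record which address points fall inside each building.
    hits = {i: [] for i in range(len(buildings))}
    for a_idx, addr in enumerate(addresses):
        x, y = addr["xy"]
        for b_idx, building in enumerate(buildings):
            if point_in_polygon(x, y, building["ring"]):
                hits[b_idx].append(a_idx)
                break  # assume footprints do not overlap

    merged = set()
    new_buildings = []
    for b_idx, building in enumerate(buildings):
        tags = dict(building["tags"])
        if len(hits[b_idx]) == 1:
            # Exactly one address: copy its tags and drop the point.
            a_idx = hits[b_idx][0]
            tags.update(addresses[a_idx]["tags"])
            merged.add(a_idx)
        new_buildings.append({"ring": building["ring"], "tags": tags})

    # Points in multi-address buildings, or in no building, stay as nodes.
    leftover = [a for i, a in enumerate(addresses) if i not in merged]
    return new_buildings, leftover
```

Under this reading, a building with two or more address points keeps its
points as separate nodes and gets no address tags of its own.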

After loading some of the block group chunks using the tasking manager I've
found that many are completely empty (0 nodes).  The explanation I was
given by OCTO was:

" Many of the chunks created by mapbox did not contain any data, this is
due to the fact that these areas already had complete or near complete
building / addresses in OSM for that area."
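
For anyone who wants to check a chunk themselves, something like the Python
below counts the top-level elements in an .osm file.  It is only a sketch:
the function names are my own, and it assumes the chunk is ordinary OSM XML
with <node>, <way> and <relation> elements directly under the root.

```python
import xml.etree.ElementTree as ET


def count_osm_elements(osm_xml):
    """Count nodes, ways and relations directly under the <osm> root."""
    root = ET.fromstring(osm_xml)
    return {kind: len(root.findall(kind)) for kind in ("node", "way", "relation")}


def is_empty_chunk(osm_xml):
    """True when the chunk holds no nodes, ways or relations at all."""
    return all(count == 0 for count in count_osm_elements(osm_xml).values())
```

To use it on a downloaded chunk, read the file into a string first and pass
that string to is_empty_chunk().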

My questions are:

1) Are these chunks empty because there was found to be no difference from
what already existed in OSM from previous imports?

I originally thought that the entire DC dataset would be represented in
these chunks.  If chunks containing only "diffs" were the plan, I couldn't
find a clear description of this or any sign of it in the code.

2) Why was the decision made to use the tasking manager to conduct a
partial import as opposed to conflating the entire dataset, given that the
new import process is adding not only building footprints but
footprints+addresses?

I found examples of large areas of block groups with previously imported
buildings lacking addresses that corresponded to empty chunk files.  I
looked at the raw data and these should contain buildings with joined
address tags that could be copied to the existing buildings.  Currently
there are large swaths of buildings with no addresses.  Will this be taken
care of in a separate process that I am unaware of?
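
As an illustration of the kind of conflation I have in mind, here is a rough
Python sketch that lists existing buildings with no addr:* tags together with
the address tags that could be copied onto them.  Everything here is
hypothetical: the dict layout, the function names, and the matching rule (an
existing footprint matches when it contains the vertex average of exactly one
addressed building from the new data).  A real process would need something
more careful, plus manual review.

```python
def contains(ring, x, y):
    """Ray-casting point-in-polygon test on a list of (x, y) vertices."""
    inside = False
    for i in range(len(ring)):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % len(ring)]
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def vertex_mean(ring):
    """Average of the vertices; a cheap stand-in for a true centroid."""
    n = len(ring)
    return sum(p[0] for p in ring) / n, sum(p[1] for p in ring) / n


def addr_tags(tags):
    """Keep only the addr:* tags."""
    return {k: v for k, v in tags.items() if k.startswith("addr:")}


def address_candidates(existing, source):
    """Return [(existing_index, addr_tags)] for unaddressed buildings.

    existing, source: lists of {"ring": [(x, y), ...], "tags": {...}}
    """
    candidates = []
    for i, old in enumerate(existing):
        if addr_tags(old["tags"]):
            continue  # already has an address, leave it alone
        matches = []
        for new in source:
            new_addr = addr_tags(new["tags"])
            if new_addr and contains(old["ring"], *vertex_mean(new["ring"])):
                matches.append(new_addr)
        if len(matches) == 1:  # skip ambiguous or unmatched buildings
            candidates.append((i, matches[0]))
    return candidates
```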

I also found previously imported buildings that do not exist in the
current dataset (or in any satellite imagery that I could find).  Since
this import is semi-automated, wouldn't a comparison against the full
dataset have allowed someone to spot these and remove them?

Thanks,

-Brandon
@geobrando

[1]http://tasks.openstreetmap.us/job/6
[2]http://data.dc.gov/Metadata.aspx?id=190
[3]http://data.dc.gov/Metadata.aspx?id=59
[4]https://github.com/osmlab/dcbuildings