[Talk-us] Proposed mechanical edit - New York building footprints
Greg Troxel
gdt at lexort.com
Sat Mar 19 13:51:01 UTC 2022
Kevin Kenny <kevin.b.kenny at gmail.com> writes:
> If anyone can give me useful advice regarding managing an edit this big,
> I'm all ears. I certainly don't want to manage 60,000 changes by accepting
> them one at a time in JOSM!
From your description, I think you are doing this right.
When we (the MA OSM community) imported building footprints from
MassGIS, the basic process was to take that dataset (about 2M
footprints), run a diff/overlap against OSM, and produce
footprints that are in MassGIS and do not overlap anything in OSM.
This is the candidate import set. It was big, as most of MA did not
have buildings imported before and most buildings had not been
hand-drawn. I'll guess over 1.5M.
The second output is the footprints in MassGIS that do overlap
something in OSM. This is just something to look at to see if hand
editing is helpful. We didn't really dig into it, and
we certainly didn't upload it.
It's just code to make this, and it produces really large output.
Obviously your rules for sorting and your code are going to be quite
different, and I'd recommend having this all in postgis.
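A minimal sketch of the diff/overlap split described above, in plain Python. It is an illustration under stated assumptions, not the code used for the MassGIS import: footprints are assumed to be lists of (x, y) vertices, overlap is approximated by bounding-box intersection (which over-reports overlap compared with a true polygon test), and the function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]
BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bbox(polygon: Polygon) -> BBox:
    """Axis-aligned bounding box of a polygon's vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def bboxes_overlap(a: BBox, b: BBox) -> bool:
    """True if two boxes share interior area (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def split_candidates(external: List[Polygon], osm: List[Polygon]):
    """Split external footprints into (candidates, overlapping).

    candidates:  external footprints whose box touches no OSM footprint box
    overlapping: everything else, kept aside for hand review only
    """
    osm_boxes = [bbox(p) for p in osm]
    candidates, overlapping = [], []
    for poly in external:
        box = bbox(poly)
        # Brute-force scan: fine for a sketch, far too slow for millions.
        if any(bboxes_overlap(box, other) for other in osm_boxes):
            overlapping.append(poly)
        else:
            candidates.append(poly)
    return candidates, overlapping
```

The brute-force scan is quadratic, so at the scale of a statewide dataset a spatial index is needed; that is one reason to do the real work in PostGIS, where an indexed ST_Intersects query can express the same split on true geometries.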
In MA, we are very city/town focused (all land is in one), and so is
MassGIS, so these two files above were split into (and maybe were
processed this way from per-town building data, same outcome) two files
per town. For many towns, a local mapper could spend an hour or two
reviewing the candidate import file to see whether there was indeed no
overlap that seemed bad, whether ~all the building outlines
corresponded to buildings in imagery, etc. After that kind of
review a town would be uploaded. I reviewed my town, which has about
2000 houses and a bunch of outbuildings. I found some that were not
visible in imagery (they're from LIDAR) but spot-checking on the ground,
almost all were there. So I convinced myself that the data was high
quality.
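The per-town split could be sketched roughly as follows. This assumes each footprint is a dict carrying a hypothetical "town" key (in practice the town would come from the source data or a point-in-polygon lookup against town boundaries); the function name and record layout are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List


def split_per_town(candidates: List[dict], overlapping: List[dict]) -> Dict[str, dict]:
    """Group both output sets by town, giving two lists per town.

    Returns {town: {"candidate": [...], "overlap": [...]}} so that each
    town's pair of files can be handed to a local mapper for review.
    """
    per_town = defaultdict(lambda: {"candidate": [], "overlap": []})
    for footprint in candidates:
        per_town[footprint["town"]]["candidate"].append(footprint)
    for footprint in overlapping:
        per_town[footprint["town"]]["overlap"].append(footprint)
    return dict(per_town)
```

Each town's "candidate" list is what a reviewer would compare against imagery before upload; the "overlap" list is for reference only.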
I then checked a few bordering towns and those were ok but I spent less
time on them.
Then, as time went on, and we kept checking towns, we became more
confident that things were ok, and we gradually turned down the checking
level. IIRC a few towns had systematic spatial offsets and were treated
specially or omitted, so we were careful to check each town's data for
that, but otherwise no real problems were found. Years later no
significant problems have been found.
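One possible way to screen a town for a systematic spatial offset is sketched below. The email does not say how offsets were actually detected, so this is purely an assumption: it takes matched pairs of centroids (imported footprint, existing OSM building) in a projected coordinate system measured in meters, and flags the town when the median displacement exceeds a hypothetical threshold.

```python
import math
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]


def median_offset(pairs: List[Tuple[Point, Point]]) -> Point:
    """Median (dx, dy) from OSM centroid to imported centroid."""
    dxs = [imported[0] - osm[0] for imported, osm in pairs]
    dys = [imported[1] - osm[1] for imported, osm in pairs]
    return (median(dxs), median(dys))


def has_systematic_offset(pairs: List[Tuple[Point, Point]],
                          threshold_m: float = 2.0) -> bool:
    """True if the median displacement is longer than threshold_m meters.

    The 2 m default is an arbitrary illustrative value, not a tested one.
    """
    dx, dy = median_offset(pairs)
    return math.hypot(dx, dy) > threshold_m
```

Using the median rather than the mean keeps a handful of mismatched pairs from masking or faking a town-wide shift.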
So you might want to chunk the output into groups of about 500 changes
and then hand-review a few, also outsourcing this to other NY mappers.
And then after a chunk is reviewed, apply it, which might surface issues
the reviewers didn't anticipate, and then keep reviewing and uploading,
and turn down the review level and pick up the pace over time, perhaps
to just about no review on the 2nd half of the chunks.
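The chunk-and-taper approach above might look something like this sketch. The chunk size of 500 and the idea of dropping to no review in the second half come from the text; the linear taper, the 10% starting fraction, and all function names are assumptions chosen for illustration.

```python
import random
from typing import List, Sequence


def chunked(items: Sequence, size: int = 500) -> List[list]:
    """Split items into consecutive chunks of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def review_fraction(chunk_index: int, total_chunks: int,
                    initial: float = 0.1) -> float:
    """Fraction of a chunk to hand-review.

    Starts at `initial` for the first chunk, tapers linearly, and is
    zero for every chunk in the second half of the run.
    """
    half = total_chunks / 2
    if half == 0 or chunk_index >= half:
        return 0.0
    return initial * (1 - chunk_index / half)


def pick_for_review(chunk: list, fraction: float, rng: random.Random) -> list:
    """Randomly sample the changes in a chunk that a human should inspect."""
    count = min(len(chunk), round(fraction * len(chunk)))
    return rng.sample(chunk, count)
```

For example, with 60,000 changes this yields 120 chunks; the first chunk would have about 50 changes sampled for review, and chunks 60 onward would have none. In practice the schedule should be driven by what the early reviews turn up rather than by a fixed formula.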
The big thing is really looking at the first chunk, and waiting a week
after uploading it to make sure there aren't issues. Let it flow into
Nominatim, the main render, OsmAnd Live, and other places, and have a look.
It also probably helps to have the code such that it can be rerun to
produce a fresh candidate changeset, after some fixes.
(Then later you could do a different process, which basically diffs the
external data against osm and produces output files of where they
differ, for hand review and thinking about. But that's not what you
asked.)
Hope this helps - what we did and what you are doing are different of
course,
Greg