<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Mar 19, 2022 at 9:51 AM Greg Troxel <<a href="mailto:gdt@lexort.com">gdt@lexort.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
Kevin Kenny <<a href="mailto:kevin.b.kenny@gmail.com" target="_blank">kevin.b.kenny@gmail.com</a>> writes:<br>
<br>
> If anyone can give me useful advice regarding managing an edit this big,<br>
> I'm all ears. I certainly don't want to manage 60,000 changes by accepting<br>
> them one at a time in JOSM!<br>
<br>
From your description, I think you are doing this right.<br>
<br>
When we (the MA osm community) imported building footprints from<br>
MassGIS, the basic process was to take that dataset (about 2M<br>
footprints) and to basically run a diff/overlap against OSM and produce<br>
<br>
footprints that are in MassGIS and do not overlap anything in OSM.<br>
This is the candidate import set. It was big, as most of MA did not<br>
have buildings imported before and most buildings had not been<br>
hand-drawn. I'll guess over 1.5M.<br>
<br>
Footprints in MassGIS that overlap. This is just something to look at<br>
to see if hand editing is helpful. We didn't really dig into it, and<br>
we certainly didn't upload it.<br>
<br>
It's just code to make this, and it produces really large output.<br>
Obviously your rules for sorting and your code are going to be quite<br>
different, and I'd recommend having this all in postgis.<br></blockquote><div><br></div><div>Thanks for the insights; this is quite helpful. And yeah, I'm using PostGIS quite heavily. If you looked at the GitHub page, you'll probably suspect that the two maps of Suffolk County were produced by doing the data reduction in PostGIS and the data presentation in QGIS - and you'd be right. For those maps, the query that did the heavy lifting is at <a href="https://github.com/kennykb/NYbuildings_repair/blob/main/analyze-changesets.tcl#L320">https://github.com/kennykb/NYbuildings_repair/blob/main/analyze-changesets.tcl#L320</a>. The conversion from latitude/longitude to UTM (EPSG:32618) was so that the buffer distance would be in metres and applied at a constant scale, rather than in degrees.</div><div><br></div><div>A building footprint import is geometry-heavy, and so your workflow is quite different from, and heavier-weight than, the relatively limited repair that I'm proposing here. In my case, the buildings have been imported already (with addresses). Separately, a statewide import of address points was also performed. The latter import studiously avoided changing address information that was already present.</div><div><br></div><div>The building footprints themselves are of quite low quality, but to me they are what they are. I'm hoping that having discovered this mess doesn't saddle me with the burden of leading the entire effort to fix the footprints. Instead, I'm proposing just to apply a 95% or 99% fix to the building _addresses_. The result won't be of nearly the quality that would have been achieved had buildings been imported with coordinated review - indeed, in that case, the Microsoft footprints wouldn't have been imported directly at all, at least in the city centers, but merely served as a base outside OSM for mappers to build on. 
But at least a few tens of thousands of building footprints of unknown quality won't have demonstrably incorrect street addresses.</div><div><br></div><div>Since the geometry isn't going to change in the more limited process, there's much less need for the full weight of .osc files, and indeed, I'm thinking in terms of not producing them at all (except as a required part of a mechanical edit review). It turns out that the JOSM remote control API has a function that is nearly ideal for this. For a particular discrete change (which would likely be a span like "West Main Street in ZIP code 13357"), it's possible, indeed easy, to make a remote control URL that causes JOSM to download the required ways, make the tagging changes, and await further input. Essentially, the URL is a command: 'download ways 816043005,816043002,816043003,816042998,... and set addr:street="West Main Street" on all of them.' That's a lot less unrelated information, and consequently a lot less opportunity for the data to go stale.</div><div><br></div><div>I think that for producing changeset files, that's the process I'd follow. Let JOSM retrieve the data and apply the tags, then save a changeset from that.</div><div><br></div><div>Of course, I can recover the sets of ways presumed to be part of the imports at any time, so if people do want to perform more review/revision, there's a place to start. The review/revision would be welcome - MS building footprints are pretty bad - but I'm already halfway through another months-long OSM project and I'm simply not expecting to have the time to take this one on. Consider the address fix simply a bandage for the most obvious problem - which is that OsmAnd routing and navigation don't work in half of NY state despite a good-quality import of E911 address points. 
Gresham's Law of data applies here - bad data drives out good.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The big thing is really looking at the first chunk, and waiting a week<br>
after uploading it to make sure there aren't issues. Let it flow into<br>
Nominatim, main render, OsmAnd Live, other places and have a look.<br></blockquote><div><br></div><div>Yup. Exactly the process that I followed with NY public lands on multiple rounds (NYC watershed properties as an 'easy' first start of ~400 nature reserves, then near-total reworks of crusty old imports of NYSDEC properties and NYS parks, recreation, and historic sites). I went really slow and careful on the first couple of rounds. Now when the state publishes an update, I can do the job myself in a week or two because the process is nearly automated: I get a JOSM session with data layers for the map (with changed objects preselected) and for the new geometry (already tagged alike). In the ideal case, I can pick the new stuff, copy-paste onto the map, Ctrl+Shift+G, and Bob's your uncle. The crazy quilt of protected areas that you see in the Adirondacks (and most of the protected areas elsewhere in the state as well) is largely the result of that effort.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">It also probably helps to have the code such that it can be rerun to<br>
produce a fresh candidate changeset, after some fixes.<br></blockquote><div><br></div><div>Hence the GitHub repository. The only part that I don't rerun is retrieving the original changesets from OSM. I'm close to incurring the wrath of the OWG as it is! Still, the scripts to do it are there; they simply check whether they already have the data in local files and refrain from re-downloading anything they already have.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">(Then later you could do a different process, which basically diffs the<br>
external data against osm and produces output files of where they<br>
differ, for hand review and thinking about. But that's not what you<br>
asked.)<br></blockquote><div><br></div><div>It's maybe what I should have asked, but for NY street and address mapping, Skyler already has that part of the problem solved, I think. His import of the address points was carefully tiptoeing around a fair amount of existing data.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Hope this helps - what we did and what you are doing are different of<br>
course,<br></blockquote><div><br></div><div>It surely does - if only to reassure me that I'm on the right track here. </div></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature">73 de ke9tv/2, Kevin</div></div>
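<div><br></div><div>P.S. For anyone who wants to try the remote-control approach described above, here is a minimal sketch in Python of how such a URL might be assembled. It assumes JOSM's remote control is enabled and listening on its default local address (http://127.0.0.1:8111), and that the <code>load_object</code> command with its <code>objects</code> and <code>addtags</code> parameters is the function in question. The helper name <code>load_object_url</code> is made up for illustration; this is not the code in the repository.</div>

```python
from urllib.parse import quote, urlencode

# Assumption: JOSM remote control is enabled on its default local HTTP port.
JOSM_BASE = "http://127.0.0.1:8111"


def load_object_url(way_ids, tags, base=JOSM_BASE):
    """Build a remote-control URL that loads the given ways and adds tags.

    Hypothetical helper: 'objects' is a comma-separated list of
    type-prefixed ids (w = way), and 'addtags' is a '|'-separated list
    of key=value pairs.
    """
    objects = ",".join(f"w{way_id}" for way_id in way_ids)
    addtags = "|".join(f"{key}={value}" for key, value in tags.items())
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode({"objects": objects, "addtags": addtags}, quote_via=quote)
    return f"{base}/load_object?{query}"


# One discrete change: a handful of ways on one street.
url = load_object_url(
    [816043005, 816043002, 816043003, 816042998],
    {"addr:street": "West Main Street"},
)
print(url)
```

<div>Opening the printed URL in a browser while JOSM is running should make JOSM fetch just those ways and apply the tag, leaving the review and upload to the mapper.</div>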