[Imports] Proposed import cleanup: NYSDEClands

Kevin Kenny kkenny2 at nycap.rr.com
Thu May 19 14:35:13 UTC 2016


On 05/19/2016 05:27 AM, Paul Norman wrote:
> I was debugging some MP issues and came across the NYSDEClands 
> import[1], done in 2010, consisting of natural areas. They have a 
> number of unwanted tags[2], and a couple of other problems with their 
> tags
>
> Because there's a relatively small number of them, I think a 
> mechanical edit is the best cleanup option. I'm proposing the following
>
> - Removing NYDEC_Land:* tags
> - Removing area=yes where there are other area tags
> - Changing url=* to website=* where website does not already exist
> - Leaving source=* intact
> - Removing name=Unclassified
>
> [1]: https://www.openstreetmap.org/user/NYSDEClands
> [2]: e.g. https://www.openstreetmap.org/way/32002190
>
> _______________________________________________
> Imports mailing list
> Imports at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/imports
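
For reference, the five rules proposed above can be sketched as a pure
function over one object's tags. This is only an illustration, not a
tested edit script. The set of keys treated as "other area tags" is an
assumption, as is leaving url=* untouched when website=* already exists.

```python
# Keys that already imply an area. This set is an assumption for the
# sketch; the proposal does not enumerate the "other area tags".
AREA_KEYS = {"landuse", "leisure", "natural", "boundary"}


def clean_tags(tags):
    """Return a cleaned copy of one object's tags, per the proposed rules."""
    # Rule 1: remove every NYDEC_Land:* tag.
    out = {k: v for k, v in tags.items() if not k.startswith("NYDEC_Land:")}

    # Rule 2: remove area=yes only where another area tag is present.
    if out.get("area") == "yes" and any(k in out for k in AREA_KEYS):
        del out["area"]

    # Rule 3: move url=* to website=* unless website=* already exists.
    if "url" in out and "website" not in out:
        out["website"] = out.pop("url")

    # Rule 4: source=* is left intact (nothing to do).

    # Rule 5: remove the placeholder name.
    if out.get("name") == "Unclassified":
        del out["name"]

    return out
```

Returning a new dictionary rather than editing in place makes it easy to
compare before and after for each object before anything is uploaded.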

You are talking about an import that is near and dear to me: this is my 
home turf!

This import needs more significant rework than what you propose, and in 
fact, I wrote to Russ privately just the other day about it.

There's no update plan, although the source shapefile is updated regularly.

Multipolygons were rather a botch in the original import. For instance, 
most of Saranac Lake Wild Forest is missing (something I noticed only 
recently). The same thing had happened to Indian Lake Wilderness and 
Overlook Mountain Wild Forest.  Some of the multipolygons have 
topological problems.

Nodes are duplicated all over the place. Whenever I touch one of these 
areas (usually to disconnect a way) it triggers a cascade of JOSM 
warnings.
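
A rough way to list the duplicates ahead of time is to group nodes by
position. The sketch below assumes the nodes have already been extracted
into a dictionary of id -> (lat, lon); that input shape and the function
name are hypothetical. Coordinates are rounded to 7 decimal places, the
precision at which OSM stores them.

```python
from collections import defaultdict


def find_duplicate_nodes(nodes, precision=7):
    """Group node ids that share the same rounded (lat, lon) position.

    `nodes` maps node id -> (lat, lon). This input format is an
    assumption for the sketch, not something the import provides.
    """
    by_position = defaultdict(list)
    for node_id, (lat, lon) in nodes.items():
        key = (round(lat, precision), round(lon, precision))
        by_position[key].append(node_id)
    # Keep only positions occupied by more than one node.
    return [sorted(ids) for ids in by_position.values() if len(ids) > 1]
```

This only reports candidates. Whether two coincident nodes should be
merged still depends on the ways that use them.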

Wildlife management areas are tagged merely with 'landuse=conservation', 
which is deprecated and does not render. They should be something more 
like 'leisure=nature_reserve boundary=protected_area protect_class=4', 
like the Pennsylvania State Game Lands. I would entertain an argument 
for a different protection class, maybe 14?
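
As a minimal sketch of that retagging, assuming the caller has already
selected the wildlife management areas, the function below swaps
'landuse=conservation' for the three suggested tags. The function name
is hypothetical, and the protection class is a parameter because the
right value is still open to discussion.

```python
def retag_wildlife_management_area(tags, protect_class="4"):
    """Return a copy of `tags` with landuse=conservation replaced.

    Objects that do not carry landuse=conservation are returned
    unchanged (as a copy). Selecting which objects are really wildlife
    management areas is left to the caller.
    """
    if tags.get("landuse") != "conservation":
        return dict(tags)
    out = {k: v for k, v in tags.items() if k != "landuse"}
    out["leisure"] = "nature_reserve"
    out["boundary"] = "protected_area"
    out["protect_class"] = protect_class
    return out
```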

Wildlife management areas and multiple use areas have 'Wma' and 'Mua' in 
their names, which need to be spelt out or at least capitalized.
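
Spelling the abbreviations out could be sketched as below. It assumes
'Wma' and 'Mua' appear as whole words in exactly that capitalization;
the sample name in the usage note is made up.

```python
import re

# Abbreviations as they appear in the imported names, with expansions.
EXPANSIONS = {
    "Wma": "Wildlife Management Area",
    "Mua": "Multiple Use Area",
}

# Match the abbreviations only as whole words, so that a longer word
# that merely contains the same letters is left alone.
_PATTERN = re.compile(r"\b(" + "|".join(EXPANSIONS) + r")\b")


def expand_name(name):
    """Replace 'Wma' and 'Mua' in a name with the spelled-out form."""
    return _PATTERN.sub(lambda m: EXPANSIONS[m.group(1)], name)
```

For example, expand_name("Example Wma") gives "Example Wildlife
Management Area", while a name without either abbreviation comes back
unchanged.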

The only real reason that I haven't done the rework yet is that I'm trying 
to work up an update plan. I'm a good enough hand with PostGIS that I have 
a pretty good idea how to come up with the list of differences when a 
new version of the shapefile appears. But I'm ignorant of the tools for 
mechanical edits, and so it's been slow going. I'm quite meticulous 
about these things.
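
The shape of that difference list can be shown outside the database. The
sketch below is plain Python, not PostGIS, and it rests on one loud
assumption: every feature carries an identifier that is stable between
shapefile versions. Each version is modelled as a dictionary of
identifier -> feature, where the feature is whatever comparable value
(for example an attribute tuple plus geometry text) the loader produces.

```python
def diff_features(old, new):
    """Compare two versions of a data set keyed by a stable identifier.

    `old` and `new` map feature id -> feature value. The existence of a
    stable id is an assumption of this sketch. Returns three sorted
    lists of ids: added, removed, and changed.
    """
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    changed = sorted(k for k in new if k in old and new[k] != old[k])
    return added, removed, changed
```

Exact equality is a blunt test for geometry. In practice a comparison
that tolerates reordered vertices would be needed, which is where a
spatial database earns its keep.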

If you're willing, I think your time might be better spent on teaching 
me how to do the import.

That would have the pleasant side effect that I'd be able to do 
http://www.nyc.gov/html/dep/pdf/recreation/open_rec_areas.pdf by myself. 
I'm going to ask for the shapefile when I ask NYCDEP for permission. 
This is done out of a superabundance of caution, since permission should 
not be needed in any case. New York City's open data policy covers this 
data set, and we've done a good many other imports relying on the 
policy. If need be, I have scripts that have successfully web-scraped 
the individual maps for all of the areas except for Devasego Park in 
Greene County, the Ashokan day use area in Ulster County, and the Cross 
River and Kensico dams in Westchester County. The PDF maps for those 
four areas are images only - there's no vector polygon in them to be 
scraped. They're tiny compared with the others - more like city parks, 
while the others are huge tracts of mountain, forest and marsh.

Learning how to do this sort of job would also let me make faster 
progress toward getting the Adirondack waterways sorted. That's a much 
harder database job - it's a multiway conflation among what's already in 
there (a quasi-mechanical import of ponds, plus satellite image tracing 
of a handful of major rivers), NHD, the Adirondack wetland database, the 
USFWS wetland database, and the NYSDOT water shapefiles. Every single 
one of these has major errors and omissions, but conflating the 
redundant coverage actually promises to yield something fairly clean.

Generally speaking, survey quality in these areas is extremely poor. I'd 
say that the first topographic maps of the Adirondacks that were made to 
even the standards of the Victorian era were the ones produced in the 
1980s for the Winter Olympics in Lake Placid. And most online 
topographic map services don't even include them, because they were in 
metric units, used UTM rather than the state plane coordinate system, 
were referenced to NAD83 rather than NAD27, and were at 1:25000 rather 
than 1:24000 scale. They also were printed on rather poor paper, and 
shipped folded rather than rolled in mailing tubes. Because of that, no 
libraries have copies that really lie flat for scanning. They were huge 
sheets, since they were 7.5x15 minute double quads rather than 7.5x7.5. 
For that reason, a lot of their information simply never got digitized 
(or digitized well). A lot of the National Map stuff, NHD, and so on are 
actually based on the 1953 survey instead.

Those for whom a mechanical edit is never good enough will wait a very 
long time to get anything better in the Adirondacks and Catskills. I'm 
all for cleaning the state land boundaries up, since it establishes a 
pipeline for authoritative data from the agency that manages the land in 
question. I realize that Frederik would say that if OSM isn't the 
authoritative original source, then the data don't belong in OSM. 
Whatever one believes about that, the import is already done and we 
might as well make the best of it.

-- 
73 de ke9tv/2, Kevin



