[Imports] Proposed import cleanup: NYSDEClands
Kevin Kenny
kkenny2 at nycap.rr.com
Thu May 19 14:35:13 UTC 2016
On 05/19/2016 05:27 AM, Paul Norman wrote:
> I was debugging some MP issues and came across the NYSDEClands
> import[1], done in 2010, consisting of natural areas. They have a
> number of unwanted tags[2], and a couple of other problems with their
> tags
>
> Because there's a relatively small number of them, I think a
> mechanical edit is the best cleanup option. I'm proposing the following
>
> - Removing NYDEC_Land:* tags
> - Removing area=yes where there are other area tags
> - Changing url=* to website=* where website does not already exist
> - Leaving source=* intact
> - Removing name=Unclassified
>
> [1]: https://www.openstreetmap.org/user/NYSDEClands
> [2]: e.g. https://www.openstreetmap.org/way/32002190
>
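
The five rules quoted above can be sketched as a function over a single object's tag dictionary. This is only an illustration, not the script anyone actually proposed to run. In particular, the set of keys treated as "other area tags" (AREA_KEYS) is an assumption made for the example, and the sample tag values are invented.

```python
# Keys assumed, for illustration only, to imply an area on their own.
AREA_KEYS = {"landuse", "leisure", "natural", "boundary"}


def clean_tags(tags):
    """Return a cleaned copy of an OSM tag dict, following the quoted rules."""
    # Rule 1: drop every NYDEC_Land:* tag.
    out = {k: v for k, v in tags.items() if not k.startswith("NYDEC_Land:")}
    # Rule 2: drop area=yes only when another area-implying key is present.
    if out.get("area") == "yes" and out.keys() & AREA_KEYS:
        del out["area"]
    # Rule 3: move url=* to website=* unless website already exists.
    if "url" in out and "website" not in out:
        out["website"] = out.pop("url")
    # Rule 4: source=* is left intact (nothing to do).
    # Rule 5: drop the placeholder name.
    if out.get("name") == "Unclassified":
        del out["name"]
    return out


# Invented sample object, not taken from the database.
sample = {
    "NYDEC_Land:CATEGORY": "WF",
    "area": "yes",
    "landuse": "conservation",
    "url": "http://example.org",
    "source": "NYSDEC",
    "name": "Unclassified",
}
cleaned = clean_tags(sample)
```

Returning a copy rather than editing in place makes it easy to compare before and after for each object when reviewing the change.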
You are talking about an import that is near and dear to me: this is my
home turf!
This import needs more significant rework than what you propose, and in
fact, I wrote to Russ privately just the other day about it.
There's no update plan, although the source shapefile is updated regularly.
Multipolygons were rather a botch in the original import. For instance,
most of Saranac Lake Wild Forest is missing (something I noticed just
recently). The same thing had happened to Indian Lake Wilderness and
Overlook Mountain Wild Forest. Some of the multipolygons have
topological problems.
Nodes are duplicated all over the place. Whenever I touch one of these
areas (usually to disconnect a way) it triggers a cascade of JOSM
warnings.
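
Duplicated nodes of this kind can be found by grouping node IDs by position. The sketch below is a rough illustration, not how JOSM's validator works internally. It assumes the nodes have already been extracted into a dict of id -> (lat, lon), and the sample coordinates are invented.

```python
from collections import defaultdict


def find_duplicate_nodes(nodes, precision=7):
    """Group node IDs that share a position.

    `nodes` maps node id -> (lat, lon). Coordinates are rounded to
    `precision` decimal places (OSM stores seven) before comparison.
    """
    by_position = defaultdict(list)
    for node_id, (lat, lon) in nodes.items():
        by_position[(round(lat, precision), round(lon, precision))].append(node_id)
    # Keep only positions occupied by more than one node.
    return [sorted(ids) for ids in by_position.values() if len(ids) > 1]


# Invented sample data: nodes 1 and 2 sit on the same spot.
sample_nodes = {1: (44.0, -74.0), 2: (44.0, -74.0), 3: (44.1, -74.0)}
duplicates = find_duplicate_nodes(sample_nodes)
```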
Wildlife management areas are tagged merely with 'landuse=conservation',
which is deprecated and does not render. They should be something more
like 'leisure=nature_reserve boundary=protected_area protect_class=4',
like the Pennsylvania State Game Lands. I would entertain an argument
for a different protection class, maybe 14?
Wildlife management areas and multiple use areas have 'Wma' and 'Mua' in
their names, which need to be spelt out or at least capitalized.
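
As a sketch of the retagging and renaming described above: the function names are hypothetical, the expansions "Wildlife Management Area" and "Multiple Use Area" are the spelt-out forms assumed for 'Wma' and 'Mua', and protect_class=4 is just the value suggested above, still open to argument. The caller is assumed to pass in only objects already known to be wildlife management areas.

```python
import re

# Assumed expansions for the abbreviations found in the imported names.
ABBREVIATIONS = {"Wma": "Wildlife Management Area", "Mua": "Multiple Use Area"}


def expand_name(name):
    """Spell out 'Wma' and 'Mua' where they appear as whole words."""
    return re.sub(r"\b(Wma|Mua)\b", lambda m: ABBREVIATIONS[m.group(1)], name)


def retag_wma(tags):
    """Return a retagged copy of a wildlife management area's tags."""
    out = dict(tags)
    if out.get("landuse") == "conservation":
        del out["landuse"]
        out.update({
            "leisure": "nature_reserve",
            "boundary": "protected_area",
            "protect_class": "4",  # value suggested in the text, not settled
        })
    if "name" in out:
        out["name"] = expand_name(out["name"])
    return out


# Invented sample object.
retagged = retag_wma({"landuse": "conservation", "name": "Example Wma"})
```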
The only real reason that I hadn't done it yet is that I'm trying to
work up an update plan. I'm a good enough hand with PostGIS that I have
a pretty good idea how to come up with the list of differences when a
new version of the shapefile appears. But I'm ignorant of the tools for
mechanical edits, and so it's been slow going. I'm quite meticulous
about these things.
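
The "list of differences" step might look roughly like the following. This is a plain-Python stand-in for illustration, not the PostGIS approach mentioned above. It assumes each release of the shapefile has been loaded into a dict keyed by some stable feature identifier, and whether the real data set has such an identifier is itself an assumption. The sample records are invented.

```python
def diff_versions(old, new):
    """Compare two releases of a data set.

    `old` and `new` map a feature key (assumed stable across releases)
    to a comparable record, e.g. a (name, geometry WKT) tuple.
    Returns sorted lists of added, removed, and changed keys.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, changed


# Invented sample releases.
old_release = {"A": ("Area A", "POLYGON((0 0,1 0,1 1,0 0))"),
               "B": ("Area B", "POLYGON((2 2,3 2,3 3,2 2))")}
new_release = {"A": ("Area A", "POLYGON((0 0,2 0,2 2,0 0))"),
               "C": ("Area C", "POLYGON((5 5,6 5,6 6,5 5))")}
added, removed, changed = diff_versions(old_release, new_release)
```

Comparing raw WKT strings flags any textual difference, including harmless vertex reordering, so a real update pipeline would want a geometric equality test instead.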
If you're willing, I think your time might be better spent on teaching
me how to do the import.
That would have the pleasant side effect that I'd be able to do
http://www.nyc.gov/html/dep/pdf/recreation/open_rec_areas.pdf by myself.
I'm going to ask for the shapefile when I ask NYCDEP for permission.
This is done out of a superabundance of caution, since permission should
not be needed in any case. New York City's open data policy covers this
data set, and we've done a good many other imports relying on the
policy. If need be, I have scripts that have successfully web-scraped
the individual maps for all of the areas except for Devasego Park in
Greene County, the Ashokan day use area in Ulster County, and the Cross
River and Kensico dams in Westchester County. The PDF maps for those
four areas are images only - there's no vector polygon in them to be
scraped. They're tiny compared with the others - more like city parks,
while the others are huge tracts of mountain, forest and marsh.
Learning how to do this sort of job would also let me make faster
progress toward getting the Adirondack waterways sorted. That's a much
harder database job - it's a multiway conflation among what's already in
there (a quasi-mechanical import of ponds, plus satellite image tracing
of a handful of major rivers), NHD, the Adirondack wetland database, the
USFWS wetland database, and the NYSDOT water shapefiles. Every single
one of these has major errors and omissions, but conflating the
redundant coverage actually promises to yield something fairly clean.
Generally speaking, survey quality in these areas is extremely poor. I'd
say that the first topographic maps of the Adirondacks that were made to
even the standards of the Victorian era were the ones produced in the
1980s for the Winter Olympics in Lake Placid. And most online
topographic map services don't even include them, because they were in
metric units, used UTM rather than the state plane coordinate system,
were referenced to NAD83 rather than NAD27, and were at 1:25000 rather
than 1:24000 scale. They also were printed on rather poor paper, and
shipped folded rather than rolled in mailing tubes. Because of that, no
libraries have copies that really lie flat for scanning. They were huge
sheets, since they were 7.5x15 minute double quads rather than 7.5x7.5.
For that reason, a lot of their information simply never got digitized
(or digitized well). A lot of the National Map stuff, NHD, and so on are
actually based on the 1953 survey instead.
Those for whom a mechanical edit is never good enough will wait a very
long time to get anything better in the Adirondacks and Catskills. I'm
all for cleaning the state land boundaries up, since it establishes a
pipeline for authoritative data from the agency that manages the land in
question. I realize that Frederik would say that if OSM isn't the
authoritative original source, then the data don't belong in OSM.
Whatever one believes about that, the import is already done and we
might as well make the best of it.
--
73 de ke9tv/2, Kevin