<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content=text/html;charset=iso-8859-1 http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 8.00.6001.18812"></HEAD>
<BODY style="PADDING-LEFT: 10px; PADDING-RIGHT: 10px; PADDING-TOP: 15px"
id=MailContainerBody leftMargin=0 topMargin=0 CanvasTabStop="true"
name="Compose message area">
<DIV><FONT size=2 face=Arial>Thanks - that sounds like a good technique that I
could use in the future to catch pre-mapped areas.</FONT></DIV>
<DIV><FONT size=2 face=Arial></FONT> </DIV>
<DIV><FONT size=2 face=Arial> A bit of background - the National
Hydrography Dataset is divided into river basins and sub-basins. The
sub-basins are a manageable size to edit, if necessary, and upload.
Right now, active mappers are interested in uploading the NHD data for their
areas because it's instinctive for new mappers to map lakes and
rivers by hand. That hand tracing is a waste of their valuable time, since the NHD data is
better than Yahoo aerial tracing in most cases. So there are multiple
people performing data imports for different areas.</FONT></DIV>
<DIV><FONT size=2 face=Arial></FONT> </DIV>
<DIV><FONT size=2 face=Arial> Back to this problem: the default NHD
scripts add the NHD database ID (ComID) to each water body, which simplifies
finding duplicate water bodies - I can compare ComIDs instead of collections of
nodes, ways, and polygons. That search can even be done for
data that is already uploaded, although it gets much harder if someone has
edited one or both of the duplicates.</FONT></DIV>
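<DIV><FONT size=2 face=Arial> A minimal sketch of that ComID comparison,
assuming each converted sub-basin is a plain .osm XML file and that the script
writes the ID under a tag key of NHD:ComID (an assumption - check the key your
conversion script actually produces). The function names are made up for
illustration.</FONT></DIV>
<PRE>
```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed tag key; adjust to whatever the conversion script writes.
COMID_KEY = "NHD:ComID"


def comids_in_osm(source):
    """Map each ComID value to the (element type, id) pairs that carry it.

    source is a path or file object holding OSM XML.
    """
    found = defaultdict(list)
    for elem in ET.parse(source).getroot():
        if elem.tag not in ("node", "way", "relation"):
            continue
        for tag in elem.findall("tag"):
            if tag.get("k") == COMID_KEY:
                found[tag.get("v")].append((elem.tag, elem.get("id")))
    return found


def shared_comids(source_a, source_b):
    """Return the ComIDs that appear in both files, sorted for stable output."""
    a = comids_in_osm(source_a)
    b = comids_in_osm(source_b)
    return sorted(set(a).intersection(b))
```
</PRE>
<DIV><FONT size=2 face=Arial> Any ComID returned by shared_comids for two
sub-basin files is a candidate duplicate to review before upload. It only
catches features that still carry the tag, so it will miss water bodies whose
tags were changed or removed by later edits.</FONT></DIV>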
<DIV><FONT size=2 face=Arial></FONT> </DIV>
<DIV><FONT size=2 face=Arial> I'd rather not soak up system time and
space importing something that's a duplicate. So I'll add a note to
the NHD Wiki page about this and crosscheck all the data I've
converted.</FONT></DIV>
<DIV><FONT size=2 face=Arial></FONT> </DIV>
<DIV><FONT size=2 face=Arial></FONT> </DIV>
<DIV style="FONT: 10pt Tahoma">
<DIV><FONT face=Arial></FONT><FONT face=Arial></FONT><BR></DIV>
<DIV style="BACKGROUND: #f5f5f5">
<DIV style="font-color: black"><B>From:</B> <A title=emilie.laffray@gmail.com
href="mailto:emilie.laffray@gmail.com">Emilie Laffray</A> </DIV>
<DIV><B>Sent:</B> Thursday, September 10, 2009 8:57 AM</DIV>
<DIV><B>To:</B> <A title="mailto:niceman@att.net"
href="mailto:niceman@att.net">Mike N.</A> ; <A title=imports@openstreetmap.org
href="mailto:imports@openstreetmap.org">imports@openstreetmap.org</A> </DIV>
<DIV><B>Subject:</B> Re: [Imports] NHD Duplicate data</DIV></DIV></DIV>
<DIV><FONT size=2 face=Arial></FONT><FONT size=2 face=Arial></FONT><BR></DIV>
<DIV class=gmail_quote>
<BLOCKQUOTE
style="BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex; PADDING-LEFT: 1ex"
class=gmail_quote> I'm not sure what the best solution is; a program to
detect duplicate areas may not detect duplicates from an adjacent upload by
a different person.</BLOCKQUOTE></DIV><BR>
<DIV>For the Corine import of France, I have been using PostGIS and some SQL to
calculate the overlap between two surfaces. It is a mixture of ST_Area with
ST_Intersect. It is not perfect, but we caught lots of existing forests inside
OSM, so we didn't have to upload them. </DIV>
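<DIV>A rough sketch of what such an overlap check could look like, not the
actual Corine import queries. It uses the PostGIS functions ST_Area,
ST_Intersects and ST_Intersection, which is one plausible reading of the
description above. The table names and the id and geom columns are hypothetical
placeholders, and the helper only builds the SQL text; running it needs a
PostGIS-enabled database connection.</DIV>
<PRE>
```python
def overlap_query(candidate_table, existing_table):
    """Build SQL reporting, for each pair of intersecting polygons, the
    fraction of the candidate polygon's area that the existing polygon covers.

    Table names and the id/geom columns are hypothetical placeholders.
    """
    for name in (candidate_table, existing_table):
        # Table names are spliced into the SQL text, so accept only plain identifiers.
        if not name.isidentifier():
            raise ValueError("unexpected table name: %r" % name)
    return (
        "SELECT c.id AS candidate_id, o.id AS existing_id, "
        "ST_Area(ST_Intersection(c.geom, o.geom)) "
        "/ NULLIF(ST_Area(c.geom), 0) AS overlap_fraction "
        "FROM {cand} AS c "
        "JOIN {osm} AS o ON ST_Intersects(c.geom, o.geom)"
    ).format(cand=candidate_table, osm=existing_table)
```
</PRE>
<DIV>A candidate with an overlap_fraction close to 1 is probably already mapped
and can be held back from the upload. Where to put the cut-off is a judgment
call, and invalid geometries may need cleaning before ST_Intersection will
accept them.</DIV>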
<DIV><FONT size=2 face=Arial></FONT><BR></DIV>
<DIV>I am not exactly sure what your problem is but we could have a look.</DIV>
</BODY></HTML>