[Talk-ca] Aylmer/Hull QC: CanVec import overwriting existing edits
James Ewen
ve6srv at gmail.com
Sun Feb 20 18:42:35 GMT 2011
On Sun, Feb 20, 2011 at 9:41 AM, Richard Weait <richard at weait.com> wrote:
> I'm an advocate of "not importing." I like to think that I have
> tempered my default no-imports stance with a realistic compromise of
> the "well-considered, carefully executed, limited scope import, that
> might be a net benefit if everything goes perfectly". That message
> seems to get diluted when an enthusiastic contributor discovers an
> interesting dataset, and an import script; all they seem to hear is
> "Hey! Imports! Cool! Watch me go!!!!1!" I find that frustrating.
Flatly "not importing" would be a bad thing for OSM. Strictly
speaking, gathering GPS traces and using them to map an area is
importing data. Even drawing from memory could be considered
importing. Of course, that's getting a bit silly.
I think the more accurate wording Richard alludes to is "automatic
blind importing".
Any work that we do on the OSM project really needs to have a set of
eyes that are connected to an intelligent brain go over the data to
ensure the best decisions are being made. Whether the source of the
information is local knowledge, personally collected GPS traces,
non-copyright maps, or government source datasets, it needs to have
someone look at what is being imported to the OSM database to ensure
things are happening in the best interests of OSM as a whole.
The road matcher script that was used to find existing roads and
exclude the duplicates worked fairly well at avoiding the kind of
problems seen in Aylmer. I still find places in Alberta where
duplicate roads exist. Usually the culprit is that the first pass at
creating roads for OSM was done by hand from low resolution imagery.
The road matcher script didn't associate the existing road with the
CanVec road, so the CanVec road was placed in the OSM database
alongside it. It takes manual intervention to correct this issue.
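
To illustrate why a hand-traced road can slip past an automated
matcher, here is a minimal sketch of proximity-based matching. This is
not the actual road matcher script; the function names, the
node-to-node comparison, and the 15 m tolerance are all assumptions
made for the example.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def likely_duplicate(candidate, existing, tolerance_m=15.0):
    """True if every node of `candidate` lies within `tolerance_m`
    of some node of `existing` (hypothetical, crude matching rule)."""
    return all(
        min(haversine_m(p, q) for q in existing) <= tolerance_m
        for p in candidate
    )

# An existing OSM road and two incoming roads (made-up coordinates).
existing = [(45.4000, -75.850), (45.4000, -75.849), (45.4000, -75.848)]
close = [(45.4001, lon) for _, lon in existing]  # roughly 11 m away
far = [(45.4004, lon) for _, lon in existing]    # roughly 44 m away

print(likely_duplicate(close, existing))  # matched, so it would be skipped
print(likely_duplicate(far, existing))    # not matched, so it would be imported
```

With a road traced from low resolution imagery, an offset like the
second case is easy to imagine: the matcher sees two different roads,
and only a person looking at the result can tell they are the same one.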
When using any source data, one has to do due diligence in ensuring
that the information being imported into the database is the best
quality data available. If I were to set my GPS up to capture a trace
with one point every 30 seconds, and then blindly use that trace to
replace a high quality version of a road that already exists in the
OSM database, we'd probably hear the same complaints.
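
As a rough sense of scale for that example, the spacing between trace
points is just speed multiplied by the logging interval. The 50 km/h
driving speed below is an assumption picked for illustration.

```python
def point_spacing_m(speed_kmh, interval_s):
    """Distance in metres travelled between two consecutive trace points."""
    return speed_kmh / 3.6 * interval_s

# Assumed speed of 50 km/h, logging one point every 30 seconds
# versus one point every second.
coarse = point_spacing_m(50, 30)
fine = point_spacing_m(50, 1)
print(f"{coarse:.0f} m between points at 30 s, {fine:.0f} m at 1 s")
```

At several hundred metres between points, any curve in the road is
lost between samples.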
The CanVec data is a huge source of data that is available for import
into OSM, but that just means that we have a lot of data to verify as
we import it into the OSM database.
As Richard has mentioned, we have some powerful tools and huge
volumes of data available, but using the tools to import the data in
an ideal way is still an elusive goal. It takes some time and work to
get what we want to happen the way we want it to happen.
James
VE6SRV