[Talk-ca] Sam's summery essay (was Re: Correcting Geobase_import_2009)
ve6srv at gmail.com
Thu Oct 29 00:10:11 GMT 2009
On Wed, Oct 28, 2009 at 1:19 AM, Sam Vekemans
<acrosscanadatrails at gmail.com> wrote:
> This message is directed to the talk-ca list, as it serves as a summary for
> the latest and greatest. In a month or so, I'll be able to summarize in 1
> page. But for now, I've put a lot of thought into the below message, so,
> although long and rambly... it's the best answer I've got :)
Wow, Sam... I made it through the whole spiel, and even stayed with
your thought process through the whole thing... that's a first! 8)
> Here's the low-down. (social impact)
> We respect the integrity of the local area mapper who spent a considerable
> amount of time either tracing from imagery, or tracing from their own GPS
> tracks.... and place this on a HIGHER priority than that of geobase/canvec
> So, again... this is OpenStreetMap, where it's a collaborative community who
> builds the map. ... we respect the integrity of the local area mapper who
> spent a considerable amount of time either tracing from imagery, or tracing
> from their own GPS tracks.... and place this on a HIGHER priority than that
> of geobase/canvec data.
I think this type of statement is what is causing problems. We should
not use a blanket statement that OSM data of any quality is sacred...
OSM data is a living database that everyone can work on. Data that
you, I, or anyone else enters into the database is not locked into the
database never to be modified. Another user can come along and add
tags, modify the way to (hopefully) increase the accuracy of the data,
or even remove the data should the real-world object it represents be
removed or destroyed.
The issue is that data being imported by a bulk import script should
not be blindly imported, damaging or destroying work that has been done
by a real live OSM user. The key concept in that statement is BULK
If a user has the complete GeoBase file for the area and is putting
the time and effort into verifying and checking the GeoBase data
versus the OSM data, and comes up with the conclusion that the GeoBase
data does a better job of describing the way, then they should feel
free to modify/remove the lower quality OSM data, and copy the better
quality GeoBase data into the OSM database.
Another concept to remember is that there does not have to be an
exclusion clause. One does not have to choose to go with only OSM data
or GeoBase data. One could use a high resolution OSM GPS trace based
way, and copy the GeoBase tags onto the OSM way. There's also the
possibility that some of the tags on a low quality OSM way might be
useful if copied onto the higher quality GeoBase way.
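
As a rough illustration of copying tags from one way onto another, here
is a minimal Python sketch. The function name merge_tags and the sample
tag values are made up for this example and are not part of any import
script. It assumes the existing OSM value is kept when both sources tag
the same key differently, and it reports those conflicts so a person can
make the final call.

```python
def merge_tags(osm_tags, geobase_tags):
    """Combine tags from an OSM way and a GeoBase way (hypothetical helper).

    Assumption for this sketch: on a conflict the existing OSM value is
    kept, and the conflict is returned so a human can review it.
    """
    merged = dict(geobase_tags)   # start from the GeoBase tags
    merged.update(osm_tags)       # existing OSM values win on conflict

    # Keys present in both sources with different values need a human look.
    conflicts = {
        key: (osm_tags[key], geobase_tags[key])
        for key in osm_tags.keys() & geobase_tags.keys()
        if osm_tags[key] != geobase_tags[key]
    }
    return merged, conflicts


if __name__ == "__main__":
    # Illustrative values only.
    osm = {"highway": "primary", "name": "Alaska Hwy"}
    geobase = {"name": "Alaska Highway", "surface": "paved"}
    merged, conflicts = merge_tags(osm, geobase)
    print(merged)
    print(conflicts)
```

The same helper works in the other direction: pass the tags of the way
you are keeping as the first argument, whichever source it came from.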
What we need to do is take the best data that we can find from
whatever source is available (that meets OSM guidelines), and merge
that into the database. The bulk import scripts are written to do
that, but only where the decision is easy, based on one simple
test: is there any existing OSM
data at this location? If the answer is no, then import the GeoBase
We need to have real people make the harder decisions where the
GeoBase data and OSM data overlap. That's where we are at in areas
that have been imported, and people are seeing holes between the
GeoBase data, and OSM data.
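
The split between the easy decisions and the hard ones can be sketched
roughly like this in Python. This is not how the actual import scripts
are written; the function names, the bounding-box test and the pad value
are all assumptions made for illustration. Ways are plain lists of
(lon, lat) pairs.

```python
def bbox(points, pad=0.0):
    """Bounding box (min_lon, min_lat, max_lon, max_lat), grown by pad degrees."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return (min(lons) - pad, min(lats) - pad, max(lons) + pad, max(lats) + pad)


def boxes_overlap(a, b):
    """True if two bounding boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def triage(geobase_ways, osm_ways, pad=0.0005):
    """Sort GeoBase ways into 'safe to import' and 'needs a person'.

    Hypothetical rule for this sketch: a GeoBase way is imported
    automatically only if its padded bounding box touches no existing
    OSM way. Anything else goes to manual review and is never
    overwritten automatically.
    """
    osm_boxes = [bbox(way) for way in osm_ways]
    auto_import, needs_review = [], []
    for way in geobase_ways:
        box = bbox(way, pad)
        if any(boxes_overlap(box, osm_box) for osm_box in osm_boxes):
            needs_review.append(way)   # existing OSM data here: a human decides
        else:
            auto_import.append(way)    # nothing here yet: easy decision
    return auto_import, needs_review
```

A bounding-box test is deliberately crude. It errs toward sending too
much to manual review rather than importing over someone's work.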
Feel free to get your fingers dirty... get in there and make an
informed decision about what data to include in the OSM database. Just
don't blindly wipe out existing OSM data to import a bunch of bulk
It's not a GeoBase versus OSM issue, but rather a data quality issue,
and it is up to the OSM community to get in there and determine which
data has the best quality, and if required merge both sources to come
up with an even better final product.
I did the same type of thing when I was tracing hundreds of kilometres
worth of highways with my GPS. I would upload the GPS trace to OSM,
and then manually work my way along the highway checking my trace
versus the OSM way. I would copy the tags from the OSM way to the GPS
based way, I'd chop the GPS trace into pieces where I turned off one
highway, and onto the next. Using aerial imagery, I would insert
bridges, or other things that wouldn't be contained in a GPS trace.
I didn't just wholesale delete every road in the area so I could
upload my data. I used the GPS trace as another source, and using my
knowledge, made the best decisions to improve the OSM database.
Here are some examples...
There was a very rough trace of the Alaska Highway done from the low
resolution Yahoo Imagery available. When I travelled that portion of
the highway on my way to the Maxhamish Lake area, I recorded my GPS
track. I uploaded and converted that track in changeset 718156. I
copied tags from the existing way, and then converted my trace into a
way. I connected the new highway to the existing side roads, and
closed the changeset. Another user, tcjfr, has been poking at the way,
making modifications, and improving the database since then, as can be
seen in the history of way number 27400400.
This is the type of thing that we should be doing with the GeoBase data
where OSM data exists.
Another highway, #77 the Liard Highway (27346793), did not even
exist in OSM when I drove it. I simply used my GPS trace to create the
way, and added my own tags.
This is similar to what the scripts are doing: where no data exists,
just get after it and put new data into the database.
We need to ensure that we don't put out the wrong message! The message
is not "If OSM data exists, GeoBase data has to be thrown away", but
rather "If OSM data exists, we need to make intelligent decisions on
how to incorporate the GeoBase data into the OSM data through a
We need to still ensure that blind bulk imports do not damage or
destroy existing data (from whatever source).