[Talk-ca] Cleaning up after the GeoBase import

Richard Degelder rtdegelder at gmail.com
Sat Jun 13 02:54:28 BST 2009


William Lachance wrote:

Look at this from another angle: Should we split up all the existing OSM
road data that people have put in to add in GeoBase UUID information?

The simple answer is that at some point we are going to have to.

If we want to add the attributes available from GeoBase, and to be able to
update them from future GeoBase releases, then we are going to have to find a
way to add the GeoBase UUID information and, to do so, split the ways into
the same segments that GeoBase uses.  Whether we do it manually, which will
be a lot of work, or someone develops a script to do it, we are going to have
to do it to use the data available from GeoBase for the map.  Not doing it
means we cannot import the GeoBase UUID and cannot benefit from the other
attributes available within the GeoBase data.  That will leave us with two
classes of roadways: those that have all of the current data available, and
those that will require users to manually add any additional details in
order for them to appear.
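
As a rough illustration of what such a script would have to do, the sketch
below splits one user-drawn way at the nodes where GeoBase segments begin and
end, then gives each piece its own UUID while keeping the user's tags.  It is
only a sketch under assumptions: the tag key geobase:uuid, the helper names,
and the sample UUID strings are hypothetical, and the hard part (matching OSM
nodes to GeoBase segment ends) is assumed to have been done already.

```python
def split_way(node_ids, break_nodes):
    """Split an ordered list of node ids at interior break nodes.

    Consecutive pieces share the break node, so the road stays connected.
    """
    pieces = []
    current = [node_ids[0]]
    last_index = len(node_ids) - 1
    for index, node in enumerate(node_ids[1:], start=1):
        current.append(node)
        # Only interior nodes can start a new piece.
        if node in break_nodes and index < last_index:
            pieces.append(current)
            current = [node]
    pieces.append(current)
    return pieces


def tag_segments(node_ids, tags, break_nodes, uuids):
    """Return one way per GeoBase segment, each carrying its own UUID.

    The key "geobase:uuid" is an assumed tag name used for illustration.
    """
    pieces = split_way(node_ids, break_nodes)
    if len(pieces) != len(uuids):
        raise ValueError("expected one UUID per resulting segment")
    return [
        {"nodes": piece, "tags": {**tags, "geobase:uuid": uuid}}
        for piece, uuid in zip(pieces, uuids)
    ]


# Hypothetical example: one user way that GeoBase treats as two segments.
segments = tag_segments(
    [1, 2, 3, 4, 5],
    {"highway": "residential", "name": "Main Street"},
    {3},
    ["uuid-a", "uuid-b"],
)
```

Each resulting way keeps the original name and highway tags, so nothing the
user entered is lost; only the segmentation and the UUID are new.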

At some point we are going to have to go through the entire data set within
Canada in OSM and look to add the GeoBase UUID and any additional data from
GeoBase.  For data submitted by users we will not need to give attribution
to GeoBase, nor record the method by which the GeoBase data was acquired,
but we will have to insert the GeoBase UUID for those segments.  From that
point on, those ways originally created by users will be handled exactly the
same as every other segment originating from GeoBase.
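
A minimal sketch of that tagging rule follows.  The user's own tags always
win, the UUID is always inserted, and the attribution and acquisition-method
tags are left out because the geometry came from the user.  All tag keys
shown (geobase:uuid, geobase:acquisitionTechnique, attribution, source) and
the function name are assumptions for illustration, not an agreed schema.

```python
# Tags describing where GeoBase's own geometry came from; assumed key names.
NOT_COPIED = {"attribution", "source", "geobase:acquisitionTechnique"}


def merge_geobase_tags(user_tags, geobase_tags):
    """Add the GeoBase UUID and extra attributes to a user-created way."""
    merged = dict(user_tags)
    for key, value in geobase_tags.items():
        if key in NOT_COPIED:
            continue  # the geometry is the user's, so no GeoBase provenance
        if key == "geobase:uuid":
            merged[key] = value  # always inserted
        else:
            merged.setdefault(key, value)  # never overwrite the user's value
    return merged


# Hypothetical example values.
merged = merge_geobase_tags(
    {"highway": "residential", "name": "Main Street", "source": "GPS"},
    {
        "geobase:uuid": "uuid-a",
        "attribution": "GeoBase",
        "geobase:acquisitionTechnique": "Orthophoto",
        "name": "Main St",
        "lanes": "2",
    },
)
```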



Michael Barabanov suggested:

Let me be a devil's advocate here for a while.  The 2 alternatives
that make more sense are

- Delete the existing street data and start fresh with GeoBase, and always
  maintain the UUIDs properly. Given how many inconsistencies/topology
  errors there are in OSM data for Greater Vancouver, it may just be
  the right thing to do.
  Otherwise we'll be spending months joining existing OSM data to newly
  imported segments. Not a great way to spend resources.

- Don't pay attention to preserving UUIDs/segments, and re-run
  RoadMatcher when we need to import more data.

What we're doing right now is neither here nor there. No benefit of
GeoBase/UUIDs for existing OSM data and confused renderers for the
imported GeoBase ones.

Michael.

I would suggest that we do neither.  Although the GeoBase data is generally
very good, frequently excellent, there are times when it is wrong or less
than ideal.  This can be for a number of reasons, many of them outside the
control of GeoBase.  Where we have had good people working on OSM,
using GPS tracks as the basis of their work, we usually also get pretty good
quality, though perhaps not the exacting quality that GeoBase has the
potential to reach.  When we started the import we agreed that we would
preserve the user
generated data as much as possible.  One of the guidelines for mass imports
also is to preserve user generated data.  Where the user generated data is
faulty or wrong then we are going to replace it, in the same way that we
would if we were manually editing the data.

One of the reasons that the GeoBase data may not accurately reflect what is
really there is that it is only updated periodically.  Thus new roadways,
changes to existing roadways, and other changes are only going to be
reflected in the GeoBase data sporadically.  It takes time for the original
data to be transferred up from the municipal level, where much of it
originates, through the provincial level and finally into GeoBase itself, and
then for GeoBase to release a new update for the province.  We want to have
users continue to map and add to the map to keep it really up to date.

Some of the data within GeoBase is also pretty old and generated from poor
quality sources, some similar to the Yahoo! satellite imagery that we also
have access to.  If we can get a local mapper to correct the data then it
can be better than the original GeoBase data and improve the map.  I will
frequently take good locally generated data over that from GeoBase and so I
would be very reluctant to support the idea of wiping out locally generated
data in favour of that from GeoBase.

If, on the other hand, we ignore the preservation of the UUIDs and segments
and instead merge them, then we are going to go through a great deal of work
every time we want to add a new feature from GeoBase.  Are we going to create
the new GeoBase compatible data set on the OSM servers, and then merge the
data again afterwards?  The history of the areas we do that in is going to
take up far more resources than keeping the current GeoBase UUID data as
part of the map.

Richard Degelder