[Imports] [Talk-ca] The great UUID debate (Was Re: 092G area)

Sam Vekemans acrosscanadatrails at gmail.com
Wed Oct 14 20:23:15 BST 2009


Thanks everyone for the comments. :)

I have updated the readme.txt file. It will be included in each of the .zip
files, which contain the source SHP files, the converted .osm files, the
various rules.txt files, and the changelog.txt file.

It's available here.
http://docs.google.com/View?id=d2d8mrd_261qb4kfqdv

I paste it here for comments.  I can also add a wiki link to it, so that a
revised version can be made more generic and internationally understood, and
can be considered for all imported data.   (I have also included the 1st
part, as it is an important part of the discussion too.)

A lot could be said in MUCH fewer words, I know :-)

*******
re: CanVec Code retention

The CanVec code is used as an indicator for finding specific geographical
features within the Natural Resources Canada dataset.   It is a way to find
all the similar features (each with its own unique UUID & OSM reference
number), and to change the key/value of each as needed.   For example,
suppose it is found that there is a better way to represent the map feature
canvec:CODE=200041: where it currently says natural=water (example not
real), it should say natural=stream.   This change should affect only the
features with code 200041, since throughout the Canada data that code refers
to the same thing.   There is another canvec:CODE which refers to 'lakes',
so those tags do NOT need to be changed (it's a different feature, with a
different CODE).
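
As a rough illustration of the retagging idea above, here is a minimal
Python sketch. It assumes the features have already been loaded into plain
dictionaries; the feature list, the function name retag_by_code and the
second code value 999999 are all made up for the example, and the
water-to-stream change is the same "not real" example used in the text.

```python
# Hypothetical in-memory features; a real run would read them from an .osm file.
features = [
    {"id": 1, "tags": {"canvec:CODE": "200041", "natural": "water"}},
    {"id": 2, "tags": {"canvec:CODE": "200041", "natural": "water"}},
    # A made-up code standing in for 'lakes'; it must stay untouched.
    {"id": 3, "tags": {"canvec:CODE": "999999", "natural": "water"}},
]


def retag_by_code(features, code, key, new_value):
    """Set tags[key] = new_value on every feature whose canvec:CODE equals code.

    Returns the ids of the features that were actually changed.
    """
    changed = []
    for feature in features:
        tags = feature["tags"]
        if tags.get("canvec:CODE") == code and tags.get(key) != new_value:
            tags[key] = new_value
            changed.append(feature["id"])
    return changed


changed_ids = retag_by_code(features, "200041", "natural", "stream")
print(changed_ids)
```

Only the two features carrying code 200041 are modified; the feature with
the other code keeps natural=water.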

HOWEVER, over time, once it is known that the source CanVec data will no
longer change how it lists the data, this tag could be removed from the OSM
database.   Until then, the canvec:CODE is used as a reference back to the
source database, to know exactly where a particular feature came from.
(For example, the roads from GeoBase NRN would be just 1 of the many CODES,
as CanVec is a collection of many different map features, whereas GeoBase
Roads is just roads, with different types of road features.)


*******
re: UUID retention

The UUID (Universally Unique Identifier) is an alphanumeric code (for
example bb02c686968311d99c9f000ea65e52d8) which is attached to each of the
map features, to identify it within the source data.

This is similar to OpenStreetMap, where every node, way & polygon also has
its own ID, which is automatically generated as each node, way or polygon is
created, for example "Node: 452814802".  The ID is used to identify the
data, and also to compare old data sets with new ones, e.g. comparing a
back-up copy of an OSM file with the current OSM data.

Although the UUID has no DIRECT use within OpenStreetMap itself, it does
have value for working with the source datasets.

The primary purpose of the UUID is comparison: when new data becomes
available from the same data source, the UUID helps in deciding which of
that new data should be used to improve OpenStreetMap.
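
To make the comparison idea concrete, here is a minimal Python sketch of a
UUID-keyed comparison between two releases of the same source. It is only
an illustration under stated assumptions: each release is assumed to be
loaded into a dictionary mapping UUID to tags, and the function name
diff_by_uuid, the short UUIDs and the tags are made up.

```python
def diff_by_uuid(old, new):
    """Compare two releases keyed by UUID.

    old and new map UUID -> feature data (any value comparable with !=).
    Returns (added, removed, changed) as sorted lists of UUIDs.
    """
    old_ids, new_ids = set(old), set(new)
    added = sorted(new_ids - old_ids)      # only in the new release
    removed = sorted(old_ids - new_ids)    # only in the old release
    changed = sorted(u for u in old_ids & new_ids if old[u] != new[u])
    return added, removed, changed


# Made-up UUIDs and tags, for illustration only.
old_release = {
    "aaa1": {"natural": "water"},
    "bbb2": {"highway": "residential"},
    "ccc3": {"natural": "wood"},
}
new_release = {
    "aaa1": {"natural": "water"},
    "bbb2": {"highway": "tertiary"},
    "ddd4": {"waterway": "stream"},
}

added, removed, changed = diff_by_uuid(old_release, new_release)
print(added, removed, changed)
```

The three lists correspond to features to add, features to remove, and
features whose data differs between the two releases.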

It is noted, however, that any changes made in the OSM database after the
initial import are to be RETAINED.   They can be modified only with the
DIRECT consent of the original contributor, OR if the modification would be
a 'significant' map improvement (just as a local mapper doesn't need to
contact everyone for every change, as long as the 'spirit of OSM' is
retained, that is, to make the map better).

Because a VAST MAJORITY of the geographical area of Canada will either
remain totally untouched in OSM, or will just have extra map features added
to it, retaining the UUID would be of help when looking at new data
available from that same source (in this case, Natural Resources Canada).

Although (at this time) the exact method of comparing new data against
existing OSM data which has been largely untouched has not been explained
with an example, the decision (to retain the UUID or not) needs to be made
in advance.

SINCE the ONLY current method (possibility) for comparing future datasets is
a DIRECT COMPARISON of the geographic locations of the map features.   This
is possible using the OpenJUMP AutoMatch feature, with the results manually
reviewed as errors are highlighted.
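
For illustration, a direct comparison by location could look roughly like
the Python sketch below. This is NOT how OpenJUMP AutoMatch works
internally; it is a simple nearest-point match on made-up coordinates, and
the function names and the 10 metre threshold are assumptions. Anything
left unmatched would go to manual review.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def match_by_location(existing, incoming, max_dist_m=10.0):
    """Pair each incoming point with the nearest existing point within max_dist_m.

    existing and incoming map an id -> (lat, lon).
    Returns (matches, unmatched): matches maps incoming id -> existing id,
    unmatched lists the incoming ids that need manual review.
    """
    matches, unmatched = {}, []
    for new_id, (lat, lon) in incoming.items():
        best_id, best_d = None, max_dist_m
        for old_id, (olat, olon) in existing.items():
            d = haversine_m(lat, lon, olat, olon)
            if d <= best_d:
                best_id, best_d = old_id, d
        if best_id is None:
            unmatched.append(new_id)
        else:
            matches[new_id] = best_id
    return matches, unmatched


# Made-up points, for illustration only.
existing = {"n1": (49.25, -123.10), "n2": (49.30, -123.00)}
incoming = {"a": (49.25001, -123.10001), "b": (50.0, -120.0)}

matches, unmatched = match_by_location(existing, incoming)
print(matches, unmatched)
```

Note that this sketch checks every pair of points and may match several
incoming points to the same existing one, so it only suits small test
areas. It also shows why a UUID lookup is simpler than matching on
geometry.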

CONSIDERING that this is a one-to-many ratio (only 1 person is actually
doing the conversion, only a few people are doing the vast majority of the
'copying in', and a large number of people are viewing the available .osm
files and adding features locally).

THEREFORE, it is decided that the next revision of the canvec-to-osm.bat
script (for the Natural Resources Canada GeoGratis product CanVec), as well
as the scripts for the other Natural Resources Canada GeoBase products
(geobaseNHN-to-osm) and future scripts, will include the UUID tags.

The tags are prefixed with the source:
geobase:UUID=bb02c686968311d99c9f000ea65e52d8
and
canvec:uuid=bb02c686968311d99c9f000ea65e52d8
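
As a sketch of how such tags could be read back out of a converted file,
the Python snippet below uses the standard library XML parser on a made-up
.osm fragment. The fragment, the negative ids and the function name
uuids_by_element are assumptions for illustration; the files produced by
the scripts may look different.

```python
import xml.etree.ElementTree as ET

# A made-up .osm fragment; real files produced by the scripts may differ.
OSM_XML = """<osm version="0.6">
  <node id="-1" lat="49.25" lon="-123.10">
    <tag k="canvec:uuid" v="bb02c686968311d99c9f000ea65e52d8"/>
    <tag k="natural" v="water"/>
  </node>
  <node id="-2" lat="49.26" lon="-123.11">
    <tag k="natural" v="tree"/>
  </node>
</osm>"""

# Tag keys exactly as written in this readme.
UUID_KEYS = ("geobase:UUID", "canvec:uuid")


def uuids_by_element(xml_text):
    """Map each element id to its source UUID, for elements that carry one."""
    found = {}
    root = ET.fromstring(xml_text)
    for element in root:
        for tag in element.findall("tag"):
            if tag.get("k") in UUID_KEYS:
                found[element.get("id")] = tag.get("v")
    return found


result = uuids_by_element(OSM_XML)
print(result)
```

Elements without a UUID tag, such as features added locally by mappers, are
simply left out of the result.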

AND THEREFORE, ONCE it is found that the updates available from the data
source are small enough (these 'DIFF' files being the list of features to
remove, along with the list of features to add), and with confirmation from
the data source provider that the updates will be small, THEN the updates
can be done manually, and the UUID tags can be removed.

*******