[Talk-ca] Question about NIDs

Richard Degelder rtdegelder at gmail.com
Wed Jan 21 23:43:21 GMT 2009

On Wed, 2009-01-21 at 12:26 -0800, Sam Vekemans wrote:
> Hi all,
> Here's IMO, based on the last digest.
> I tried to cover that with my video conference trial-run, and got
> stuck.
> Lets look at the facts.
> The RoadMatcher script AFAIK does NOT actually look at the NID when
> looking at existing OSM data and deciding on whether or not it should
> be added. ... is this true?

True, the RoadMatcher application does not look at any of the attributes
of the OSM way or the GeoBase Roadsegment.  What it looks for is the
location of nodes that define a roadway, and it does so with a bit of
leeway, so if the nodes in the OSM way and GeoBase Roadsegment are not in
the exact same place but are close it will say that they represent the
same node.  It also does not require the same number of nodes, or that
they be in the same place; as long as the basic profile of the roadway
within OSM and GeoBase matches, RoadMatcher considers it a match.
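
To make the idea of matching "with a bit of leeway" concrete, here is a
rough sketch in Python.  It is NOT RoadMatcher's actual algorithm, which
I have not looked at in that detail; the function names, the planar
(x, y) coordinates in metres and the 10 m tolerance are all assumptions
made up for illustration.  Two lines count as the same road if every
node of each lies within the tolerance of the other line, regardless of
how many nodes each one has.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b; all are (x, y) in metres."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: both ends are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its two ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_line(p, line):
    """Shortest distance from p to a line given as a list of 2+ nodes."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))

def roughly_same_road(line_a, line_b, tolerance=10.0):
    """True if each line's nodes all lie within `tolerance` of the other."""
    return (all(distance_to_line(p, line_b) <= tolerance for p in line_a)
            and all(distance_to_line(p, line_a) <= tolerance for p in line_b))

# Hypothetical data: an OSM way with three nodes and a GeoBase
# roadsegment with two nodes that follow the same profile.
osm_way = [(0, 0), (50, 2), (100, 0)]
geobase_segment = [(0, 1), (100, 1)]
other_road = [(0, 50), (100, 50)]

print(roughly_same_road(osm_way, geobase_segment))  # True
print(roughly_same_road(osm_way, other_road))       # False
```

Note that nothing here looks at a name, a road type or an NID; only the
geometry is compared.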

> For updates, the road matcher script can be used, and looks at all the
> OSM data (imported & existing) and treats the existing AS IF it were
> OSM created. It only looks at its  coordinates, the osm roadtype &
> road name tags.

Roadmatcher looks for the roadways to match and really never considers
the extra data, such as road name, number of lanes, GeoBase NID, etc,
that goes along with it.  That is all meta data, something that is
important but not for the function of RoadMatcher.

So sure, RoadMatcher can be used during an update, but I doubt that it
will be the prime tool for the update.  It will be used to locate
roadways that have been added since the last update/import, by comparing
the current OSM data against the GeoBase file, or more probably a GeoBase
update file.  In this respect it will be used in much the same way it is
right now, although there will be much more OSM data to look at and there
will be very few new roadways that are not going to be matched between
the OSM file and the GeoBase update.

For updating things like street names where they are not available, but
we have imported the GeoBase NIDs into OSM, it is likely easier to write
a new script that creates a new attribute, and possibly checks for one
first, with the appropriate data for that NID.  It can be a simple
search that looks for an NID and then, after ensuring that the data does
not already exist, inserts a new attribute for that way.
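
A minimal sketch of such a script, in Python, might look like the
following.  No such script exists yet, so everything in it is an
assumption: the "geobase:nid" tag key, the dictionary layout for ways
and the placeholder NID values are hypothetical and only there to show
the "look up the NID, check first, then insert" logic.

```python
def fill_missing_tag(ways, geobase_by_nid, nid_key="geobase:nid", tag="name"):
    """Add `tag` to every way whose NID GeoBase knows, never overwriting.

    ways:           list of dicts, each with a "tags" dict (assumed layout)
    geobase_by_nid: dict mapping an NID to the GeoBase attributes for it
    Returns the number of ways that were changed.
    """
    changed = 0
    for way in ways:
        tags = way["tags"]
        nid = tags.get(nid_key)
        if nid is None:
            continue  # no NID on this way, nothing to look up
        if tag in tags:
            continue  # attribute already exists, leave it alone
        value = geobase_by_nid.get(nid, {}).get(tag)
        if value:
            tags[tag] = value
            changed += 1
    return changed

# Hypothetical data; the NIDs are placeholders, not real GeoBase values.
ways = [
    {"id": 1, "tags": {"highway": "residential", "geobase:nid": "nid-0001"}},
    {"id": 2, "tags": {"highway": "residential", "geobase:nid": "nid-0002",
                       "name": "Old Mill Rd"}},
    {"id": 3, "tags": {"highway": "residential"}},
]
geobase_by_nid = {
    "nid-0001": {"name": "King Street"},
    "nid-0002": {"name": "Old Mill Road"},
}

changed = fill_missing_tag(ways, geobase_by_nid)
print(changed)                   # 1
print(ways[0]["tags"]["name"])   # King Street
print(ways[1]["tags"]["name"])   # Old Mill Rd (left as the mapper wrote it)
```

The check before the insert is the important part: a name that a mapper
has already entered is never replaced by the GeoBase value.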

> For situations where road matcher got flustered and didn't import the
> road, but should have.  This is why the original import file is made
> available to use;
>   We can load the GeobaseOrigional file, and compare it with existing
> data. ... just by having (in JOSM) both layers on, (with GeoBase as
> the shadded one) you can pan around the area your working on and right
> away see what needs to be added.   You then just select those items,
> then copy. .. then hide that layer, then paste on the osm layer. then
> upload changes.

Steve made the original import file with all of the roadways, both those
matched and those that are standalone, available not so much to use for
adding the roadways that are not currently within OSM but for a
reference.  There is really no reason for the complete file to be made
available or to be used for the update/import.  Once we are comfortable
with the process the only file that is going to be needed is the
standalone file, since it has all of the required changes.  The
standalone file may, on occasion, miss a street that should be in it
because it is not already represented within OSM but that is a minor
inconvenience, especially since there are really very few of them.  In
fact, while doing some testing, I found that there are more roadways
that are in neither OSM nor GeoBase because the streets are too new to
have had a chance to propagate through the system to already be in
GeoBase, than were missing from the standalone file that should have
been there.  There are also more streets within OSM that are not yet
available within GeoBase as well.

As for the technique you describe for using JOSM the file you should be
using is the standalone file that Steve creates.  Your technique is good
if you were to want to manually import the GeoBase data.  But Steve
takes the standalone file and imports it directly into OSM so at that
point all of the current OSM data and the new GeoBase data are together
within OSM.  You are very unlikely to find anything that was missed.

As Steve becomes more proficient at tuning, and to a degree
understanding, RoadMatcher he is going to be able to have it get better
at finding all of the missing roadways from OSM, that do already exist
within GeoBase, without having any duplicates and then have them
imported into OSM with another script.  At this point, and Fort McMurray
was really a test case because it is fairly small and has a single person
who has uploaded the vast majority of the data and is willing to
experiment with the import, we are still finding what it will take to
make RoadMatcher as efficient as possible and see what issues are going
to arise from any import attempts.

I have at the same time been playing with some files from RoadMatcher
that Steve generated for me and I am trying to find why RoadMatcher
fails to match existing roadways within both data sets.  I am not having
a lot of luck because I am finding very few roadways that should have
been matched that are not and I am really finding no duplicated roadways
that are showing up in the standalone file.  I am using the basic
technique that you described for using JOSM, except I am tracing over
the standalone file in the same way as if it was Yahoo! satellite
imagery, and then importing it into OSM for further attempts to match
the roadways.  My goal is to have a file with only those roadways that
are problems show up in the standalone file.  From there we can see how
to improve RoadMatcher to catch those as well.  The area that I am
looking at is much more extensively mapped than Fort McMurray, however,
and so is going to be a better test for looking at when we are going to
import GeoBase data into heavily mapped areas like Toronto.

We are not going to be able to ignore the well mapped places like
Toronto, Montreal, Vancouver, etc for a number of reasons.  Although the
areas are better mapped than most of the country, and so really benefit
from the GeoBase import less than the very sparsely mapped regions of
Canada, there are still things that are being missed even in these
areas.  And the wealth of extra data that GeoBase currently offers, and
especially will offer in the future, will benefit these areas as well.

> So this has nothing todo with the NIDs.  ... NID's are only there for
> making that 1st conversion, when running the script to make the OSM
> file.  ... once it's and OSM file, it's the OSM tags that are
> relevant.

The NIDs really have no relevance for anything other than GeoBase.  They
have no relevance for OSM or our current import.

Think of the NID as the serial number for a car, the VIN.  There are
really very few times that you would ever refer to it.  Your car is not
the 1FALP......123456 car but the "blue Taurus parked over there".  Even
your mechanic refers to your car as Sam's blue Taurus.  But Ministry of
Transportation for your province wants to have the VIN when they give
you a set of license plates, or renew them, or to transfer ownership.
If the police really want to determine who owns the blue Taurus they
will run the VIN.  And the VIN is used to record the history of the car.

Most people will never even know where to locate the VIN, although if
you remind them they will remember, so in almost all cases the number is
irrelevant.  The same is true for the GeoBase NID.

OSM has no real need for the NID, it does not do anything within OSM and
nothing within OSM knows how to use it.  As far as OSM cares it is just
another little piece of data that someone added and it will never pay
attention to.  OSM does not know how the NID was created or generated
and it does not know the source for it.

Where the NID is relevant is if we ever want to do an update, with new
data, from GeoBase.  GeoBase is the only entity that really uses the NID
and they use it in the same way as a VIN for a car.  All of their data
gets tied to an NID, the road name, address range, postal code in the
future, Canadian Census data, etc.  If we ever want to add more data to
the map from GeoBase then we are going to have to have the NID as one of
the attributes for the way within OSM.

If we are only going to treat the GeoBase import as a one time event,
once we have finished going across the country we no longer pay
attention to what GeoBase has to offer, then we do not have to consider
the NID either.

RoadMatcher does not pay attention to the NID nor does OSM.  The NID is
only valuable for GeoBase and it identifies a particular roadsegment as
unique for their purposes.  It is the NID that GeoBase uses to identify
a street, or more likely a portion of a street, for everything.  Main
Street in any city, and even every block of every Main Street, has a
unique NID which allows GeoBase to know exactly where they are talking
about.  The GeoBase NID is the VIN for portions of streets with no two
portions ever having the same NID, just as every vehicle has a unique
VIN and referring to a VIN identifies a unique vehicle.

> Re:  pieces of ways in a line

The problem is that an NID is for the distance between two intersections
on a roadway.  That can be a very short distance or it can be very long,
depending entirely on the distance between intersections.  Because the
NID is for a particular block, rather than the entire street, we have to
be able to differentiate which portion of the street the particular NID
is for.

> Because it's only the geobase ways that are not already in OSM that we
> are dealing with, it would be fine to be joining the ways as longer
> road segments.   ... this is conforming to OSM standards.

The import is concerned with both the GeoBase roadsegments, these are
the things we want to import if they are not already in OSM, and the OSM
ways because these are the things we need to compare the GeoBase
roadsegments to see if they already exist.

My issue is that if we are going to use the NIDs in the future we are
going to have to be able to determine where they actually apply to.  If
we can add relationships that limit the extent of the NID then we can go
with ways that are as long as the roadway.  If we cannot determine an
effective and efficient means to add relationships onto a longer way
then we are going to have to break the ways into the same size as the
GeoBase roadsegments and use each to hold the NID.  If we choose to do
neither then the NIDs become meaningless for future reference and the
GeoBase import becomes a one time event because it will be just too
difficult to go through the process of determining where the NIDs should
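
The two options can be sketched in Python as follows.  This is only an
illustration under assumptions: the list of (NID, start index, end
index) extents stands in for whatever relationship we might attach to a
long way, and the "geobase:nid" tag key, the node ids and the NID values
are all hypothetical.

```python
def nids_for_index(extents, index):
    """Option 1: keep one long way and look the NID up by node position.

    extents: list of (nid, start_index, end_index) over the way's nodes.
    An intersection node shared by two blocks belongs to both extents.
    """
    return [nid for nid, start, end in extents if start <= index <= end]

def split_way_by_nid(node_ids, extents):
    """Option 2: break the long way into one piece per GeoBase NID."""
    pieces = []
    for nid, start, end in extents:
        if not (0 <= start < end < len(node_ids)):
            raise ValueError(f"bad extent for {nid}: {start}-{end}")
        pieces.append({
            "nodes": node_ids[start:end + 1],  # both end nodes included
            "tags": {"geobase:nid": nid},
        })
    return pieces

# Hypothetical street: six nodes, with an intersection at the third node
# dividing it into two blocks and therefore two NIDs.
main_street = [101, 102, 103, 104, 105, 106]
extents = [("nid-A", 0, 2), ("nid-B", 2, 5)]

print(nids_for_index(extents, 4))   # ['nid-B']
print(nids_for_index(extents, 2))   # ['nid-A', 'nid-B']

pieces = split_way_by_nid(main_street, extents)
print(pieces[0]["nodes"])           # [101, 102, 103]
print(pieces[1]["nodes"])           # [103, 104, 105, 106]
```

Either way the information that has to survive is the same: which run
of nodes each NID covers.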

> OSM does NOT need to conform to GeoBase standards at all.  This is a
> one-way import.

Within any context OSM is far larger than GeoBase and we are taking what
we want from GeoBase to augment OSM.  In reality both are going to live
long prosperous careers without each other and neither is dependent on
the other.  GeoBase just makes a relatively complete map of Canada
happen a lot sooner but we are not dependent on GeoBase.

There is one aspect in which we must conform to GeoBase, and then only
if we want to update the map using GeoBase data in the future, and that
is in respect of the NID.  GeoBase has a very specific use for the NID
and if we want to do updates and augment the map further in the future
we must do it with reference to the GeoBase NIDs.  And because OSM has
no use for the GeoBase NID we are not worrying about conflicts between
OSM and GeoBase over the NID.  For OSM the NID attribute is just another
attribute that the renderers do not understand and so ignore.

> Conclusion:
> For updates, because AFAICT the script looks at it's 1 - relative
> coordinates, 2 - road name and 3 -road type to see a match.   There is
> no way that it can create a new NID for existing ways, then compare
> that.  The would be illogical.

Because we do not yet have any update scripts it could do that.  What is
more likely is that if the data that we want to update the map with,
let's say Ontario gets street names and we want to populate the streets
with street names, we would first run RoadMatcher against the OSM map to
find any new roadways that are not currently within OSM but available
from GeoBase.  These new roadways would be added to the OSM map to make
it a little more complete.

Then we would have to determine which roadways were added by users and
not given a GeoBase NID, or where a user edited the NID or deleted it.
These would be given their NID, or given a new one if it never had one
previously and GeoBase has one for it.  At this point all roadways would
have a NID assigned to it by GeoBase.  You are right that we would not
want to create a new NID for ways that do not have one assigned to it by
GeoBase.  There is also, unfortunately, no way to determine if the
GeoBase NID is valid or not with a simple check.

At this point the new GeoBase data would be added based on the NID that
is found for that portion of the way.  So if 123456890abcde1... is Main
Street then we would first check that there is not an attribute for name
and if there is not then add a new attribute for the way with "name:
Main Street".  Then we go to the next way, or portion of the way, with
the next GeoBase NID.

In fact, when we are running update scripts I do not want to have the
script function as you described at all.  I do not want to have it look
for the road name, or the road type.  If a user decided that the road
name was incorrect, or that it had been changed, and corrects it I want
to be certain that those changes are retained in the future.  I do not
want to have any script change an attribute that a user has created or
modified.  The same goes for road type.  If I am going to be standing
there looking at a roadway, or I have been driving it, I feel that I am
a better judge of the road type than someone who is trying to find an
appropriate category for it but has never been on it or seen it before.
I trust the users of OSM to be a better judge for the attributes than
GeoBase and I am also pretty certain that if a user marks the attribute
wrong that someone will come and fix it sooner or later.

> For future post-codes import;
> The script would need to look up the relative coordinates, road name,
> road type and add in this feature as a NODE.
> If the post-codes are available the as polygons, then GREAT the
> polygons would be able to be imported with no interference... the
> borders of the polygon would be boundary lines anyway (not roads)

You are probably right but at the same time if GeoBase uses the NID as
an identifier then why not use it?

> For future house-numbers
> The best way to deal with it, is to import it as nodes. .. and if the
> node doesnt line up with the road. .. no big deal... as long as it's
> tagged right, the search engine should be able to pick it up.

GeoBase is going to DEFINITELY be using the NIDs for this and if we
ignore the NIDs with the import process then we are going to lose the
ability to import this data from GeoBase.

> For road names:
> I think that Roadmatcher would be able to pick it up, as people could
> be importing the roads (of Ontario) without names. ... but we EXPECT
> or HOPE that those who import data, ONLY import the roads that they
> intend on adding the NAME tag. .. so the script should have no problem
> looking at it, so where the road name was spelled differently.  The
> OSM version would Trump it.  ... no NID's would need to be added to
> OSM data.

Most of the south half of Burlington ON was traced from the Yahoo!
satellite imagery and has no names.  I added a few but that makes up a
large portion of the named streets.  I have seen the same thing happen
in Toronto where there are unnamed streets throughout the city.  And
most of the Niagara Peninsula, which has been extensively mapped
recently, does not have street names.  That is going to be a big cleanup
job for where I know about and I imagine that it is the same throughout

Having the GeoBase data provide names is going to fill in a lot of names
throughout Canada.  Recently I have been adding a lot of roadways
without names but that is generally an anomaly for me because I hate
doing it.  Also, street names from GeoBase could allow us to catch when
a roadway has a name change in an area that has few mappers.  It would
benefit us in exactly the same areas where we are now looking at
importing almost all of the roadways because there is nothing there now
and no mappers to fill it in.

> Hope this makes sense,  (had to sleep on it) :-)

The NIDs have value only if we are looking to augment OSM from GeoBase
in the future, after the initial import.  If we are not going to then
importing them is not required.  But everything from GeoBase references
the NID and so if we are ever going to go back to GeoBase and retrieve
additional data for the areas we have done then it is going to be
imperative that we find a way of treating the NIDs in exactly the same
way as GeoBase does.

The NID is an address, an address for an entire block, in the same way a
house has an address.  Sending mail to Sam, with no further address or
means to identify you and where you are, is not going to be effective.
You are not likely going to get an awful lot of the mail intended for
you and you will get a tremendous amount meant for other Sams.  If I
throw a letter into the mailbox with the address being just "Sam" I can
count on it being thrown out eventually and long before you could see

> Cheers,
> Sam Vekemans
> Across Canada Trails

Sorry I missed your broadcast but I had a meeting I wanted to attend.  I
hope it went well.

And James; I have never been able to move things between layers in JOSM
and my method is to trace over the things I want.  The attributes would
then be copied by hand if I wanted them, which is a pretty tedious
process and so not something I would want to do for the entire import.

Of course merging the layers brings things between them but then
everything gets merged and there is not the ability to select what you

Richard Degelder
