[Talk-ca] OSM GeoBase import: giving a try
Richard Degelder
rtdegelder at gmail.com
Mon Feb 16 01:13:42 GMT 2009
Frank,
Welcome to the project.
> Yes please try to keep nids for data that actually comes from geobase.
> Not importing could limit us in the future.
In what way would it limit us? When we'll receive a new dataset from Geobase? Or
do you hint towards other datasets which are linked to the NID? In that case
that additional data can't be linked to existing database, because that doesn't
contain the NID attributes.
The reason for keeping the NIDs intact is to support any future updates and imports coming
from GeoBase, and apparently GeoGratis and CanVec. For OSM itself this data is
meaningless; the renderers have no idea how to deal with it and so ignore it. Other
imports, unless they carry the same NIDs, are not going to use the data either.
GeoBase regularly updates the data sets and is looking to complete the current
data sets for all provinces. Their reference value is the NID. As long as we
ensure that the NID is present we are going to be able to easily incorporate those
updates when they appear.
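To illustrate why a preserved NID makes later updates easy, here is a minimal sketch that
compares what was imported against a newer GeoBase release, keyed by NID. The function name
`diff_by_nid`, the NID strings, and the attribute dictionaries are all made up for the
example; how the NID is actually stored on an OSM way is not assumed here.

```python
def diff_by_nid(existing, update):
    """Classify NIDs by comparing two {nid: attributes} mappings.

    `existing` is what was imported earlier, `update` is the newer release.
    Both are hypothetical in-memory structures, not a real GeoBase format.
    """
    added = sorted(nid for nid in update if nid not in existing)
    removed = sorted(nid for nid in existing if nid not in update)
    changed = sorted(
        nid for nid in update
        if nid in existing and update[nid] != existing[nid]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Made-up NIDs and attributes, for illustration only.
imported = {
    "nid-0001": {"name": "Main Street", "lanes": 2},
    "nid-0002": {"name": "King Street", "lanes": 2},
    "nid-0003": {"name": "Old Mill Road", "lanes": 1},
}
new_release = {
    "nid-0001": {"name": "Main Street", "lanes": 2},
    "nid-0002": {"name": "King Street", "lanes": 4},
    "nid-0004": {"name": "Bypass Road", "lanes": 2},
}

print(diff_by_nid(imported, new_release))
```

Without the NID on the imported ways, the same comparison would have to fall back on
matching geometry, which is much harder to get right.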
>> b. How can we guarantee that the final import will be consistent?
>> (See also
>> my first question.)
>
> Depends what you mean by consistent.
> 1. Using consistent tags for things, the best way to ensure this is to
> look at what others are doing and share your scripts.
>
> 2. data consistency, ie roads line up and are joined between different
> imports or between the existing and geobase data. Right now we have no
> automated consistency tools, we are depending on people to manually line
> up/connect ways post import.
Both meanings were intended. I wonder if automatic consistency will work. It
seems to have too high a chance of failure.
Canada is too big, and there is too much data, to incorporate everything manually. The more
we automate, the more quickly and effectively we can incorporate the available data.
The problem is finding an effective way to automate as much as possible without
introducing too many errors. Not everything that we are going
to want is going to be available from GeoBase and, in many areas where we have a number
of mappers, we are also able to incorporate data before it is available from GeoBase, or
even available to GeoBase.
What we currently have to do is have people actually look at the data to clean
it up after an import. If we can eliminate that to a large degree, possibly through some
form of automation, we can speed up the import process. But until we get to that point
we are going to need manual intervention. To some degree, however, there are
always going to be times when someone edits and modifies data currently within
OSM, and that is part of the process.
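As one example of what a small automated consistency aid could look like, the sketch below
lists way end points that sit within a few metres of each other without sharing a node, so
that a person can review them rather than hunt for them. It is only an illustration: the
function names, the 5 m default tolerance, and the coordinates are assumptions, ways are
plain lists of (lat, lon) pairs, and the pairwise loop is only suitable for small extracts.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def near_miss_endpoints(ways, tolerance_m=5.0):
    """Return (way_a, way_b, distance_m) for end points that are close but not shared.

    `ways` maps a way id to its list of (lat, lon) points. End points with
    identical coordinates are treated as already connected and skipped.
    """
    ends = [(wid, pt) for wid, pts in ways.items() for pt in (pts[0], pts[-1])]
    hits = []
    for i, (way_a, pt_a) in enumerate(ends):
        for way_b, pt_b in ends[i + 1:]:
            if way_a == way_b or pt_a == pt_b:
                continue
            distance = haversine_m(pt_a, pt_b)
            if distance <= tolerance_m:
                hits.append((way_a, way_b, round(distance, 1)))
    return hits


# Made-up coordinates: "a" ends about 2 m short of where "b" starts,
# while "b" and "c" already share an end point.
sample_ways = {
    "a": [(45.0, -75.0), (45.001, -75.0)],
    "b": [(45.00102, -75.0), (45.002, -75.0)],
    "c": [(45.002, -75.0), (45.003, -75.0)],
}

for way_a, way_b, distance in near_miss_endpoints(sample_ways):
    print(f"ways {way_a} and {way_b} end {distance} m apart")
```

A list like this does not join anything by itself; it only narrows down where someone
needs to look after an import.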
One final question: what is the accuracy of the Geobase data? Is it worse,
comparable, or better than typical GPS tracks? The roads I've drawn so far match
the Yahoo imagery quite well, although usually there is a slight offset of a few
meters.
I have been looking at a lot of the data produced for me by Steve and comparing it to
the Yahoo! imagery. I have also found an offset, although it seems to be variable and
inconsistent. The GeoBase data comes from a number of sources, each with its own
accuracy. Looking at the data, it almost seems that the people who entered it have a
real influence on the quality. I have found some minor curves marked with a lot of
data points while in other cases a more significant bend is marked with only a few. But
then the same occurs within OSM and can also be credited to the style of the contributor.
As for the overall quality of the data from GeoBase, I am happy to incorporate it within
OSM; it is certainly better than a lot I have seen, including from GPS traces. But it
does depend on the source of the data, and OSM users who take care when setting up for
GPS tracks can get close enough to match much of it. If you are
very careful when creating your GPS tracks, and have a good signal and good equipment,
you are going to exceed the quality of some of the GeoBase data, but in some cases you are
only going to match it with your best efforts.
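For anyone who wants to put a number on the offset between a GPS track and a road
geometry, a rough sketch follows. It projects both to local metres and averages the
distance from each track point to the nearest road segment. The function names and the
sample coordinates are assumptions for illustration, and the flat-earth projection is
only reasonable over short distances.

```python
import math

EARTH_RADIUS_M = 6371000.0


def to_local_m(pt, origin):
    """Project (lat, lon) degrees to (x, y) metres around `origin` (equirectangular)."""
    lat, lon = pt
    lat0, lon0 = origin
    x = math.radians(lon - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b, all in planar (x, y) metres."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Clamp the projection so the nearest point stays on the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def mean_offset_m(trace, road):
    """Average distance in metres from each trace point to the nearest road segment."""
    origin = road[0]
    road_xy = [to_local_m(pt, origin) for pt in road]
    distances = []
    for pt in trace:
        p = to_local_m(pt, origin)
        distances.append(min(
            point_segment_distance(p, a, b)
            for a, b in zip(road_xy, road_xy[1:])
        ))
    return sum(distances) / len(distances)


# Made-up data: a straight north-south road and a track shifted 3 m to the east.
shift_deg = math.degrees(3.0 / (EARTH_RADIUS_M * math.cos(math.radians(45.0))))
sample_road = [(45.0, -75.0), (45.001, -75.0)]
sample_trace = [(lat, -75.0 + shift_deg) for lat in (45.0002, 45.0005, 45.0008)]

print(f"mean offset: {mean_offset_m(sample_trace, sample_road):.2f} m")
```

A single average hides a variable offset, so looking at the individual distances along
the road would say more about whether the shift is constant or not.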
Have fun importing the data. And if you can develop best practices that import the
data effectively and accurately, hopefully without overwriting the current OSM data, then
let others know so that we can all incorporate those techniques. We are learning as we
go and most of us are very willing to learn a better method of importing, and
using, the tremendous quantity of data that we have available to us with GeoBase.
We were fortunate to have access to the data and now we are going to have to find a way
to effectively assimilate it within OSM. There are going to be minor issues but hopefully
we can manage them easily enough.
Richard Degelder