[Talk-ca] OSM Geobase import: giving a try

Frank Steggink steggink at steggink.org
Tue Feb 17 02:31:32 GMT 2009


Hi Richard, Sam, Steve,

Thanks for your feedback. I'll continue using PostGIS (good for my own skills as 
well), and look at the other suggestions you've made. I'll also leave the NIDs 
on the data which I will eventually upload. They don't hurt, although there is 
the theoretical possibility that due to subsequent edits they get shifted 
(splitting and joining).

I'll also investigate PostGIS buffering a bit more. A buffer of 10 to 20 meters
should be enough. Roads which have been shifted by 100 meters need closer
investigation anyway. If they turn out to be drawn from low-res imagery, they
are better replaced by the Geobase data. Hopefully the buffering won't cause
data to be removed accidentally. For areas where the OSM density is quite low
(as in my test areas) it is no big deal to remove duplicate Geobase roads
manually. Better to be safe than sorry. How would RoadMatcher deal with a case
where two roads are aligned, but one of them has been shifted?
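
To make the buffering idea concrete, here is a minimal sketch in plain Python rather than PostGIS. It assumes planar coordinates in meters, and the function names and the 15 meter default are hypothetical, chosen only for illustration. It flags a Geobase road as a probable duplicate when all of its vertices lie within the buffer distance of an existing OSM road. Checking only the vertices is a simplification; in PostGIS a test based on ST_Buffer and ST_Within on the full geometries would be the closer equivalent.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates, meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def point_line_distance(p, line):
    """Smallest distance from point p to any segment of a polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))

def is_probable_duplicate(candidate, existing, buffer_m=15.0):
    """True if every vertex of candidate lies within buffer_m of existing.

    Simplification: only vertices are tested, not the full geometry.
    """
    return all(point_line_distance(p, existing) <= buffer_m for p in candidate)

# Hypothetical test data: an OSM road and two Geobase candidates.
osm_road = [(0.0, 0.0), (100.0, 0.0)]
geobase_close = [(0.0, 8.0), (100.0, 12.0)]       # roughly the same road
geobase_shifted = [(0.0, 100.0), (100.0, 100.0)]  # shifted by 100 meters

print(is_probable_duplicate(geobase_close, osm_road))
print(is_probable_duplicate(geobase_shifted, osm_road))
```

With a check like this, a road shifted by 100 meters is not matched, so it would be left for the manual investigation described above instead of being dropped silently.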

With regard to this, I also think a manual step should always take place. (Maybe
even to evaluate discarded data.) We only have to make sure that it covers just
those parts which can't be done automatically, so that the amount of time spent
on it is as small as possible. Our brains can still make better judgments than
the computer. I don't think anyone has algorithms sophisticated enough to
eliminate human input from the process ;)

I hope to really start with the Geobase import soon. I came across the Global
Administrative Areas (GADM) database, which is part of the BioGeo project
listed among the potential data sources. I've looked more closely at the data,
and exchanged some ideas about importing it as administrative boundaries with
the guys on #osm-nl. (Sorry, sometimes it's necessary to chat in my native
language ;)) I also inquired about the use of the data by OSM, because of
their copyright terms (CC-BY-NC-SA 3.0), and because they aggregated data from
various other sources.

Regards,

Frank



