[Talk-ca] Geobase NRN import script

Jason Reid osm at bowvalleytechnologies.com
Fri Jan 2 00:28:54 GMT 2009


James Ewen wrote:
> On Thu, Jan 1, 2009 at 4:17 PM, Jason Reid
> <osm at bowvalleytechnologies.com> wrote:
>
>   
>> The script will generate a single OSM file per GML file, so 1 per
>> province currently.
>>     
>
> Any plans on how to merge the OSM and GeoBase files into one database?
>
>   
>> The mapping between the Geobase tags and OSM tags is still a work in
>> progress. It roughly follows what's on the wiki, both on the Geobase
>> import page and in the Canadian tagging guidelines. This is the area
>> of the script that still needs the most refinement.
>>     
>
> I think this is going to be the single biggest problem once we have
> the database merged. Do we stick with what GeoBase has the roads
> tagged as, or do we change the tags to meet what the OSM descriptions
> are? I see in the script that you are going to be overriding the
> GeoBase tags simply based on highway numbers.
>
>  # Primary (5-100)
>               if ref >= 5 and ref <= 100:
>                 self.tags['highway'] = 'primary'
>
> # Secondary (500-899, 901, 940)
>               if ref >= 500 and ref < 900:
>                 self.tags['highway'] = 'secondary'
>
>  # Tag Transcanada/Yellowhead as trunk
>               if ref in (1, 2, 3, 4, 16, 35, 43, 49, 201, 216):
>                 self.tags['highway'] = 'trunk'
>
> Portions of Highways 2 and 16 near Edmonton meet the description of
> Motorway. If we change the attribute to reflect the fact that the
> highway is a restricted access major highway with access ramps, how is
> that going to affect future update imports?
>
> I've looked at the NRN, and seen discrepancies between what is in the
> database, and what exists on the face of the earth. It's going to be
> interesting to see how much the GeoBase database gets modified by OSM
> users to fit their idea of what's out there.
>
> That being said, I don't think this should stop the progress being made.
>
>   
>> You can see some of the initial converted data rendered on the map at
>> http://openstreetmap.ca/map/. This map will be updated periodically as
>> things progress, as a test bed to make sure that things are working. So
>> not all provinces are there yet.
>>     
>
> What I see looks good! I've poked around southwestern Manitoba
> checking out the area where my Dad grew up, and can identify the towns
> just by the highway grid, and the roads in town. Add in hydrography,
> some railway and placenames and it's starting to look like a real map
> that you'd have to pay money for! The road import is looking really
> good. I can't see the attribute tags on that site, but it's a visual
> treat to see all that data.
>
> Can you slice the GML file up into chunks? We could start the import
> process by manually defining the areas to load first: areas, chosen
> by users, that have little or no existing data.
>
> James
> VE6SRV
>
> _______________________________________________
> Talk-ca mailing list
> Talk-ca at openstreetmap.org
> http://lists.openstreetmap.org/listinfo/talk-ca
>   
In terms of breaking the upload into smaller pieces, we can slice either
the GML file or the OSM file (using osmosis). Both are about the same in
terms of complexity. Either way there will still be an issue with
duplicate nodes and/or ways wherever a boundary is crossed, which is why
I initially wanted to see if we could break it up into smaller
jurisdictions using the data in Geobase. I believe the TIGER import had
this same issue with duplicates along the 'seams'. Worst case, if we
split it up into blocks of a certain size, it wouldn't be too hard to
tell it to import the less populated areas first.
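
As a rough illustration of the block idea, here is a minimal Python
sketch. It is not part of the import script: the block size, the
function names, and the use of "existing OSM nodes per block" as a
stand-in for how populated an area is are all assumptions made for the
example. It orders blocks so the emptiest ones come first, and flags
ways that span more than one block, since those are the ones that would
be duplicated along a seam.

```python
import math
from collections import Counter

TILE_DEG = 0.5  # assumed block size in degrees; not from the script


def tile_of(lat, lon, size=TILE_DEG):
    """Return the (row, col) index of the grid block containing a point."""
    return (math.floor(lat / size), math.floor(lon / size))


def import_order(geobase_nodes, existing_osm_nodes, size=TILE_DEG):
    """List blocks containing Geobase data, least existing OSM data first."""
    existing = Counter(tile_of(lat, lon, size)
                       for lat, lon in existing_osm_nodes)
    tiles = {tile_of(lat, lon, size) for lat, lon in geobase_nodes}
    # Ties are broken by block index so the order is repeatable.
    return sorted(tiles, key=lambda t: (existing[t], t))


def crosses_seam(way_coords, size=TILE_DEG):
    """True if a way's nodes fall in more than one block."""
    return len({tile_of(lat, lon, size) for lat, lon in way_coords}) > 1
```

Ways for which crosses_seam() is true could be held back and uploaded
once, after the blocks on both sides are done, rather than once per
block.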

In terms of actually adding the data into OSM, the only real solution is
to use the API and the bulk_upload tool (or a tool with equivalent
functionality). The upload will take quite a bit of time to run in full,
even if we had complete data for all the provinces when we began. There's
no way that we can simply push a button and have it all in the main OSM
database. Depending on when the developers working on the new API think
it will be ready, it may even be worth holding off on the import until it
is in place, so we can take advantage of its new versioning
functionality. I'd personally recommend waiting, simply because I dislike
the thought of having to either re-run the export after we've already
imported some of the data, or convert the raw data files.

The tagging will be the single largest point of contention, not only for
the upload but for applying future updates as well. Take your example of
Highway 2. If we upload it as it is in Geobase (which works out to be a
primary highway, I believe), and you then correct it to what you think it
should be (for instance, highway=motorway), and then the update script
comes along and sets it back to primary, there's the potential for a bit
of disagreement. There are a lot of other cases I've already seen where
the tagging will be modified. In some cases it's where the tagging
guidelines are tagging for the renderer, or where the OSM
classifications just don't fit the real world. Take the part of Highway
40 north from Highway 3, for instance: I believe Geobase has it as an
Expressway / Highway, but in reality it's a two-lane gravel road winding
through the mountains, which in OSM is also tagged as highway=primary.
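
One way an update pass could avoid that tug-of-war is a three-way
comparison per tag. The sketch below is hypothetical: nothing like it
exists in the script today, the function name is made up, and it assumes
we keep a record of the value each tag had when it was last imported.

```python
def merge_tag(last_imported, current_osm, new_geobase):
    """Three-way merge for one tag value.

    Returns (value_to_keep, needs_review).
    """
    if current_osm == last_imported:
        # Nobody has edited the tag since the import: take the update.
        return new_geobase, False
    if new_geobase == last_imported or new_geobase == current_osm:
        # Only a mapper changed it, or both sides now agree: keep OSM.
        return current_osm, False
    # Both sides changed it, differently: keep the mapper's edit and
    # flag it for a human to look at.
    return current_osm, True
```

With the Highway 2 example, merge_tag('primary', 'motorway', 'primary')
keeps 'motorway' instead of setting it back to primary.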

-Jason Reid