[Talk-ca] Merging osm + geobase status
William Lachance
wrlach at gmail.com
Thu Apr 9 18:36:40 BST 2009
Hey all,
I haven't had as much time to work on this as I thought I would, but I
thought I'd give a quick update on the status of this project anyway,
since I have made some progress in the few hours I've spent on it.
I figured that the initial step in comparing the OSM and geobase
datasets would be to split the OSM data along all junctions (so that it
represents geometry in roughly the same way as the geobase data), then
see how much deviation there is between the endpoints of ways in both
sets (by trying to find the way which corresponds most closely with
each geobase one). So that's what I did. You can find the script which
does the comparison between two OpenStreetMap files attached below.
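For readers without the attachment, the two steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached compare.py: the function names, the dict-of-node-lists layout for ways, and the use of summed great-circle distance between endpoints as the "deviation" measure are all choices made for this example.

```python
import math
from collections import Counter

def split_ways_at_junctions(ways):
    """Split each way wherever an interior node is referenced more than once.

    `ways` maps a way id to its ordered list of node ids (assumed layout).
    Returns a list of (way_id, node_id_list) segments.
    """
    # A node referenced more than once across all ways is treated as a junction.
    usage = Counter(node for nodes in ways.values() for node in nodes)
    segments = []
    for way_id, nodes in ways.items():
        if len(nodes) < 2:
            continue
        current = [nodes[0]]
        for i, node in enumerate(nodes[1:], start=1):
            current.append(node)
            # Cut at interior junctions; the final node closes the last segment.
            if usage[node] > 1 and i < len(nodes) - 1:
                segments.append((way_id, current))
                current = [node]
        segments.append((way_id, current))
    return segments

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def endpoint_deviation(seg_a, seg_b):
    """Sum of endpoint distances, taking the better of the two orientations."""
    forward = haversine_m(seg_a[0], seg_b[0]) + haversine_m(seg_a[-1], seg_b[-1])
    reverse = haversine_m(seg_a[0], seg_b[-1]) + haversine_m(seg_a[-1], seg_b[0])
    return min(forward, reverse)

def closest_segment(target, candidates):
    """Return the candidate segment whose endpoints deviate least from target."""
    return min(candidates, key=lambda c: endpoint_deviation(target, c))
```

Here each segment passed to `endpoint_deviation` is a list of (lat, lon) tuples. A real comparison would probably also want a distance cutoff, so that a geobase way with no plausible OSM counterpart is reported as unmatched rather than paired with a far-away way.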
The results so far have been pretty interesting. I've been doing a small
test with my neighbourhood in Halifax (the north end) which has very
good crowdsourced coverage. It looks like there's a match between the
geobase ways and the osm ones in at least a majority of cases, which
indicates to me that I'm at least doing something right. :)
I've become convinced that actually sorting out which bits of the data
should be kept is basically an AI-complete problem. That is, some human
intervention will be required to figure out the best strategy for
merging the two data sets. I believe I should be able to create a fairly
decent UI for this using the PyGTK and PyCairo libraries.
I do understand that this might produce inconsistent results depending
on the quality of the GeoBase data and the human assisting with the
import process, but the nice thing about OpenStreetMap data is that we
can take an iterative approach. We can revisit the same data many times
using the script as it improves in quality.
My first project will be to create a user-assisted way of merging the
geobase tags with the OSM ones (e.g. using the geobase name of the
street if it's better, getting the lane count) in the case where there's
a match between the data.
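As a rough idea of what such a user-assisted tag merge could look like underneath the UI, here is a minimal sketch. The function name `propose_tag_merge` and the sample tag values are hypothetical, and the policy shown (copy tags that only geobase has, flag disagreements for a human to decide) is just one possible choice, not a settled design.

```python
def propose_tag_merge(osm_tags, geobase_tags):
    """Propose merged tags for a matched pair of ways.

    Tags present only in geobase are added; tags whose values differ are
    left at the OSM value and returned separately for a human to resolve.
    """
    merged = dict(osm_tags)
    conflicts = {}
    for key, geobase_value in geobase_tags.items():
        if key not in osm_tags:
            # New information, e.g. a lane count that OSM lacks.
            merged[key] = geobase_value
        elif osm_tags[key] != geobase_value:
            # Disagreement, e.g. two spellings of a street name.
            conflicts[key] = (osm_tags[key], geobase_value)
    return merged, conflicts

# Made-up sample values for illustration only.
osm = {"highway": "residential", "name": "Agricola St"}
geobase = {"name": "Agricola Street", "lanes": "2"}
merged, conflicts = propose_tag_merge(osm, geobase)
```

In this sample the lane count is added automatically, while the two spellings of the name end up in `conflicts` so the person doing the import can pick the better one.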
After that, I'm going to experiment with creating a UI to assist with
grafting new geobase information into the map, as we discussed on the
list earlier.
While I'm working on this, I'll be sending sample results to the list,
so people can comment on my strategy and approach.
--
William Lachance <wrlach at gmail.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compare.py
Type: text/x-python
Size: 6009 bytes
Desc: not available
URL: <http://lists.openstreetmap.org/pipermail/talk-ca/attachments/20090409/7b5b0277/attachment.py>