[Talk-GB] OS Locator / OSM correspondence list generation

Tim François sk1ppy14 at yahoo.co.uk
Thu May 13 19:58:27 BST 2010


Robert,

Looks super interesting. I've been trying to do something similar, but for local areas rather than GB as a whole, to make it usable for a single mapper. See http://wiki.openstreetmap.org/wiki/Bath/OSLocator_Comparison - the method is towards the bottom of the page. It's certainly not as clever as yours, but does a lot of the same things (spelling matching, removal of punctuation, expanding abbreviations etc).
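
For illustration only, the normalisation steps listed above (punctuation removal, abbreviation expansion, then a spelling comparison) might look roughly like the sketch below. The abbreviation table, the function names and the choice of difflib are assumptions made for this example, not the method described on the wiki page.

```python
import difflib
import re

# Hypothetical abbreviation table; a real one would need to be much longer.
ABBREVIATIONS = {
    "rd": "road",
    "st": "street",
    "ave": "avenue",
    "ln": "lane",
    "cl": "close",
}


def normalise(name):
    """Lower-case a street name, strip punctuation, expand a trailing abbreviation."""
    name = name.lower()
    name = re.sub(r"[^\w\s]", "", name)  # drop punctuation such as . and '
    words = name.split()
    # Only the last word is expanded, so "St Johns Rd" keeps its leading "st"
    # (Saint) and does not turn into "street johns road".
    if words and words[-1] in ABBREVIATIONS:
        words[-1] = ABBREVIATIONS[words[-1]]
    return " ".join(words)


def similarity(a, b):
    """Similarity ratio (0.0 to 1.0) between two names after normalisation."""
    return difflib.SequenceMatcher(None, normalise(a), normalise(b)).ratio()
```

With this sketch, "High St" and "High Street" normalise to the same string, so only genuine spelling differences lower the score.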

As for releasing the data to the rest of the world, I output a KML file of the waypoints and use OpenLayers to plot points over the places where there are name discrepancies. Example: http://osm.tiiiim.com/bath/os_locator/. It's certainly not perfect: for one thing, only I can change the KML file. The table on the wiki page is user-editable, but is just boring!
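
As a rough sketch of the KML step, assuming each discrepancy is held as a (name, description, lat, lon) tuple (a hypothetical layout, not necessarily what the Bath page uses), the file could be generated like this:

```python
from xml.sax.saxutils import escape


def discrepancies_to_kml(points):
    """Build a KML document string from (name, description, lat, lon) tuples."""
    placemarks = []
    for name, description, lat, lon in points:
        # KML writes coordinates as longitude,latitude (in that order).
        placemarks.append(
            "    <Placemark>\n"
            f"      <name>{escape(name)}</name>\n"
            f"      <description>{escape(description)}</description>\n"
            f"      <Point><coordinates>{lon},{lat}</coordinates></Point>\n"
            "    </Placemark>\n"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Document>\n"
        + "".join(placemarks)
        + "  </Document>\n"
        "</kml>\n"
    )
```

The names and descriptions are XML-escaped because street names can contain characters such as "&".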

Also, there's this: http://wiki.openstreetmap.org/wiki/User:SK53/OS_OpenData#OS_Locator.

I'd suggest copy/pasting your blog post into the wiki once the code is in a releasable state - I'm excited to see the results!

Tim

--- On Thu, 13/5/10, Robert Scott <lists at humanleg.org.uk> wrote:

From: Robert Scott <lists at humanleg.org.uk>
Subject: [Talk-GB] OS Locator / OSM correspondence list generation
To: talk-gb at openstreetmap.org
Date: Thursday, 13 May, 2010, 17:23

Hi all,

I've been running some countrywide comparisons of the recently released OS Locator against the streets in OSM, using fuzzy string matching and the supplied bounding boxes to attempt to match each street in each dataset to one in the other. It's worked pretty well for most areas I tested. Of the ~826k named streets in OS Locator, about 424k have near-perfect matches in OSM. A few tens of thousands more have what I would call spelling 'disagreements'. The rest have bad matches or no match at all.
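
To give a flavour of what bounding-box-limited fuzzy matching can look like, here is a minimal sketch. It is not the code behind these results: the data layout, the margin, the thresholds and the use of difflib are all assumptions made for the example, and coordinates are assumed to be in one shared projected system measured in metres.

```python
import difflib


def in_bbox(point, bbox, margin=0.0):
    """True if (x, y) lies inside (min_x, min_y, max_x, max_y), grown by margin."""
    x, y = point
    min_x, min_y, max_x, max_y = bbox
    return (min_x - margin <= x <= max_x + margin
            and min_y - margin <= y <= max_y + margin)


def best_match(locator_name, bbox, osm_streets, margin=50.0):
    """Return (name, score) for the closest-spelled OSM street near the bbox.

    osm_streets is an iterable of (name, (x, y)) pairs (hypothetical layout).
    Returns (None, 0.0) when no OSM street falls inside the box.
    """
    best = (None, 0.0)
    for name, point in osm_streets:
        if not in_bbox(point, bbox, margin):
            continue  # too far away to be the same street
        score = difflib.SequenceMatcher(
            None, locator_name.lower(), name.lower()).ratio()
        if score > best[1]:
            best = (name, score)
    return best


def classify(score, perfect=0.95, disagreement=0.8):
    """Bucket a score; both thresholds are illustrative guesses."""
    if score >= perfect:
        return "match"
    if score >= disagreement:
        return "spelling disagreement"
    return "bad or no match"
```

Restricting candidates to the bounding box first keeps a common name like "Station Road" from matching a street of the same name in another town.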

I've put a description of the technique up here along with the preliminary results:

http://humanleg.org.uk/code/oslmusicalchairs

The thing I really need is suggestions for getting this data to users in a way that's practical to work with. It's a CSV currently.

Thoughts welcome. So are bug reports of where my matching algorithm has gotten things wrong.


robert.

