[Talk-us] Proposed automated motorway_link mass edit

Frederik Ramm frederik at remote.org
Fri Jul 17 10:24:54 BST 2009


Dave,

Dave Hansen wrote:
> Can you share some of the scripts and methods you used for this?  

I can, but I had somewhat hoped to keep the ugly bits under the carpet. I 
very much followed the age-old software development method of "muddling 
through" ;-)

So:

1. Create US extract from current planet file using Osmosis and a proper 
US polygon file.

2. Use "osmcut" (C program, from SVN) to split that US extract into nice 
square chunks (of 1x1 degree in my case) to make them easier to 
handle. We're doing a local analysis so this is no problem. The program 
writes unsorted output so sort that again using Osmosis. (If one already 
had smaller excerpts, e.g. something downloaded from the API or cut out 
of the planet, that could be used as well.)

3. Run a Perl script on the individual chunks that loads the ways 
section and does all the magic motorway_link analysis. The output of the 
Perl script is a primitive text file that contains lines like

change way 1234 from motorway_link to residential

I'll make the Perl script available for download when it works properly.

4. For the web report, run another quick Perl script that greps the way 
IDs out of those output files, downloads them from the API (writes one 
file for each way), and outputs them in the proper County/State category 
(lazy boy that I am, I simply take the county info from the first node 
in the way).

5. For the automated edit, again grep the change commands from step 3, 
modify the ways from step 4 accordingly, and update with the API. I'll 
make a separate mini script for this.
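
A minimal sketch of how the step 3 output could be consumed in step 5. It is 
written in Python rather than Perl, and it is not the actual script. Only the 
line format comes from the example in step 3. The function names and the 
"skip if the tag has changed" check are assumptions made for illustration.

```python
import re

# Line format taken from the example in step 3; everything else is assumed.
CHANGE_RE = re.compile(r"^change way (\d+) from (\S+) to (\S+)$")


def parse_change(line):
    """Return (way_id, old_value, new_value), or None for any other line."""
    m = CHANGE_RE.match(line.strip())
    if m is None:
        return None
    return int(m.group(1)), m.group(2), m.group(3)


def apply_change(tags, old_value, new_value):
    """Return a retagged copy of a way's tags, or None if the way no longer
    carries the highway value the analysis saw."""
    if tags.get("highway") != old_value:
        return None  # way was edited since the analysis; leave it alone
    updated = dict(tags)
    updated["highway"] = new_value
    return updated
```

Checking the current highway value before retagging means a way that someone 
has already fixed by hand is not overwritten. Uploading the result to the API 
is deliberately left out here.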

Note that while I do have a "Fixbot" framework that I sometimes use to 
automatically fix things, this was too special a case and so I decided 
to handle it differently.

> I do think that some automated fix-bots would be nice to have run
> periodically for cleanups like this that are pretty easy to verify. 

I would advise against running this periodically, but it's your choice of 
course. Once the initial bulk is fixed, the small number of errors that 
may be introduced by people can also be fixed by people. And who knows, 
maybe someone actually *wants* a bit of motorway_link to connect two 
primary roads for whatever highly specific reason...

> I'd also like to look into some gluing back together of the TIGER counties
> and I wonder how suitable this would be.

My gut feeling is that this calls for a semi-automated process in which 
a script suggests certain changes but humans still have to confirm them 
individually. This will be technically possible when third-party web 
applications can make API changes in the name of others by using OAuth 
(soon to be deployed on osm.org). Not something you'll write over a 
weekend, that's for sure!

Bye
Frederik



