[Talk-us] TIGER expansion bot

Alex Barth alex at mapbox.com
Tue Nov 27 19:45:07 GMT 2012


Serge -

Awesome effort, thanks for starting this.

- You are planning to run this script on all TIGER data in OSM, i.e. the entire US, correct?
- Are you envisioning a (geographically) limited first run followed by a review phase? If so, what will this look like?
- Will any highway data that has been modified post-import be touched? I'm trying to ask the same question as Brian May did with the '"Main St SW" to "SW Main Street"' example.

On Nov 26, 2012, at 9:29 PM, Serge Wroclawski <emacsen at gmail.com> wrote:

> Hello all,
> 
> After the OSM US Google Hangout two weeks ago, there was talk of
> bringing back the effort I started six months ago to create a TIGER
> expansion bot to run against the roads in the US.
> 
> I've brushed off the code and made several improvements to it (more on
> this later in the mail).
> 
> In order to facilitate community involvement, I've talked with the OSM
> US board and we're going to have a process by which the code is
> officially vetted.
> 
> That process begins with this email. I'm making the most recent
> version of the code available at:
> 
> https://github.com/emacsen/tiger-expansion
> (there was a URL for the previous version, but this is where the
> current, up to date code will live).
> 
> I encourage people to review the code.
> 
> In addition, on Thursday, November 29th at 8pm EST on Google Plus,
> we'll have another public hangout where I'll do a code walkthrough.
> This will be an opportunity for people to bring up questions or
> concerns they have about specific code issues.
> 
> From there, barring any major issues, I'll send a followup email to
> this email where I'll make a final request for comment. This will be
> for specific code issues, and people are encouraged to send in any
> specific code related issues, and we'll have that review period open
> for one week.
> 
> After that, the code will be executed, and that execution period will
> probably be several days, as I'll be manually supervising the
> execution myself.
> 
> In anticipation of the code walkthrough on Thursday, I'll give a high
> level overview of the code, as well as the changes from the version
> six months ago.
> 
> The code is written in Python, and it uses a simple XML parser to
> parse OSM XML. I have a simple framework for handling this in the
> pyxbot.py file, which handles the parsing and selection process.
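
A minimal sketch of the parsing step described above, assuming the standard library's ElementTree rather than whatever parser pyxbot.py actually uses. The function name `iter_ways` and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def iter_ways(source):
    """Yield (way_id, tags) for every <way> element in an OSM XML stream."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "way":
            # Collect the <tag k="..." v="..."/> children into a plain dict.
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            yield elem.get("id"), tags
            elem.clear()  # release child elements once the way is handled


# Hypothetical two-way extract, only for illustration.
SAMPLE = b"""<osm version="0.6">
  <way id="1"><nd ref="10"/><tag k="highway" v="residential"/><tag k="name" v="Main St"/></way>
  <way id="2"><nd ref="11"/><tag k="building" v="yes"/></way>
</osm>"""

ways = list(iter_ways(io.BytesIO(SAMPLE)))
```

Streaming with `iterparse` means ways are handled one at a time as they are read, which matters for large extracts.
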
> 
> The tiger.py file contains TIGER specific expansion code, and the
> selection process is quite simple. The selector looks for ways which
> have a "highway" key and a "name" key present in their tags.
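
The selection rule above can be sketched as a single predicate, assuming a way's tags are held in a dict. `is_candidate` is a hypothetical name, not taken from tiger.py.

```python
def is_candidate(tags):
    """Return True for ways that carry both a "highway" and a "name" key."""
    return "highway" in tags and "name" in tags


# Hypothetical tag sets, only for illustration.
named_road = {"highway": "residential", "name": "Main St SW"}
unnamed_road = {"highway": "service"}
named_building = {"building": "yes", "name": "City Hall"}
```

Note that this checks only for the presence of the keys, as the description says. It does not look at whether the way still carries any tiger:* tags.
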
> 
> The selected ways then go through a transformation, which looks for
> name, name_1, name_2, etc. and for the corresponding tiger tags
> (tiger:name_base, etc.). It then pieces apart the name from the
> existing name and reconstructs it using the expanded tiger tags. If
> the new name is different, then it is stored.
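
One possible reading of the transformation above, as a rough sketch. Everything here is an assumption: the two lookup tables are tiny stand-ins for the real expansion table, `expand_name` is a hypothetical name, and the tag keys other than tiger:name_base (tiger:name_type, tiger:name_direction_prefix, tiger:name_direction_suffix, and their _1, _2 variants) are assumed from the usual TIGER import tagging. The mismatch check is also an assumption about how hand-edited names might be kept out of the automatic path.

```python
# Hypothetical, heavily abbreviated tables; not the bot's real expansion table.
TYPE_EXPANSIONS = {"St": "Street", "Ave": "Avenue", "Rd": "Road"}
DIRECTION_EXPANSIONS = {
    "N": "North", "S": "South", "E": "East", "W": "West",
    "NE": "Northeast", "NW": "Northwest", "SE": "Southeast", "SW": "Southwest",
}


def expand_name(tags, suffix=""):
    """Return the expanded form of name{suffix}, or None if it cannot be rebuilt."""
    name = tags.get("name" + suffix)
    base = tags.get("tiger:name_base" + suffix)
    if name is None or base is None:
        return None
    prefix = tags.get("tiger:name_direction_prefix" + suffix, "")
    road_type = tags.get("tiger:name_type" + suffix, "")
    dir_suffix = tags.get("tiger:name_direction_suffix" + suffix, "")
    # Rebuild the abbreviated name from the TIGER parts. If it no longer
    # matches the current name, assume the way was edited after import.
    abbreviated = " ".join(p for p in (prefix, base, road_type, dir_suffix) if p)
    if abbreviated != name:
        return None
    parts = (
        DIRECTION_EXPANSIONS.get(prefix, prefix),
        base,
        TYPE_EXPANSIONS.get(road_type, road_type),
        DIRECTION_EXPANSIONS.get(dir_suffix, dir_suffix),
    )
    return " ".join(p for p in parts if p)


untouched = {
    "name": "Main St SW",
    "tiger:name_base": "Main",
    "tiger:name_type": "St",
    "tiger:name_direction_suffix": "SW",
}
hand_edited = dict(untouched, name="SW Main Street")
alternate = {"name_1": "Oak Ave", "tiger:name_base_1": "Oak", "tiger:name_type_1": "Ave"}
```

Under these assumptions, a name that someone has already rearranged by hand (the "SW Main Street" case) yields None and would go to review rather than being rewritten.
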
> 
> If the name is already properly expanded, then the way is ignored, but
> if there's a problem with the tag expansion, then that way's information
> is stored elsewhere for review.
> 
> The review file (a CSV file) contains information about all the ways
> that didn't process properly, such as the way ID, the (primary) name,
> and the reason for the failure.
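
A small sketch of what writing such a review file could look like with the standard csv module. The column names, the function name `write_review_rows`, and the sample failure are all hypothetical; the actual file layout may differ.

```python
import csv
import io


def write_review_rows(out, failures):
    """Write one CSV row per failed way: way ID, primary name, failure reason."""
    writer = csv.writer(out)
    writer.writerow(["way_id", "name", "reason"])
    for way_id, name, reason in failures:
        writer.writerow([way_id, name, reason])


# An in-memory buffer stands in for the review file in this illustration.
buffer = io.StringIO()
write_review_rows(buffer, [("123", "Main St SW", "name does not match tiger tags")])
review_text = buffer.getvalue()
```

Keeping the way ID in the first column makes it straightforward to turn each row into a link for manual review.
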
> 
> This file can then be reviewed later, or fed into a MapRoulette
> challenge [1].
> 
> Now, for those folks who looked at the code six months ago, these are
> the major changes:
> 
> 1. I've expanded the expansion table quite a bit, through extensive testing.
> 
> 2. I've added the review file functionality
> 
> 3. I've added name_1, etc. functionality.
> 
> 4. The code is more modular than it was
> 
> 5. The code is easier to run from the command line
> 
> 
> So, the code is out there. If you have technical questions, I'll go
> into more depth Thursday.
> 
> - Serge
> 
> [1] I'm hacking on the MapRoulette code to make it easier to add new
> challenges, such as this.
> 
> _______________________________________________
> Talk-us mailing list
> Talk-us at openstreetmap.org
> http://lists.openstreetmap.org/listinfo/talk-us

Alex Barth
http://twitter.com/lxbarth
tel (+1) 202 250 3633
