[Talk-us] Tidying up TIGER data

Dave Hansen dave at sr71.net
Thu Jun 4 18:50:18 BST 2009


On Thu, 2009-06-04 at 00:36 -0600, Ted Percival wrote:
> Its functions are:
> - Strip "St" suffix from grid-named streets (eg. "South 500 West")
> - Collapse multiple spaces into a single space (lots of TIGER)
> - Expand abbreviated directions (eg. "S 500 E" to "South 500 East")
> - Expand abbreviated suffixes ("Rd" -> "Road", "St" -> "Street", etc)

So, I looked at doing this when I originally converted the TIGER data.
The issue is that I'm too dumb to come up with anything that worked
universally across the entire country.

This kind of script is useful for small areas that you've looked at
manually, but please don't apply it too widely.  It does the right
thing for sanely named streets, but TIGER is full of goofy stuff.

Consider: "St. Helens St.", where the first "St." means "Saint" and only
the second means "Street".  TIGER also has plenty of semi-mistakes, and
weird abbreviations that look like mistakes.  I wouldn't be surprised to
see "Saint Street" entered somewhere as

	name: "St."
	type: "St."

We don't want to make that "Street Street".  That makes it even
worse. :)

Again, these can work in limited areas where the naming is nice and
consistent, but it's really, really hard to make it work on a large scale
where things are *NOT* consistent.
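
To make the caution concrete, here is a minimal Python sketch of a
conservative version of the cleanup.  It is not the script under
discussion: the tidy_name function and the tiny lookup tables are made up
for illustration, and it only covers the whitespace, direction and suffix
steps.  The idea is to expand only the safe cases and flag anything
ambiguous for a human instead of guessing.

```python
# Hypothetical, deliberately tiny lookup tables for illustration only.
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}
SUFFIXES = {"Rd": "Road", "St": "Street", "Ave": "Avenue"}


def tidy_name(name):
    """Return (tidied_name, needs_review) for one street name."""
    tokens = name.split()  # splitting on whitespace also collapses runs of spaces
    if not tokens:
        return "", False
    bare = [t.rstrip(".") for t in tokens]

    # Refuse to guess when a suffix abbreviation could mean something else:
    # it stands alone ("St.") or appears before the last word ("St. Helens St.").
    lone_abbrev = len(tokens) == 1 and bare[0] in SUFFIXES
    inner_abbrev = any(b in SUFFIXES for b in bare[:-1])
    if lone_abbrev or inner_abbrev:
        return " ".join(tokens), True

    # Expand a suffix abbreviation only in the last position.
    if bare[-1] in SUFFIXES:
        tokens[-1] = SUFFIXES[bare[-1]]

    # Expand leading/trailing directions only when at least three words
    # are present, so that "E St" becomes "E Street", not "East Street".
    if len(tokens) >= 3:
        for i in (0, -1):
            if bare[i] in DIRECTIONS:
                tokens[i] = DIRECTIONS[bare[i]]

    return " ".join(tokens), False
```

With these assumptions, "S  500 E" comes out as "South 500 East", while
"St. Helens St." and "St. St." are returned unchanged and flagged for
review.  Even this will get things wrong somewhere, so it is still only
suitable for areas you have checked by hand.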

-- Dave




