[OSM-talk] US City Import?

Beej Jorgensen beej at beej.us
Fri Nov 23 22:58:05 GMT 2007

Hi, all!

On a whim, I wrote a Python script that takes US Census data and USGS 
GNIS place name data, and produces an OSM file of nodes for hamlets, 
villages, towns, and cities.

But there are a few issues (besides the fact that someone else is 
probably doing this).

1. Many of the cities are named like this: "Frobozztown (historical)". 
I just drop these, but they could be included if people wanted them 
(appropriately tagged, somehow).

2. The Census data seems to list only some of the GNIS "populated 
places"; notably, piles of the smaller ones are missing.  I can either 
include them as "hamlet"s, or drop them.  (In California, there are 6300 
GNIS populated places, and there is Census population data for only 520 
of them, presumably the largest.  I haven't tested this to be certain, 
but it looks like it, visually.)

3. Some cities are in the OSM database already.  Data duplication,  blah 
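
A rough sketch of how the three issues above might be handled in Python. 
The population cutoffs, the 5 km matching radius, the dict layout, and 
all function names are assumptions made up for illustration; they are 
not taken from the actual script.

```python
import math

HISTORICAL_SUFFIX = "(historical)"


def is_historical(name):
    """True for GNIS names such as 'Frobozztown (historical)'."""
    return name.strip().endswith(HISTORICAL_SUFFIX)


def place_type(population):
    """Map a Census population to an OSM 'place' value.

    The cutoffs are illustrative assumptions. Places with no Census
    data (population is None) fall back to 'hamlet'.
    """
    if population is None:
        return "hamlet"
    if population >= 100000:
        return "city"
    if population >= 10000:
        return "town"
    return "village"


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, using the haversine formula."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def is_duplicate(candidate, existing, max_km=5.0):
    """True if an existing OSM place has the same name within max_km.

    Both arguments use plain dicts with 'name', 'lat', 'lon' keys
    (a hypothetical layout); max_km is an assumed threshold.
    """
    return any(
        e["name"].lower() == candidate["name"].lower()
        and distance_km(candidate["lat"], candidate["lon"],
                        e["lat"], e["lon"]) <= max_km
        for e in existing
    )
```

With these cutoffs a place with a Census population of 3040 comes out as 
a "village", and anything without Census data becomes a "hamlet".  The 
name-plus-distance check is deliberately crude; it would miss places 
whose OSM name is spelled differently.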

Data sources:



I need to test more, but I've put a couple OSM files up here you can 
pull into JOSM... one of all the places, and one of the census places.


Here's a sample node:

<node id="-3" action="modify" visible="true" lat="41.4226498" 
	<tag k="name" v="Weed" />
	<tag k="place" v="village" />
	<tag k="ele" v="1044" />
	<tag k="import_uuid" v="bb7269ee-502a-5391-8056-e3ce0e66489c" />
	<tag k="gnis:id" v="1652650" />
	<tag k="gnis:Class" v="Populated Place" />
	<tag k="gnis:ST_alpha" v="CA" />
	<tag k="gnis:ST_num" v="06" />
	<tag k="gnis:County" v="Siskiyou" />
	<tag k="gnis:County_num" v="093" />
	<tag k="census:data_avail" v="yes" />
	<tag k="census:date" v="July 1, 2006" />
	<tag k="census:population" v="3040" />

Comments?  Should I have more or less or different tagging information?
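
For anyone who wants to experiment with the tagging, here is a minimal 
sketch of building a node like the sample with Python's standard 
xml.etree.ElementTree module.  The make_node helper, its arguments, and 
the coordinates in the usage lines are hypothetical, not the script's 
real code or data.

```python
import xml.etree.ElementTree as ET


def make_node(node_id, lat, lon, tags):
    """Build a JOSM-style <node> element with one <tag> child per item.

    Negative ids mark objects that do not exist on the server yet.
    """
    node = ET.Element("node", {
        "id": str(node_id),
        "action": "modify",
        "visible": "true",
        "lat": str(lat),
        "lon": str(lon),
    })
    for key, value in tags.items():
        ET.SubElement(node, "tag", {"k": key, "v": str(value)})
    return node


# Made-up place and coordinates, purely for illustration.
node = make_node(-1, 10.5, 20.25, {
    "name": "Frobozztown",
    "place": "hamlet",
    "census:data_avail": "no",
})
print(ET.tostring(node, encoding="unicode"))
```

Going through ElementTree rather than string formatting means names 
containing quotes or ampersands get escaped properly.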

Should I do this thing?  :)

The Census data only has a few hundred cities per state--an import would 
be trivial.  Importing _all_ the populated places for a state would be 
less trivial (since there are an order of magnitude more of them), but 
only in the time-consumption department.

