[OSM-talk] US City Import?
Beej Jorgensen
beej at beej.us
Fri Nov 23 22:58:05 GMT 2007
Hi, all!
On a whim, I wrote a Python script that takes US Census data and USGS
GNIS place name data, and produces an OSM file of points with hamlets,
villages, towns, and cities.
But there are a few issues (besides the fact that someone else is
probably doing this).
1. Many of the cities are named like this: "Frobozztown (historical)".
I just drop these, but they could be included if people wanted them
(appropriately tagged, somehow).
2. The Census data seems to list only some of the GNIS "populated
places"; notably, piles of the smaller ones are missing. I can either
include them as "hamlet"s, or drop them. (In California, there are 6300
GNIS populated places, and there is Census population data for 520 of
them, presumably the largest. I haven't tested this to be certain, but
it looks like it, visually.)
3. Some cities are in the OSM database already. Data duplication, blah
blah.
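As a rough illustration of the three issues above, here is a minimal
sketch of how the filtering could work. It is not the actual script:
the function names, the record layout (dicts with "name", "lat", "lon"),
the population thresholds, and the 5 km duplicate radius are all
assumptions made for the example. Keying the Census figures by name
alone is also a simplification, since place names repeat across states.

```python
import math

HISTORICAL_SUFFIX = " (historical)"  # GNIS marker seen in names like "Frobozztown (historical)"


def is_historical(name):
    """True for names carrying the "(historical)" marker (issue 1)."""
    return name.endswith(HISTORICAL_SUFFIX)


def classify_place(population):
    """Map a population to an OSM place value.

    The thresholds are illustrative assumptions, not taken from the
    original script. Places without Census data become hamlets (issue 2).
    """
    if population is None:
        return "hamlet"
    if population >= 100000:
        return "city"
    if population >= 10000:
        return "town"
    return "village"


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (haversine, mean Earth radius 6371 km)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def is_duplicate(entry, existing, max_km=5.0):
    """True if an existing node has the same name within max_km (issue 3)."""
    return any(
        other["name"] == entry["name"]
        and distance_km(entry["lat"], entry["lon"], other["lat"], other["lon"]) <= max_km
        for other in existing
    )


def select_places(gnis_places, census_population, existing=(), keep_uncounted=True):
    """Return the GNIS records to import, each with a "place" value added."""
    selected = []
    for entry in gnis_places:
        if is_historical(entry["name"]):
            continue  # issue 1: drop historical places
        population = census_population.get(entry["name"])
        if population is None and not keep_uncounted:
            continue  # issue 2: optionally drop places with no Census data
        if is_duplicate(entry, existing):
            continue  # issue 3: already in OSM
        selected.append({**entry, "place": classify_place(population)})
    return selected
```

With keep_uncounted=False only the places that have Census population
data survive, which corresponds to the smaller of the two OSM files.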
Data sources:
http://geonames.usgs.gov/domestic/download_data.htm
http://www.census.gov/popest/cities/SUB-EST2006-4.html
I need to test more, but I've put a couple OSM files up here you can
pull into JOSM... one of all the places, and one of the census places.
http://beej.us/osm/cityosm/
Here's a sample node:
<node id="-3" action="modify" visible="true" lat="41.4226498"
      lon="-122.3861269">
  <tag k="name" v="Weed" />
  <tag k="place" v="village" />
  <tag k="ele" v="1044" />
  <tag k="import_uuid" v="bb7269ee-502a-5391-8056-e3ce0e66489c" />
  <tag k="gnis:id" v="1652650" />
  <tag k="gnis:Class" v="Populated Place" />
  <tag k="gnis:ST_alpha" v="CA" />
  <tag k="gnis:ST_num" v="06" />
  <tag k="gnis:County" v="Siskiyou" />
  <tag k="gnis:County_num" v="093" />
  <tag k="census:data_avail" v="yes" />
  <tag k="census:date" v="July 1, 2006" />
  <tag k="census:population" v="3040" />
</node>
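For anyone who wants to produce nodes in this shape, here is a minimal
sketch using Python's standard xml.etree.ElementTree. It is not the
actual script. The function name build_place_node and its parameters are
made up for the example. The sample import_uuid has the version digit of
a name-based (version 5) UUID, so the sketch derives one with uuid.uuid5
from the GNIS id; the namespace and name string used here are
assumptions, so it will not reproduce the value shown above.

```python
import uuid
import xml.etree.ElementTree as ET


def build_place_node(node_id, lat, lon, tags, gnis_id):
    """Build an OSM <node> element for a new place.

    node_id should be negative for a node not yet uploaded, as in the
    sample. lat and lon are passed as strings so they are written
    exactly as given.
    """
    node = ET.Element("node", {
        "id": str(node_id),
        "action": "modify",
        "visible": "true",
        "lat": lat,
        "lon": lon,
    })
    all_tags = dict(tags)
    all_tags["gnis:id"] = gnis_id
    # Assumption: a deterministic UUID derived from the GNIS id, so that
    # re-running the import yields the same value for the same place.
    all_tags["import_uuid"] = str(uuid.uuid5(uuid.NAMESPACE_URL, "gnis:" + gnis_id))
    for key, value in all_tags.items():
        ET.SubElement(node, "tag", {"k": key, "v": value})
    return node


node = build_place_node(
    -3, "41.4226498", "-122.3861269",
    {"name": "Weed", "place": "village", "census:population": "3040"},
    gnis_id="1652650",
)
print(ET.tostring(node, encoding="unicode"))
```

A deterministic id like this would make it possible to recognize nodes
from an earlier run of the import, which could help with the
duplication problem.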
Comments? Should I have more or less or different tagging information?
Should I do this thing? :)
The Census data only has a few hundred cities per state--an import would
be trivial. Importing _all_ the populated places for a state would be
less trivial (since there are an order of magnitude more of them), but
only in the time-consumption department.
-Beej