[Talk-us] [Imports] [Talk-ca] TIGER considered harmful
Anthony
osm at inbox.org
Mon Nov 16 02:55:07 GMT 2009
On Sun, Nov 15, 2009 at 9:22 PM, Peter Batty <peter.batty at gmail.com> wrote:
> When I said "messy", I guess I was thinking of two things - one is doing the
> import, as you mention here (which is sort of where the discussion started).
> This seems quite a bit more complex if you have to split ways and insert
> nodes.
You don't have to split ways if you put the house numbers on the nodes.
> The other is in writing a geocoding engine based on the data which is
> produced. If you have the data all on the way, it is a simple query to find
> one record, and you interpolate along the geometry. I'm not sure how you
> would write an effective geocoding engine directly based on the model with
> nodes
Step 1: Find all the ways within the given city/zip code/county/etc.
with the proper street name.
Step 2: Find all nodes referenced by those ways with
"addr:tigernumber:right/left" (or whatever).
Step 3: Find the highest number less than what you're looking for and
the lowest number greater than what you're looking for, with the
proper even/oddness.
Step 4: Interpolate between those two positions along the way(s) found
in step 1 connecting them.
Step 5 (optional): move a few meters to the left or right of the
interpolated position with respect to the way, as appropriate.
Basically the same procedure as if the data is on the way.
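
For illustration, here is a rough Python sketch of steps 2 through 4 for
a single way. It is only a sketch under several assumptions: the Node
class, the geocode and point_along names, and the single "number" field
(standing in for one side's "addr:tigernumber" tag) are all made up for
the example, and coordinates are assumed to be already projected to
meters. Step 1 (finding the ways) and step 5 (the sideways offset) are
left out.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    # Hypothetical node model: planar coordinates in meters, plus an
    # optional house number for one side of the street.
    x: float
    y: float
    number: Optional[int] = None


def point_along(nodes: List[Node], start: int, end: int,
                fraction: float) -> Tuple[float, float]:
    """Point `fraction` (0..1) of the distance along nodes[start..end]."""
    lengths = [math.hypot(nodes[i + 1].x - nodes[i].x,
                          nodes[i + 1].y - nodes[i].y)
               for i in range(start, end)]
    target = fraction * sum(lengths)
    for i, length in zip(range(start, end), lengths):
        if length > 0 and target <= length:
            t = target / length
            a, b = nodes[i], nodes[i + 1]
            return (a.x + t * (b.x - a.x), a.y + t * (b.y - a.y))
        target -= length
    return (nodes[end].x, nodes[end].y)


def geocode(nodes: List[Node], wanted: int) -> Optional[Tuple[float, float]]:
    """Interpolate the position of house number `wanted` along one way."""
    # Step 2: numbered nodes with the same even/oddness as the target.
    tagged = [(n.number, i) for i, n in enumerate(nodes)
              if n.number is not None and n.number % 2 == wanted % 2]
    for number, i in tagged:
        if number == wanted:
            return (nodes[i].x, nodes[i].y)
    # Step 3: highest number below and lowest number above the target.
    below = [t for t in tagged if t[0] < wanted]
    above = [t for t in tagged if t[0] > wanted]
    if not below or not above:
        return None
    lo_num, lo_i = max(below)
    hi_num, hi_i = min(above)
    # Step 4: interpolate along the geometry between those two nodes.
    fraction = (wanted - lo_num) / (hi_num - lo_num)
    if lo_i <= hi_i:
        return point_along(nodes, lo_i, hi_i, fraction)
    # Numbers decrease along the way, so measure from the other end.
    return point_along(nodes, hi_i, lo_i, 1 - fraction)
```

For example, with numbered nodes 100 at (0, 0), 200 at (100, 0) and 300
at (100, 100), plus an unnumbered node in between, looking up 150 lands
halfway along the first stretch and 250 halfway along the second. An odd
number returns None because nothing on that side is tagged.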
Of course, if you're doing a lot of geocoding you'll want to build
special indexes which are geared specifically to your use of the data.
The OSM database needs to be normalized, not optimized for specific
tasks like geocoding. Putting the data on the way would be fine in
that regard if it weren't for the need to split ways in the middle of
some ranges. Having three ways separated by 10 meters when you
logically only have one, as proposed by the current TIGER import,
would not be normalized - you've got the same information in three
locations.
> In terms of how to decide what number you use when you split a way, you have
> the same problem in either case
If the house numbers are on the nodes, you just split the way -
there's no need to put a(n artificial) number in the middle.
Likewise with merging two ways. You do realize that if you put the
house numbers on the way, and not on the nodes, that means you'll have
to split some ways literally into hundreds of little pieces, right?
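
To make the splitting point concrete, a minimal sketch, assuming a way
is nothing more than an ordered list of node ids and the house numbers
live in a separate per-node mapping (split_way and house_numbers are
names invented for this example, not anything in the OSM API):

```python
from typing import Dict, List, Tuple


def split_way(node_ids: List[int], at: int) -> Tuple[List[int], List[int]]:
    """Split a way at an interior node; both halves share that node."""
    i = node_ids.index(at)
    if i == 0 or i == len(node_ids) - 1:
        raise ValueError("can only split at an interior node")
    return node_ids[:i + 1], node_ids[i:]


# Hypothetical data: house numbers are tagged on nodes 1 and 4 only.
house_numbers: Dict[int, int] = {1: 100, 4: 198}
first, second = split_way([1, 2, 3, 4], at=3)
# first == [1, 2, 3], second == [3, 4]; house_numbers is untouched, and
# node 3 never needed a made-up number.
```

Merging is the reverse: concatenate the two node lists, drop the
duplicated shared node, and again leave the numbers alone.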
On Sun, Nov 15, 2009 at 9:40 PM, SteveC <steve at asklater.com> wrote:
> I suspect the Karlsruhe schema is a bit like the license change. Everyone thinks they have a better idea
In this case pretty much everyone does. Using the Karlsruhe schema
for TIGER addresses is just silly.