[OSM-dev] extracting house-number from string

Udo Giacomozzi udo.osm at nova-sys.net
Sat Mar 14 14:33:15 GMT 2009


Hello Marcus,

Friday, March 13, 2009, 4:28:24 PM, you wrote:
MW> does anyone know a good algorithm to extract
MW> the house-number from a string containing
MW> street-name and house-number?


Why not treat any word that begins with a digit as the house number?

Note that street names sometimes contain numbers, although I think
I've only seen them using roman numerals, like "Via IV Novembre" here
in Italy.

Another situation could be a year as part of the street name. I've
never seen such a street name myself, but it is possible.

I suggest an algorithm similar to this:

1) any "word" that begins with a digit is defined to be a "number"
   (including "17B", "32/C" and such)

2) if a number is at the beginning or the end of the string *and*
   is separated from the rest by a "," then it is the house number
   (e.g. "Via Roma, 36C" or "38, Sesame Street")

3) if there is a number at the beginning or the end without a "," then
   it is still the house number


2 and 3 may seem identical, but 2 has higher precedence, so that a
string like "34, Via Tirol 1809" still gets recognized correctly
(house number "34", with the year "1809" staying in the street name).
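
The three rules above could be sketched roughly like this in Python. This
is only an illustration, not tested against real OSM data: the function
names are made up, only ASCII digits are considered, and a string that
consists of a single word is assumed to have no house number at all.

```python
def _is_number(word):
    # Rule 1: any "word" that begins with a digit is a "number"
    # (covers "17B", "32/C" and such). ASCII digits only (assumption).
    return bool(word) and "0" <= word[0] <= "9"


def _is_single_number(part):
    # True if the part is exactly one word and that word is a "number".
    words = part.split()
    return len(words) == 1 and _is_number(words[0])


def extract_house_number(text):
    """Return (house_number, street_name); house_number is None if not found."""
    text = text.strip()

    # Rule 2 (higher precedence): number at the beginning or the end,
    # separated from the rest by a comma.
    if "," in text:
        head, rest = text.split(",", 1)
        if _is_single_number(head):
            return head.strip(), rest.strip()
        rest, tail = text.rsplit(",", 1)
        if _is_single_number(tail):
            return tail.strip(), rest.strip()

    # Rule 3: number at the beginning or the end without a comma.
    words = text.split()
    if len(words) > 1:
        if _is_number(words[0]):
            return words[0], " ".join(words[1:])
        if _is_number(words[-1]):
            return words[-1], " ".join(words[:-1])

    # No number found (or the string is a single word).
    return None, text


for s in ("Via Roma, 36C", "38, Sesame Street",
          "34, Via Tirol 1809", "Via IV Novembre"):
    print(s, "->", extract_house_number(s))
```

Roman numerals such as "IV" are never touched because they do not begin
with a digit. A year at the end of a name without any comma (e.g. just
"Via Tirol 1809") would still be taken as the house number by rule 3.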

Just a quick idea.. :-)

Udo





