[Talk-us] Fixing TIGER street name abbreviations
Dale Puch
dale.puch at gmail.com
Sat May 12 07:16:29 BST 2012
Lots of weird ones from Florida. Many should not give you an issue due to
how you're processing, but it is best to test them anyhow. Also, this might be
a good reference when looking at other expansions after this runs.
way id="10761946" "name" v="E 10th Ct E"
way id="10763539" "name" v="E 10th St E"
way id="10759486" "name" v="E 14th Pl E"
way id="11018453" "name" v="E 1st Avenue Pl" <-- not really a problem,
just... odd
way id="10763214" "name" v="E 40th Pz  E" <-- Note the double space
before E
way id="10966845" "name" v="E Camp N Comfort Ln" <-- Non directional N
way id="11210989" "name" v="E Canal St N"
way id="10967404" "name" v="E Dr"
way id="10974755" "name" v="E Dr Martin Luther King Jr Blvd"
way id="11278916" "name" v="E H St E"
way id="10965707" "name" v="E Ln"
way id="11242732" "name" v="E Martin Luther King Jr Dr"
way id="11102139" "name" v="E Pl"
way id="10959109" "name" v="E St Andrews Dr"
way id="10827576" "name" v="E St James Loop" <-- I guess Tiger did not
abbreviate loop
way id="11272826" "name" v="E St Johns St"
way id="11021472" "name" v="E St Louis Ave"
way id="11065801" "name" v="E W Reeves Rd"
way id="103599461" "name" v="E. Watson Road" <-- Not a tiger import
way id="10983188" "name" v="East North Street" <-- already expanded tiger
way id="11270447" "name" v="East North St" <-- not expanded
way id="11274418" "name" v="Edwin St N E"
way id="10851149" "name" v="Egret's Walk Cir S" <-- In case the
's causes problems
way id="10808177" "name" v="Ellesmere E"
way id="10951424" "name" v="Ave del Ctr"
way id="11288799" "name" v="Avenue E N"
way id="10939680" "name" v="Avenue N"
way id="11285084" "name" v="Avenue N NW"
way id="11097378" "name" v="Dr"
way id="10812824" "name" v="Dr Faruqui Dr"
way id="11358527" "name" v="Dr Joe Abal Dr"
way id="10919692" "name" v="Dr Martin L King Jr Dr"
way id="11128816" "name" v="N 14th St Pl"
way id="10982651" "name" v="N 19th Cir SW"
way id="39488514" "name" v="N 22nd St." <-- non tiger
way id="10885972" "name" v="N 3rd Street Cir"
way id="10993673" "name" v="N Blvd"
way id="10807124" "name" v="N Cortez Dr Cir C"
way id="11371860" "name_1" v="N Cswy" <-- "name" v="N Causway"
way id="11090351" "name" v="N E 144th Avenue Rd"
way id="11080981" "name" v="N E 238 Ave Rd"
way id="11089629" "name" v="N E 62nd Ct Rd"
way id="10927659" "name" v="N E St"
way id="11013343" "name" v="N F S 595-2"
way id="10925619" "name" v="N N St"
way id="11359562" "name" v="N N Road"
way id="10921209" "name" v="N S St"
way id="10880720" "name" v="N St Andrews St"
way id="10765917" "name" v="N St Clair St"
way id="10979914" "name" v="N St Peter St"
way id="11302478" "name" v="N Swan Ct NE"
way id="10243562" "name" v="N W 34th St R"
way id="11092219" "name" v="N W 51 St Ct"
way id="10927760" "name" v="N W Ave F North"
way id="10763701" "name" v="N de Gama Ave N"
way id="26630760" "name" v="N orth22nd Street" <--bad manual edit
way id="27354570" "name" v="N orthGarcia Avenue" <--bad manual edit
way id="10754189" "name" v="N-Yellow Pine Cir" <-- "name_1" v="Yellow
Pine North Cir"
way id="119723334" "name" v="N. Shingle Lane" <-- non tiger
way id="10983026" "name" v="N19th Ave" <-- "tiger:name_base" v="111th"
Probably due to edits
way id="11058140" "name" v="NE 40 Ln" <-- "name_1" v="NE 1 St Ave"
Version 1 tiger
way id="10806770" "name" v="NE 16th Ter; NE 17th Ave" <-- double name
possibly from edits
way id="11079312" "name" v="NE 172 Ave Rd"
way id="11089303" "name" v="NE 18th Ave; NE 9th St" <-- double name
possibly from edits
way id="10800930" "name" v="NE 19th Ter; NE 25th St" <-- double name
possibly from edits
way id="11100990" "name" v="NE 196 Ter Rd"
way id="11099492" "name" v="NE 21st Ter W"
way id="11088248" "name" v="NE 220th Ave Rd"
way id="11062349" "name" v="NE 3 Rd Ave"
way id="11081124" "name" v="NE 36th Av Rd"
way id="11070763" "name" v="NE Mt Zion A M E Church Ave"
way id="11081908" "name" v="NE226 Ter"
way id="28931406" "name" v="NE31st Ave" <-- non tiger
way id="10789444" "name" v="NE
way id="10788734" "name" v="NW 10th St Access Rd"
way id="10788581" "name" v="NW 126th Ave; NW 126th Way"
way id="10242655" "name" v="NW 141st"
way id="10242241" "name" v="NW 181 St"
way id="11128828" "name" v="NW 181st St"
way id="11085308" "name" v="NW 21st Street"
way id="11082282" "name" v="NW 221st Street Rd"
way id="10765627" "name" v="NW 231 St"
way id="11151648" "name" v="NW 4th Avenue Cir E"
way id="10792992" "name" v="NW 6th Ave; Blanch Ely Ave"
way id="10809778" "name" v="NW 71st Pl; NW 71st St"
way id="10928777" "name" v="NW Avenue G; Avenue G North; NW Avenue G"
way id="11273744" "name" v="NW Dr"
way id="107757877" "name" v="NW NW 125th Avenue" <-- non tiger
way id="10246730" "name" v="NW30Ln" <-- name_1 has spaces
way id="11065133" "name" v="National Forest Rd 141A"
way id="11060010" "name" v="Nf Rd 354"
way id="11083729" "name" v="Nfr 75B"
way id="11034257" "name" v="Nfs 572 B"
way id="10237516" "name" v="Nnw 141 St"
way id="10803531" "name" v="Nmw 49th Ave"
way id="83737572" "name" v="North 46th Streeet" <--manual expansion typo
way id="10874252" "name" v="Northern Pacific Dr N"
way id="11124503" "name" v="Northwest 38th Court; NW 38th Ct"
way id="11213490" "name" v="Norwich O" <-- "tiger:name_direction_suffix"
v="O"
way id="57732753" "name" v="Nw 35th Ave" <-- name case
way id="9059279" "name" v="S St"
way id="11058256" "name" v="S W Cr 347"
way id="11030290" "name" v="S and S Ln"
way id="34939098" "name" v="S.W. Sundance Trail" <-- non tiger
way id="10927892" "name_1" v="SE Ave E" <-- "name" v="Avenue E South"
way id="11234533" "name" v="SE Ave K Pl" <-- "tiger:name_base" v="SE"
"tiger:name_type" v="Ave"
way id="11200345" "name" v="SE Ave F Pl"
way id="11100255" "name" v="SE Summerfield Way; Summerfield Way"
way id="11298967" "name" v="SE W Snow Rd"
way id="10244663" "name" v="SE182 Ave"
way id="11121266" "name" v="SW 108th Stcr" <-- "name_1" v="SW 108th St"
way id="11156351" "name" v="SW 108th Stcr N"
way id="11167562" "name" v="SW 112th Cir Ln S"
way id="11032640" "name" v="SW 28th Ter; SE 28th Ter"
way id="11107394" "name" v="SW Dr Martin L King Jr Dr"
way id="11101573" "name" v="SW St George St"
way id="10767432" "name" v="Saint St SE"
way id="10777261" "name" v="Scallop Dr; George J King Blvd; Glen Cheek Dr"
way id="10762071" "name" v="W 33rd St W"
way id="10763371" "name" v="W 30th St W"
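
A rough sketch of how the oddities noted in the list above could be flagged
automatically before any expansion runs. The check names and regular
expressions here are assumptions made for illustration; they are not taken
from the expansion script.

```python
import re

# Hypothetical checks, each named after an oddity called out in the list above.
CHECKS = {
    "multiple_names": re.compile(r";"),                    # "NE 16th Ter; NE 17th Ave"
    "double_space": re.compile(r" {2,}"),                  # two or more spaces in a row
    "dotted_abbreviation": re.compile(r"\b[A-Za-z]+\."),   # "N. Shingle Lane", "N 22nd St."
    "glued_direction": re.compile(r"^(?:N|S|E|W|NE|NW|SE|SW)\d"),  # "NE31st Ave"
}


def flag_name(name):
    """Return the labels of every check that matches the given name value."""
    return [label for label, pattern in CHECKS.items() if pattern.search(name)]


if __name__ == "__main__":
    for sample in ["NE 16th Ter; NE 17th Ave", "N. Shingle Lane",
                   "NE31st Ave", "E St Andrews Dr"]:
        print(repr(sample), flag_name(sample))
```

Names that come back with an empty list are not necessarily safe; they just
do not trip any of these four checks.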
On Fri, May 11, 2012 at 7:38 PM, Serge Wroclawski <emacsen at gmail.com> wrote:
> On Fri, May 11, 2012 at 4:17 PM, Dale Puch <dale.puch at gmail.com> wrote:
> > I understand the script checks for only one instance of the abbreviation.
>
> > My point was what if someone manually expanded ONE of the abbreviations,
> > leaving "st something street"? Is that checked for?
>
> I have a number of thoughts here:
>
> 1. Real world examples.
>
> Many of the examples I've seen are contrived. I'm glad we're testing,
> but testing needs to be based on actual data seen in the US dataset.
>
> That said:
>
> 2. There are a couple of ways to handle this:
>
> * One way (the most conservative way) would be to test for untouched
> TIGER ways, that is, ways that are still at version 1. This
> would be a real problem, though, since there are lots of examples where
> someone may have fixed the geometry without touching the tags.
>
> * The other way is a method I'm using in an experimental branch of the
> code on my machine, which is to try to be a bit more selective about
> the expansions of road types. If we assume that the road type always
> appears after the base name, we can handle examples like (real
> world example) "St Marys St". The same would hold true for direction
> tags, so we'd be able to expand "E E St" confidently as well.
>
> But there's a catch. If someone had edited the name of the
> above street from the original "St Marys St" to "St. Marys St", then
> that test would fail and the expansion would never occur, whereas in
> the current version it would.
>
> So:
>
> 3. Any method used is going to produce some number of potential
> false positives or false negatives. I contend that the number of
> errors in either case will be so tiny that it will be lost in the
> noise, but there's no way to promise it will always be 0. The best we
> can do is toss out uncertain expansions and have them handled manually
> (which is something I'm working to make better in the next version of
> the code as well).
>
> But:
>
> 4. I don't want us to rely on cleverness. I'd much rather rely on
> people testing the code with real world inputs and checking the
> outputs.
>
>
> I should have a new version of the code either tonight or tomorrow,
> with the new expansion rules.
>
> - Serge
>
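
For reference, the position-based rule in the quoted message (road type
last, direction tokens only at the ends, base name left alone) could be
sketched roughly as below. This is not Serge's code. The function name, the
two lookup tables, and the decision to return None for anything uncertain
are all assumptions made for illustration.

```python
# Illustrative tables only; a real run would need the full abbreviation lists.
DIRECTIONS = {
    "N": "North", "S": "South", "E": "East", "W": "West",
    "NE": "Northeast", "NW": "Northwest", "SE": "Southeast", "SW": "Southwest",
}
ROAD_TYPES = {
    "St": "Street", "Ave": "Avenue", "Dr": "Drive", "Rd": "Road",
    "Ln": "Lane", "Ct": "Court", "Pl": "Place", "Blvd": "Boulevard",
}


def expand_by_position(name):
    """Expand direction and road-type tokens by position; return None if unsure."""
    tokens = name.split()
    # Edited or multi-valued names ("St. Marys St", "A St; B Ave") go to manual review.
    if len(tokens) < 2 or "." in name or ";" in name:
        return None

    start, end = 0, len(tokens)
    prefix = suffix = None
    # A leading direction counts only if a base name and a road type still follow.
    if end - start >= 3 and tokens[start] in DIRECTIONS:
        prefix = DIRECTIONS[tokens[start]]
        start += 1
    # Likewise for a trailing direction.
    if end - start >= 3 and tokens[end - 1] in DIRECTIONS:
        suffix = DIRECTIONS[tokens[end - 1]]
        end -= 1

    road_type = tokens[end - 1]
    base = tokens[start:end - 1]  # never modified, so "St Marys" stays as it is
    if road_type not in ROAD_TYPES:
        return None
    # "E St" could be "E Street" or "East Street"; leave it for a human.
    if prefix is None and len(base) == 1 and base[0] in DIRECTIONS:
        return None

    parts = [prefix] if prefix else []
    parts += base + [ROAD_TYPES[road_type]]
    if suffix:
        parts.append(suffix)
    return " ".join(parts)
```

With these tables, "St Marys St" becomes "St Marys Street", "E E St" becomes
"East E Street", and "E Camp N Comfort Ln" becomes "East Camp N Comfort
Lane", while the edited "St. Marys St" returns None and would be left for
manual handling.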
--
Dale Puch