[Talk-ca] What do I poutine the name tag of a road with a suffix?

Minh Nguyen minh at nguyen.cincinnati.oh.us
Tue Dec 13 18:16:27 UTC 2022


On 2022-12-13 at 08:18, john whelan wrote:
> The problem comes with searching for an address, do you enter NW or 
> North West?  Which does the search engine understand?

The exact spelling is less of an issue for geocoders than for other 
kinds of software. Any quarter-decent geocoder would perform lots of 
redundant indexing, case folding, diacritic folding, word stemming, and 
fuzzy matching to maximize the number of results for any given query. If 
a geocoder happens to index an overexpanded street name, the worst that 
could happen is an extra irrelevant result for some queries.
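
As a rough illustration of the case folding and diacritic folding mentioned 
above, here is a minimal Python sketch. It is not how any particular 
geocoder does it; the function names fold() and tokens() are made up for 
this example, and real engines do considerably more (stemming, fuzzy 
matching, language-specific rules).

```python
import unicodedata


def fold(text: str) -> str:
    """Case-fold and strip diacritics, so "Montréal" and "montreal" compare equal."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    # Drop the combining marks, keeping only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokens(text: str) -> list[str]:
    """Split a folded name into words, treating punctuation as whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in fold(text))
    return cleaned.split()


if __name__ == "__main__":
    print(fold("Montréal"))
    print(tokens("Rue Sainte-Catherine O."))
```

With both the indexed name and the query passed through the same folding, 
differences in capitalization, accents, and punctuation stop mattering for 
matching purposes.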

Nominatim has implemented a large number of abbreviations in English, 
including some specific to Canada that would help in the Prairies, such 
as "Rg" and "Subdiv". [1] Even the ancient Namefinder geocoder, which 
predated Nominatim on osm.org, supported finding "north" when entering 
"N". [2]
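
To make the idea of redundant indexing concrete, here is a small Python 
sketch that generates every abbreviated and expanded spelling of a street 
name, so that "Main St N" and "main street north" end up sharing a variant. 
The ABBREVIATIONS table and the variants() and matches() helpers are 
hypothetical and only for illustration; they are not Nominatim's actual 
rules or API, which live in the file linked at [1].

```python
from itertools import product

# Hypothetical sample table; a real geocoder ships a far longer list.
ABBREVIATIONS = {
    "n": "north",
    "nw": "northwest",
    "ave": "avenue",
    "st": "street",
}


def variants(name: str) -> set[str]:
    """Return every spelling of name with each known abbreviation kept or expanded."""
    words = name.casefold().split()
    # Each word contributes either itself alone or (itself, its expansion).
    options = [(w, ABBREVIATIONS[w]) if w in ABBREVIATIONS else (w,) for w in words]
    return {" ".join(combo) for combo in product(*options)}


def matches(query: str, name: str) -> bool:
    """True if the query and the indexed name share at least one variant."""
    return not variants(query).isdisjoint(variants(name))


if __name__ == "__main__":
    print(sorted(variants("Main St N")))
    print(matches("main street north", "Main St N"))
```

The cost of indexing all the variants is a larger index and the occasional 
irrelevant hit, which is the tradeoff described earlier.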

That said, spelled-out words can help improve the ranking quality in 
several ways. A geocoder might recognize "North" as a directional 
suffix; the user could optionally include it in their query for best 
results but wouldn't be required to. More predictability in OSM means 
more opportunities for the geocoder to associate addresses with streets, 
which is important for sending navigation applications to the front door 
of a destination instead of around back.

> If we are dealing with speech software perhaps the best way is a 
> name:speech tag or even multiple tags with the different possibilities.

name:pronunciation=* exists to clarify names that, through no fault of 
the TTS engine, would be pronounced incorrectly without that additional 
context. For example, a "Reading Street" could sound like either 
"reeding" or "redding". But in order to apply a name:pronunciation=* 
tag, you need to know the International Phonetic Alphabet or look up the 
pronunciation of each word in a dictionary.

Maybe folks would be OK with using it sparingly for some abbreviation 
lookalikes like "Avenue S" (between Avenues R and T) or "Ave Maria 
Drive", but I can't imagine blanketing most of Alberta in IPA and 
expecting mappers to maintain that. At least not before tagging 
"Lieutenant Street" to ensure it gets pronounced /luˈtɛnənt/ in the U.S. 
but /lɛfˈtɛnənt/ in Canada. ;-)

[1] <https://github.com/osm-search/Nominatim/blob/4c52777ef03738803845f9ee58d269d93bbb9c3d/settings/icu-rules/variants-en.yaml>
[2] <https://wiki.openstreetmap.org/wiki/Name_finder:Abbreviations#English>

-- 
minh at nguyen.cincinnati.oh.us




