[OSM-talk] Announcing name searches for OSM

Lars Aronsson lars at aronsson.se
Tue May 8 15:18:49 BST 2007


David Earl wrote:

> I wonder where to stop though.

Initially you should include all "accented" letters in Latin-1 and 
perhaps Latin-2. For every search query that doesn't match 
anything, log that query to a file and read the log file after a 
while to see if there are strange letters that weren't matched.  
There are tons of l-slash, s-caron, etc. used throughout eastern 
Europe without going into other scripts (from Cyrillic to 
Chinese).
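
A minimal sketch of that logging idea in Python, assuming the search 
code can call a hook when a query returns nothing. The function names 
and the log file layout (one query per line, UTF-8) are made up for 
illustration and are not part of the actual search service:

```python
from collections import Counter

def log_unmatched(query, log_path):
    """Append a query that matched nothing to the log file."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(query + "\n")

def unhandled_letters(log_path, handled=""):
    """Count non-ASCII characters in the logged queries that are not
    yet covered by the substitution table (passed in as `handled`)."""
    counts = Counter()
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            for ch in line.rstrip("\n"):
                if ord(ch) > 127 and ch not in handled:
                    counts[ch] += 1
    # Most frequent characters first, so the worst gaps show up on top.
    return counts.most_common()
```

Reading the result now and then shows which letters are worth adding 
to the table next.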

These are the Latin-1 chars and their substitutes:

ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ
AAAAAAACEEEEIIIIDNOOOOOOUUUUYTsaaaaaaaceeeeiiiidnoooooouuuuyty
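
As a rough sketch (in Python, not necessarily how the search is or 
should be implemented), the two rows above can be turned into a 
translation table and applied to both the stored names and the 
incoming query before comparing them. The names LATIN1, SUBSTITUTES, 
fold and search_key are hypothetical:

```python
# The two rows from the table above: each character in LATIN1 is
# replaced by the character at the same position in SUBSTITUTES.
LATIN1 = "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ"
SUBSTITUTES = "AAAAAAACEEEEIIIIDNOOOOOOUUUUYTsaaaaaaaceeeeiiiidnoooooouuuuyty"

# Both rows must have the same length for a one-to-one mapping.
assert len(LATIN1) == len(SUBSTITUTES)
FOLD_TABLE = str.maketrans(LATIN1, SUBSTITUTES)

def fold(name):
    """Replace accented Latin-1 letters with their plain substitutes."""
    return name.translate(FOLD_TABLE)

def search_key(name):
    """Fold accents and case (an extra assumption) for comparison."""
    return fold(name).lower()
```

With this, fold("Malmö") gives "Malmo". Letters outside Latin-1, such 
as l-slash, pass through unchanged until the table is extended.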



-- 
  Lars Aronsson (lars at aronsson.se)
  Aronsson Datateknik - http://aronsson.se



