[Talk-us] Abbreviating names in tools
poppele at hm.edu
Sat Oct 6 20:57:37 BST 2012
Toby Murray wrote:
> On Thu, Oct 4, 2012 at 1:26 PM, andrzej zaborowski <balrogg at gmail.com> wrote:
>> There was a lot of talk at one point about abbreviating names in
>> the OSM database vs. doing it when processing the data at the consumer's
>> end. Since mapnik now supports alternative label placements, I gave
>> rendering automatically abbreviated names a try. This resulted in a
>> (so far) tiny C library (https://github.com/balrog-kun/shrtnms) that
>> abbreviates the names of map features that you give it. It's very
>> rough but it already handles a couple of the main corner cases. It's
>> just a start on collecting the list of all the abbreviations
>> applicable in all the map languages. It currently only has basic lists
>> for Polish, Spanish and English, where the rules don't differ so much.
>> German, for example, is trickier.
>> I see two ways to use it for rendering:
>> * inside mapnik stylesheets, perhaps by calling the C function from
>> the SQL queries. This would require adding the postgres bindings,
>> something I've never done.
>> * inside osm2pgsql so that the abbreviated names are stored in table columns.
>> As a quick hack I went for the latter option. It gets tricky with
>> hstore and multilingual maps but it works as a first attempt. I have
>> a patch at http://osm.trail.pl/osm2pgsql/0002-Generate-short-name-columns.patch
>> that makes the necessary (small) changes to the osm2pgsql code and adds
>> a snapshot of the library code.
>> It ignores the actual language of a name and just tries to apply all
>> the possible abbreviations from its list. This will eventually need a
>> solution perhaps based on the location of a given map object and a
>> fixed list of country polygons with their main languages. Perhaps it
>> can use a common solution with highway shields rendering.
>> One of the stylesheets at osm.trail.pl currently uses this code (only
>> Europe imported at this time, but I CC'd talk-us because there was a
>> long thread about this at one time). You can see that at z16 here
>> "Norbroke Street" is spelt in full:
>> and z15 shows Norbroke St when there's not enough space. (That
>> stylesheet would also merge the two segments of Norbroke St and only
>> show the name once, had that street not had a gap there.)
>> How to apply
>> In addition to the osm2pgsql patch you also need to add the
>> auto-generated tags short_name and shortest_name to your osm2pgsql
>> .style file so that they end up in the mapnik db. Then in the mapnik
>> 2 stylesheet where you use a <TextSymbolizer
>> foo=bar>[name]</TextSymbolizer>, you need to change it to:
>> <TextSymbolizer foo=bar
>> If there's a tag short_name or shortest_name in OSM data, they'll
>> override the autogenerated versions. The difference between those two
>> columns is in the degree to which they try to shorten the name, for
>> instance for name=West Fulton Street, the new tags will be:
>> short_name=W Fulton St
>> For Polish if a street is named after a person, the person's
>> first/second names will be first shortened to their initials and then,
>> if necessary, omitted as is normally seen in cartography. For Spanish
>> all articles and prepositions are also stripped. Unfortunately, if the
>> same rules are applied to US city names that often come from Spanish
>> (Los Angeles), the same thing will happen (resulting in "Angeles",
>> which makes no sense). I also noticed that different Spanish
>> abbreviations are used in Spain (Avenida -> Avda.) than in Latin
>> America (Avenida -> Av.), so it's something to keep in mind if running
>> a global tile server or other service.
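To make the rules described above concrete, here is a rough Python sketch of the general idea. It is not the shrtnms code (which is C) and it does not use its API. The tables, the function names and the region keys "spain" and "latin_america" are all made up for illustration. It only covers three things mentioned in the mail: an existing short_name tag wins over the generated one, English direction/street-type words get abbreviated, and the Spanish "Avenida" abbreviation depends on the region.

```python
# Hypothetical abbreviation tables, for illustration only.
# They are NOT copied from the shrtnms library.
DIRECTIONS = {"North": "N", "South": "S", "East": "E", "West": "W"}
STREET_TYPES = {"Street": "St", "Avenue": "Ave", "Road": "Rd"}
AVENIDA = {"spain": "Avda.", "latin_america": "Av."}


def abbreviate(name):
    """Abbreviate a leading direction and a trailing street type."""
    words = name.split()
    if len(words) < 2:
        return name
    # Only shorten the direction if a proper name remains in between,
    # so "West Street" does not collapse to "W St".
    if len(words) > 2 and words[0] in DIRECTIONS:
        words[0] = DIRECTIONS[words[0]]
    if words[-1] in STREET_TYPES:
        words[-1] = STREET_TYPES[words[-1]]
    return " ".join(words)


def short_name(tags):
    """Return the mapped short_name if present, else a generated one."""
    if "short_name" in tags:
        return tags["short_name"]
    return abbreviate(tags["name"])


def abbreviate_avenida(name, region):
    """Pick the regional abbreviation for a leading 'Avenida'."""
    words = name.split()
    if words and words[0] == "Avenida" and region in AVENIDA:
        words[0] = AVENIDA[region]
    return " ".join(words)


if __name__ == "__main__":
    print(short_name({"name": "West Fulton Street"}))
    print(abbreviate_avenida("Avenida de Mayo", "latin_america"))
```

A real implementation would still need to know the language of each name before picking a table, which is exactly the "Los Angeles" problem described above.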
>> Suggestions and lists of missing phrases are welcome, perhaps through
>> github (but I'm away from my main machine this week). If this is
>> something that users want to see in map tiles then that list will be
>> needed at some point even though the code doesn't yet handle all the
>> nuances correctly.
> Very nice work. I'm not sure if this will be of help to you or not but
> as part of writing my ogr2osm translation for TIGER 2011 I converted
> the TIGER technical documentation appendix E (Feature Name Types) from
> PDF to a CSV file for easier processing. TIGER probably uses some
> abbreviations that people may not always expect but it is a handy list
> to have at least.
there is a typo in line 262 of file shorten.c: