[Talk-us] Abbreviating names in tools

Toby Murray toby.murray at gmail.com
Thu Oct 4 20:47:09 BST 2012

On Thu, Oct 4, 2012 at 1:26 PM, andrzej zaborowski <balrogg at gmail.com> wrote:
> Hi,
> there was a lot of talk at one point about abbreviating names in
> the OSM database vs. doing it when processing the data at the consumer's
> end.  Since mapnik now supports alternative label placements I gave
> rendering automatically abbreviated names a try.  This resulted in a
> (so far) tiny C library (https://github.com/balrog-kun/shrtnms) that
> abbreviates the names of map features that you give it.  It's very
> rough but it already handles a couple of the main corner cases.  It's
> just a start on collecting the list of all the abbreviations
> applicable in all the map languages.  It currently only has basic lists
> for Polish, Spanish and English, where the rules don't differ so much.
> German, for example, is trickier.
> I see two ways to use it for rendering:
> * inside mapnik stylesheets, perhaps by calling the C function from
> the SQL queries.  This would require adding the postgres bindings,
> something I've never done.
> * inside osm2pgsql so that the abbreviated names are stored in table columns.
> As a quick hack I went for the latter option.  It gets tricky with
> hstore and multilingual maps but it works as a first attempt.  I have
> a patch at http://osm.trail.pl/osm2pgsql/0002-Generate-short-name-columns.patch
> that makes the necessary (small) changes to the osm2pgsql code and adds
> a snapshot of the library code.
> It ignores the actual language of a name and just tries to apply all
> the possible abbreviations from its list.  This will eventually need a
> solution perhaps based on the location of a given map object and a
> fixed list of country polygons with their main languages.  Perhaps it
> can use a common solution with highway shields rendering.
> One of the stylesheets at osm.trail.pl currently uses this code (only
> Europe imported at this time, but I CC'd talk-us because there was a
> long thread about this at one time).  You can see that at z16 here
> "Norbroke Street" is spelt in full:
> http://a.osm.trail.pl/osmapa.pl/16/32723/21790.png
> and z15 shows Norbroke St when there's not enough space.  (That
> stylesheet would also merge the two segments of Norbroke St and only
> show the name once, had that street not had a gap there.)
> http://b.osm.trail.pl/osmapa.pl/15/16361/10895.png
> How to apply
> In addition to the osm2pgsql patch you also need to add the
> auto-generated tags short_name and shortest_name to your osm2pgsql
> .style file so that they end up in the mapnik db.  Then in the mapnik
> 2 stylesheet where you use a <TextSymbolizer
> foo=bar>[name]</TextSymbolizer>, you need to change it to:
> <TextSymbolizer foo=bar
> placement-type="list">[name]<Placement>[short_name]</Placement><Placement>[shortest_name]</Placement></TextSymbolizer>
> If there's a tag short_name or shortest_name in OSM data, they'll
> override the autogenerated versions.  The difference between those two
> columns is in the degree to which they try to shorten the name, for
> instance for name=West Fulton Street, the new tags will be:
> short_name=W Fulton St
> shortest_name=Fulton
> For Polish if a street is named after a person, the person's
> first/second names will be first shortened to their initials and then,
> if necessary, omitted as is normally seen in cartography.  For Spanish
> all articles and prepositions are also stripped.  Unfortunately if the
> same rules are applied to US city names that often come from Spanish
> (Los Angeles) the same thing will happen (resulting in "Angeles"
> which makes no sense).  I also notice that different Spanish
> abbreviations are used in Spain (Avenida -> Avda.) than in Latin
> America (Avenida -> Av.), so it's something to keep in mind if running
> a global tile server or other service.
> Suggestions and lists of missing phrases are welcome, perhaps through
> github (but I'm away from my main machine this week).  If this is
> something that users want to see in map tiles then that list will be
> needed at some point even though the code doesn't yet handle all the
> nuances correctly.

Very nice work. I'm not sure if this will be of help to you or not but
as part of writing my ogr2osm translation for TIGER 2011 I converted
the TIGER technical documentation appendix E (Feature Name Types) from
PDF to a CSV file for easier processing. TIGER probably uses some
abbreviations that people may not always expect but it is a handy list
to have at least.
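
To make the idea concrete, here is a rough sketch of how a list like that could drive the two levels of abbreviation described above (short_name and shortest_name). It is not the shrtnms code and not the real TIGER appendix: the inline CSV rows, the column names and the abbreviate() function are all made up for illustration, and it ignores language entirely.

```python
import csv
import io

# Hypothetical excerpt in the spirit of a name-type list.
# These rows and column names are assumptions, not the real TIGER data.
SAMPLE_CSV = """full,abbrev,kind
Street,St,type
Avenue,Ave,type
Road,Rd,type
North,N,direction
South,S,direction
East,E,direction
West,W,direction
"""


def load_table(text):
    """Map each full word to its (abbreviation, kind) pair."""
    return {row["full"]: (row["abbrev"], row["kind"])
            for row in csv.DictReader(io.StringIO(text))}


def abbreviate(name, table):
    """Return (short_name, shortest_name) for a feature name.

    short_name replaces every listed word with its abbreviation;
    shortest_name drops the listed words altogether.
    """
    words = name.split()
    # Words that are not in the list carry the actual name.
    core = [w for w in words if w not in table]
    if not core:
        # Every word is a type or direction (e.g. "North Street").
        # Be conservative and leave such names untouched.
        return name, name
    short = [table[w][0] if w in table else w for w in words]
    return " ".join(short), " ".join(core)


if __name__ == "__main__":
    table = load_table(SAMPLE_CSV)
    for name in ("West Fulton Street", "Norbroke Street", "North Street"):
        print(name, "->", abbreviate(name, table))
```

Leaving names like "North Street" alone is a deliberately cautious choice for the sketch; a real implementation would need position-aware and language-aware rules, which is exactly the "Los Angeles" problem mentioned above.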


