[Talk-us] Abbreviating names in tools

andrzej zaborowski balrogg at gmail.com
Thu Oct 4 19:26:57 BST 2012


Hi,
there was a lot of talk at one point about abbreviating names in
the OSM database vs. doing it when processing the data at the
consumer's end.  Since mapnik now supports alternative label placements, I gave
rendering automatically abbreviated names a try.  This resulted in a
(so far) tiny C library (https://github.com/balrog-kun/shrtnms) that
abbreviates the names of map features that you give it.  It's very
rough but it already handles a couple of the main corner cases.  It's
just a start on collecting the list of all the abbreviations
applicable in all the map languages.  It currently only has basic lists
for Polish, Spanish and English, where the rules don't differ much.
German, for example, is trickier.

I see two ways to use it for rendering:

* inside mapnik stylesheets, perhaps by calling the C function from
the SQL queries.  This would require adding the postgres bindings,
something I've never done.
* inside osm2pgsql so that the abbreviated names are stored in table columns.

As a quick hack I went for the latter option.  It gets tricky with
hstore and multilingual maps but it works as a first attempt.  I have
a patch at http://osm.trail.pl/osm2pgsql/0002-Generate-short-name-columns.patch
that makes the necessary (small) changes to the osm2pgsql code and adds
a snapshot of the library code to it.

It ignores the actual language of a name and just tries to apply all
the possible abbreviations from its list.  This will eventually need a
solution perhaps based on the location of a given map object and a
fixed list of country polygons with their main languages.  Perhaps it
can share a solution with highway shield rendering.
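
To make the country-polygon idea a bit more concrete, here is a minimal
sketch in Python (not the library's C code, and nothing like this exists
in the patch yet).  The polygons are hypothetical rough boxes rather than
real borders, and guess_language is a made-up helper name; a real setup
would load proper boundary data.

```python
# Hypothetical, heavily simplified "country polygons" as (lon, lat) rings.
# These are rough boxes for illustration, not real borders.
COUNTRY_POLYGONS = [
    ("pl", [(14.1, 49.0), (24.2, 49.0), (24.2, 54.9), (14.1, 54.9)]),
    ("es", [(-9.4, 36.0), (3.4, 36.0), (3.4, 43.8), (-9.4, 43.8)]),
]


def point_in_polygon(lon, lat, polygon):
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def guess_language(lon, lat, default=None):
    """Return the main language of the first polygon containing the point."""
    for lang, polygon in COUNTRY_POLYGONS:
        if point_in_polygon(lon, lat, polygon):
            return lang
    return default
```

The abbreviation step would then only apply the word lists for the
language returned here, and fall back to leaving the name alone when no
polygon matches.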

One of the stylesheets at osm.trail.pl currently uses this code (only
Europe imported at this time, but I CC'd talk-us because there was a
long thread about this at one time).  You can see that at z16 here
"Norbroke Street" is spelt in full:
http://a.osm.trail.pl/osmapa.pl/16/32723/21790.png

and z15 shows Norbroke St when there's not enough space.  (That
stylesheet would also merge the two segments of Norbroke St and only
show the name once, had that street not had a gap there.)
http://b.osm.trail.pl/osmapa.pl/15/16361/10895.png

How to apply

In addition to the osm2pgsql patch you also need to add the
auto-generated tags short_name and shortest_name to your osm2pgsql
.style file so that they end up in the mapnik db.  Then in the mapnik
2 stylesheet where you use a <TextSymbolizer
foo=bar>[name]</TextSymbolizer>, you need to change it to:

<TextSymbolizer foo=bar
placement-type="list">[name]<Placement>[short_name]</Placement><Placement>[shortest_name]</Placement></TextSymbolizer>

If an object already has a short_name or shortest_name tag in the OSM
data, that tag overrides the autogenerated version.  The difference
between the two columns is the degree to which they try to shorten the
name.  For instance, for name=West Fulton Street the new tags will be:

short_name=W Fulton St
shortest_name=Fulton
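
A rough sketch of the two levels and of the override rule, in Python for
brevity.  This is not the library's actual algorithm; the word lists and
the function names (auto_short_names, short_name_columns) are
hypothetical and only cover this example.

```python
# Hypothetical English word lists, for illustration only.
DIRECTIONS = {"North": "N", "South": "S", "East": "E", "West": "W"}
STREET_TYPES = {"Street": "St", "Avenue": "Ave", "Road": "Rd"}


def auto_short_names(name):
    """Return (short_name, shortest_name) for a feature name."""
    words = name.split()
    last = len(words) - 1
    short, shortest = [], []
    for i, word in enumerate(words):
        if i == 0 and last > 0 and word in DIRECTIONS:
            # Leading direction: abbreviated in short, dropped in shortest.
            short.append(DIRECTIONS[word])
        elif i == last and last > 0 and word in STREET_TYPES:
            # Trailing street type: abbreviated in short, dropped in shortest.
            short.append(STREET_TYPES[word])
        else:
            short.append(word)
            shortest.append(word)
    # Never return an empty label; fall back to the full name.
    return " ".join(short), " ".join(shortest) or name


def short_name_columns(tags):
    """Tags present in the OSM data win over the autogenerated values."""
    auto_short, auto_shortest = auto_short_names(tags["name"])
    return {
        "short_name": tags.get("short_name", auto_short),
        "shortest_name": tags.get("shortest_name", auto_shortest),
    }
```

With this, short_name_columns({"name": "West Fulton Street"}) gives the
two values above, and a short_name tag set by a mapper is passed through
unchanged.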

For Polish, if a street is named after a person, the person's
first/second names will first be shortened to their initials and then,
if necessary, omitted, as is normally seen in cartography.  For Spanish
all articles and prepositions are also stripped.  Unfortunately, if the
same rules are applied to US city names that often come from Spanish
(Los Angeles), the same thing will happen (resulting in "Angeles",
which makes no sense).  I also notice that different Spanish
abbreviations are used in Spain (Avenida -> Avda.) than in Latin
America (Avenida -> Av.), so it's something to keep in mind if running
a global tile server or other service.
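
One way to picture the fix is to key the rules by language and region
instead of applying every list to every name.  The sketch below is in
Python and is only an assumption about how that could look; the table
contents, the region codes ("ES", "LATAM", "US") and the abbreviate
function are hypothetical, not what the library does today.

```python
# Hypothetical rule tables keyed by (language, region); not the library's data.
ABBREVIATIONS = {
    ("es", "ES"): {"Avenida": "Avda."},
    ("es", "LATAM"): {"Avenida": "Av."},
    ("en", "US"): {"Street": "St", "Avenue": "Ave"},
}

# Words dropped entirely, per language (Spanish articles and prepositions).
STRIPPED_WORDS = {
    "es": {"el", "la", "los", "las", "de", "del"},
}


def abbreviate(name, lang, region):
    """Apply only the rules that belong to the given language and region."""
    table = ABBREVIATIONS.get((lang, region), {})
    stripped = STRIPPED_WORDS.get(lang, set())
    out = []
    for word in name.split():
        if word.lower() in stripped:
            continue  # article or preposition in this language: drop it
        out.append(table.get(word, word))
    # Never return an empty label; fall back to the full name.
    return " ".join(out) or name
```

Here "Avenida de la Paz" becomes "Avda. Paz" for Spain and "Av. Paz" for
Latin America, while "Los Angeles" is left alone as long as it is
processed with the English rules.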

Suggestions and lists of missing phrases are welcome, perhaps through
github (but I'm away from my main machine this week).  If this is
something that users want to see in map tiles then that list will be
needed at some point even though the code doesn't yet handle all the
nuances correctly.

Cheers


