kakrueger at gmail.com
Thu May 9 21:02:00 UTC 2013
In response to the previous thread, I have cleaned up the initial proof of
concept patch, expanded it somewhat, and have now committed it to trunk.
As mentioned before, this allows you to filter and transform (sets of) tags on
nodes, ways and relations before continuing processing in osm2pgsql.
Hopefully this can be used for a variety of things, like normalising
tagging, or more sophisticated relation processing.
By default it still uses the old tag processing pipeline, but using the
command line switch --tag-transform-script /path/to/tagprocessing_script.lua
should enable the new more flexible pathway. At the moment it only works in
the rendering output, but it shouldn't be too difficult to extend it to the
gazetteer output if needed / wished.
The lua script you pass into osm2pgsql needs to implement 4 functions.
The first three each take a set of tags as a lua key-value table and return
a transformed (or unchanged) set of tags back. They also return a flag to
say whether the entity (way/node/relation) should be filtered out and not
added to the database (it will still end up in the slim mode tables, but not
in the rendering tables). The filter_tags_way function furthermore returns a
flag indicating whether the way should be treated as a line or as a polygon.
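A minimal sketch of what such functions could look like is given below. Only
filter_tags_way is named above; the names filter_tags_node and
filter_basic_tags_rel, the second argument, the order of the return values
and the 0/1 flag convention are assumptions here, and the real interface may
take or return additional values, so check everything against the sample
script in the repository. The normalisation rules are the ones mentioned in
the quoted message further down.

```lua
-- Sketch only: names other than filter_tags_way, the argument list and the
-- order of return values are assumptions, not taken from this post.

-- Hypothetical list of keys that suggest an area rather than a line.
local polygon_keys = { building = true, landuse = true, natural = true }

-- Normalise a few tagging variants before further processing.
local function normalise(tags)
   local oneway = tags["oneway"]
   if oneway == "1" or oneway == "true" then
      tags["oneway"] = "yes"
   end
   -- Rewrite highway=footway into highway=path + foot=designated.
   if tags["highway"] == "footway" then
      tags["highway"] = "path"
      tags["foot"] = "designated"
   end
   return tags
end

-- First return value: 1 = filter the object out, 0 = keep it.
function filter_tags_node(tags, num_tags)
   if next(tags) == nil then
      return 1, tags              -- untagged node: nothing to render
   end
   return 0, normalise(tags)
end

-- Third return value: 1 = treat as polygon, 0 = treat as line.
function filter_tags_way(tags, num_tags)
   if next(tags) == nil then
      return 1, tags, 0
   end
   tags = normalise(tags)
   local polygon = 0
   for key in pairs(tags) do
      if polygon_keys[key] then polygon = 1 end
   end
   if tags["area"] == "yes" then polygon = 1 end
   return 0, tags, polygon
end

function filter_basic_tags_rel(tags, num_tags)
   -- In this sketch only relations carrying a type tag are kept.
   if tags["type"] == nil then
      return 1, tags
   end
   return 0, tags
end
```

With these definitions, a way tagged highway=footway, oneway=1 would come
back as highway=path, foot=designated, oneway=yes and be treated as a line.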
The function filter_tags_relation_member is a bit more complex and makes it
possible to deal with more advanced relation tagging, such as multi-polygons
that can take their tags from the member ways.
This function therefore takes the set of tags from the relation, as well as
the set of tags and role for each of the member ways (member relations and
nodes get ignored). It then returns a transformed (and combined) set of tags
to be applied to the relation in later processing. It furthermore returns
some additional information. First of all, one can specify for each
member way, if it has already been dealt with, or needs to (potentially)
have its own entry. E.g. outer ways in multi-polygon relations are
superseded by the multi-polygon geometry. Tagged inner ways on the other
hand still need to be processed as separate entries. Secondly, one can
specify if the relation should be processed as a line, a polygon, or both
(e.g. administrative boundaries). Thirdly, one can again specify to discard
the entity from further processing.
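The behaviour described above could be sketched roughly as follows. The
argument list (relation tags, an array of member tag tables, an array of
roles, and the member count), the order of the return values and the 0/1
flag convention are all assumptions made for illustration; the sample script
in the repository is the reference for the real interface.

```lua
-- Sketch only: arguments, return order and 0/1 flags are assumptions.
-- Returns: filter, tags, superseded (one flag per member), line, polygon.
function filter_tags_relation_member(rel_tags, member_tags, member_roles, num_members)
   local superseded = {}
   for i = 1, num_members do superseded[i] = 0 end

   local rel_type = rel_tags["type"]

   if rel_type == "boundary" then
      -- e.g. administrative boundaries: process as both line and polygon
      return 0, rel_tags, superseded, 1, 1
   end

   if rel_type ~= "multipolygon" then
      -- every other relation type is discarded in this sketch
      return 1, rel_tags, superseded, 0, 0
   end

   -- Copy the relation tags, leaving out the structural "type" tag.
   local tags = {}
   for k, v in pairs(rel_tags) do
      if k ~= "type" then tags[k] = v end
   end

   -- Untagged multi-polygon: take the tags from the first tagged outer way.
   if next(tags) == nil then
      for i = 1, num_members do
         local mt = member_tags[i] or {}
         if member_roles[i] == "outer" and next(mt) ~= nil then
            for k, v in pairs(mt) do tags[k] = v end
            break
         end
      end
   end

   -- Outer ways are superseded by the multi-polygon geometry; inner ways
   -- keep flag 0 so that tagged ones still get their own entry.
   for i = 1, num_members do
      if member_roles[i] == "outer" then superseded[i] = 1 end
   end

   if next(tags) == nil then
      return 1, tags, superseded, 0, 0   -- nothing to render
   end
   return 0, tags, superseded, 0, 1
end
```

This sketch marks every outer way as superseded; a real script would
probably want to compare the tags of the way and the relation first.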
There is a sample tag transform lua script in the repository as an example,
which (nearly) replicates current processing and can hopefully be used as a
template for one's own scripts.
Performance-wise, the lua tag transform is slower than the C-based one;
however, it is probably not as bad as I feared. I haven't done an extensive
performance analysis yet though and it will likely heavily depend on the
complexity of the tag transform script.
Everything should hopefully work, but as with any new feature and
particularly as I don't have an immediate use case for it myself, there is a
non-negligible chance that there are bugs in the committed code, so treat it
with caution to begin with. As a fair amount of refactoring was necessary to
support this, there is also a possibility that the default C-based tag
transform got new bugs as well. I don't have the resources to test it on a
full planet import, but on all of my test extracts, the non-lua code
produced the same database as before the commit. So hopefully things are OK.
For production systems, you should stick with the tagged 0.82.0 version of
osm2pgsql. But I would very much appreciate any feedback on the new lua
Kai Krueger wrote
> In response to Richard's suggestion, I hacked up a proof of concept
> implementation of a tag filter in lua last week.
> Currently it allows you to write a filter function (one for nodes, one for
> ways and one for relations) in lua that takes in a set of tags and returns
> a (potentially) transformed set of tags and a boolean flag, if the object
> should be processed further. For ways, it also determines if it should be
> treated as a line or polygon. This should allow for things like
> normalising the data from "oneway=yes/1/true" or rewriting things like
> "highway=footway" into "highway=path, foot=designated", which could
> hopefully simplify this processing out of the mapnik stylesheets.
> The current proof of concept implementation only works for the rendering
> output so far, not the gazetteer output.
> The question now is what to do with this? Is there enough interest for it
> to be worth cleaning it up and committing it? Should it replace the
> current C filters, or be an additional option? How much performance hit is
> acceptable for it to become the default option? Should it not be committed
> in this form and rather worked on a more generic solution to include the
> gazetteer backend?
> Are there other things that would be good to be able to script outside of
> the source code with lua? How does it interact with the current styling of
> the columns in the osm2pgsql schema?
> For anyone interested, they can find the current patch at
> Richard Fairhurst wrote
>> One of the many wondrous things about OSRM is that you handle the speed
>> impact of different tags (e.g. highway=motorway vs highway=unclassified)
>> with plugins written in Lua, a fast but easy-to-understand scripting
>> language.
>> Wouldn't it be great to have the same capability in osm2pgsql?
>> Think: path rendering. Right now, you have to potentially weigh up
>> highway=, access=, bicycle/foot/horse/etc.=, designation, and surface=
>> tags. That's a whole bunch of Mapnik rules (or whatever) - slow to
>> write, slow to run. Remapping the tags on database import with osm2pgsql
>> would fix this.
>> Adding this to osm2pgsql is way beyond my poor brane, I'm afraid, though
>> I'd love to do it. But it would make a great GSoC project:
>> or maybe someone might feel inspired to just code it. ;)
>> (Thanks to lonvia and Gnonthgol in #osm-dev for suggestions leading to