[Tile-serving] [osm2pgsql-dev/osm2pgsql] Linking between nodes and ways (Discussion #2455)
Jochen Topf
notifications at github.com
Sat Apr 4 14:46:00 UTC 2026
I did some experiments over the last month with different kinds of queries joining "nodes" with "ways" in various way in the database. Either through ids using a "nodes" array, special join tables connecting node_ids with way_ids or using the spatial indexes. This in various combinations of using the middle tables or creating special tables for this use case and so on. All the options were quite slow. Interestingly enough, the spatial queries worked somewhat better than going through ids.
So I believe we currently have no good way of handling this use case, at least for databases that are updated often. Running joins like this on a database that needs half a day to import and then lives for a long time isn't a huge problem, because it doesn't add that much extra processing time, but if we want minutely updates, the current situation is clearly not great.
So I thought about what kind of functionality we could add to osm2pgsql to help with this use case. Here is the idea:
* Add an optional in-memory "id cache" to node tables. This keeps track of all nodes added to a table in the `process_node()` function.
* In the `process_way()` function you can check this cache quickly and do extra processing based on this information.
* In append mode the cache is initialized after nodes are imported with the content of the database table.
Here is an example to make this clearer:
* You have all node barriers like gates etc. stored in a "barriers" table with the id cache.
* When processing ways that have a "highway" tag, you check all member nodes of the highway whether they are in the cache, if yes, it means you have a barrier on the road and you add them to a special "barriers_in_highways" table which includes at least the node_id and the way_id, but can include more information such as the geometry of the node and/or way, highway type, etc. You can not add the "barrier" type at this point because you don't have that information.
* After osm2pgsql runs you can use osm2pgsql-gen (or just run an sql script) to add extra information to the "barriers_in_highways" table. For instance to add the barrier type you just need a field that's NULL and that you fill with the corresponding information from the "barriers" table. Once it is set, it isn't NULL any more, so you don't have to do this over and over.
In this example it can happen that a barrier node is in more than one highway way. At render time you can choose, for instance, only to render the barrier on the highest road class or something like that. That kind of processing is cheap, because there is no join involved any more and you only have very few entries in your bounding box in the barriers_on_highways table.
I have some proof-of-concept code that makes all of this happen and it looks reasonable to me so far. This solution has the advantage that it doesn't add much memory overhead (as long as the table cached isn't too large) and is reasonably easy to understand (much easier than two-stage processing) and works okay with updates. But it will not work with really large lists of nodes, I wouldn't want to do this for all addresses or so. We'll have to see in practice how far we can stretch this.
Does this sound like a reasonable approach? Can you see any problems with this? Does it address the use cases you have?
--
Reply to this email directly or view it on GitHub:
https://github.com/osm2pgsql-dev/osm2pgsql/discussions/2455#discussioncomment-16447971
You are receiving this because you are subscribed to this thread.
Message ID: <osm2pgsql-dev/osm2pgsql/repo-discussions/2455/comments/16447971 at github.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/tile-serving/attachments/20260404/727ae89d/attachment.htm>
More information about the Tile-serving
mailing list