[Tile-serving] [osm2pgsql-dev/osm2pgsql] parallelize the COPY phase (Discussion #2426)
Jochen Topf
notifications at github.com
Wed Oct 29 07:46:59 UTC 2025
In the usual configuration there are two threads doing COPYs: one for the "middle" tables (in slim mode only) and one for the output tables. Data is collected in chunks and then sent via a queue to those threads for the actual COPY operation. We could use a thread pool instead of those two threads for the actual COPY, but we never thought that this would improve the situation much. In the end the bottleneck is probably the I/O, isn't it? And doing more of this in parallel means more contention on the WAL and, if we are writing to the same table in multiple COPYs at once, more contention on that table. So it is unclear to me why having more parallelism would help significantly. Doing anything with multithreading in C++ code is always a pain, so keeping this code as simple as possible is also important.
But maybe we are wrong there and didn't take some issue into account. And if somebody wanted to try this, that would be great; we'd get actual data.
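
For anyone who wants to experiment, here is a rough sketch of the "chunks go through a queue to N COPY threads" idea. It is not the actual osm2pgsql code: the names copy_chunk, chunk_queue and run_copy_workers are made up for illustration, and the COPY itself is only simulated by counting bytes, so nothing here talks to a database.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical chunk of COPY data destined for one table.
struct copy_chunk {
    std::string table;
    std::string data;
};

// Minimal blocking queue. close() tells consumers no more chunks will come.
class chunk_queue {
public:
    void push(copy_chunk chunk) {
        {
            std::lock_guard<std::mutex> lock{m_mutex};
            m_chunks.push(std::move(chunk));
        }
        m_cond.notify_one();
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock{m_mutex};
            m_closed = true;
        }
        m_cond.notify_all();
    }

    // Blocks until a chunk is available. Returns false once the queue is
    // closed and fully drained.
    bool pop(copy_chunk *chunk) {
        std::unique_lock<std::mutex> lock{m_mutex};
        m_cond.wait(lock, [this] { return m_closed || !m_chunks.empty(); });
        if (m_chunks.empty()) {
            return false;
        }
        *chunk = std::move(m_chunks.front());
        m_chunks.pop();
        return true;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::queue<copy_chunk> m_chunks;
    bool m_closed = false;
};

// Feeds all chunks through one queue to num_workers threads and returns the
// number of bytes "copied" (assumption: the COPY is simulated).
std::size_t run_copy_workers(std::vector<copy_chunk> chunks,
                             unsigned num_workers) {
    assert(num_workers > 0);
    chunk_queue queue;
    std::mutex total_mutex;
    std::size_t total_bytes = 0;

    std::vector<std::thread> workers;
    for (unsigned i = 0; i < num_workers; ++i) {
        workers.emplace_back([&] {
            copy_chunk chunk;
            while (queue.pop(&chunk)) {
                // A real worker would hold its own database connection and
                // send the chunk with COPY here. We only count the bytes.
                std::lock_guard<std::mutex> lock{total_mutex};
                total_bytes += chunk.data.size();
            }
        });
    }

    for (auto &chunk : chunks) {
        queue.push(std::move(chunk));
    }
    queue.close();

    for (auto &worker : workers) {
        worker.join();
    }
    return total_bytes;
}
```

Calling run_copy_workers(chunks, 2) roughly corresponds to today's two-thread setup (except that here both threads share one queue), and raising the second argument gives the thread-pool variant. Measuring whether that helps would need real connections and real COPY traffic, since the WAL and table contention mentioned above do not show up in a simulation like this.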
--
Reply to this email directly or view it on GitHub:
https://github.com/osm2pgsql-dev/osm2pgsql/discussions/2426#discussioncomment-14812699
Message ID: <osm2pgsql-dev/osm2pgsql/repo-discussions/2426/comments/14812699 at github.com>