[Tile-serving] [openstreetmap/osm2pgsql] Confusing middle pgsql processing. (Issue #2049)
Marcos Dione
notifications at github.com
Tue Aug 22 13:54:59 UTC 2023
<!-- Please don't use screenshots. Copy and paste the *text* output here if that's needed for context. -->
## What version of osm2pgsql are you using?
`osm2pgsql version 1.8.0`
## What operating system and PostgreSQL/PostGIS version are you using?
Linux Debian Stable, PG Database version: `15.3 (Debian 15.3-0+deb12u1)`, PostGIS version: `3.3`
## Tell us something about your system
Small server, 8GiB RAM, 1TB consumer SATA SSD.
## What did you do exactly?
`osm2pgsql --verbose --database gis --cache 0 --number-processes 4 --slim --flat-nodes $(pwd)/nodes.cache --hstore --multi-geometry --style $osm_carto/openstreetmap-carto.style --tag-transform-script $osm_carto/openstreetmap-carto.lua europe-latest.osm.pbf`
## What did you expect to happen?
The logs about the middle processing are a little bit confusing:
```
2023-08-19 20:49:52 [1] Clustering table 'planet_osm_polygon' by geometry...
2023-08-19 20:49:52 [2] Clustering table 'planet_osm_line' by geometry...
2023-08-19 20:49:52 [3] Clustering table 'planet_osm_roads' by geometry...
2023-08-19 20:49:52 [4] Clustering table 'planet_osm_point' by geometry...
2023-08-19 20:49:52 [1] Using native order for clustering table 'planet_osm_polygon'
2023-08-19 20:49:52 [2] Using native order for clustering table 'planet_osm_line'
2023-08-19 20:49:52 [3] Using native order for clustering table 'planet_osm_roads'
2023-08-19 20:49:52 [4] Using native order for clustering table 'planet_osm_point'
2023-08-19 22:35:50 [3] Creating geometry index on table 'planet_osm_roads'...
2023-08-19 22:50:47 [3] Creating osm_id index on table 'planet_osm_roads'...
2023-08-19 22:55:52 [3] Analyzing table 'planet_osm_roads'...
2023-08-19 22:57:47 [3] Done task in 7674389ms.
2023-08-19 22:57:47 [3] Starting task...
2023-08-19 22:57:47 [3] Done task in 1ms.
2023-08-19 22:57:47 [3] Starting task...
2023-08-19 22:57:47 [0] Done postprocessing on table 'planet_osm_nodes' in 0s
2023-08-19 22:57:47 [3] Building index on table 'planet_osm_ways'
2023-08-19 23:32:06 [4] Creating geometry index on table 'planet_osm_point'...
2023-08-20 00:13:30 [4] Creating osm_id index on table 'planet_osm_point'...
2023-08-20 00:20:35 [4] Analyzing table 'planet_osm_point'...
2023-08-20 00:20:40 [4] Done task in 12647156ms.
2023-08-20 00:20:40 [4] Starting task...
2023-08-20 00:20:40 [4] Building index on table 'planet_osm_rels'
2023-08-20 02:03:11 [4] Done task in 6151838ms.
2023-08-20 03:17:24 [2] Creating geometry index on table 'planet_osm_line'...
2023-08-20 03:54:40 [2] Creating osm_id index on table 'planet_osm_line'...
2023-08-20 04:02:57 [2] Analyzing table 'planet_osm_line'...
2023-08-20 04:03:01 [2] Done task in 25988218ms.
2023-08-20 05:26:21 [1] Creating geometry index on table 'planet_osm_polygon'...
2023-08-20 06:17:31 [1] Creating osm_id index on table 'planet_osm_polygon'...
2023-08-20 06:30:46 [1] Analyzing table 'planet_osm_polygon'...
2023-08-20 06:30:47 [1] Done task in 34854542ms.
2023-08-20 10:48:18 [3] Done task in 42630605ms.
2023-08-20 10:48:18 [0] Done postprocessing on table 'planet_osm_ways' in 42630s (11h 50m 30s)
2023-08-20 10:48:18 [0] Done postprocessing on table 'planet_osm_rels' in 6151s (1h 42m 31s)
```
I separated it in sections by worker. The messages about `Starting task` and `Done task` don't mention which task it is. Also, do I understand correctly that not all workers started working at the same time? If I'm right, why is that? If I Isolate worker 3's logs:
```
2023-08-19 20:49:52 [3] Clustering table 'planet_osm_roads' by geometry...
2023-08-19 20:49:52 [3] Using native order for clustering table 'planet_osm_roads'
2023-08-19 22:35:50 [3] Creating geometry index on table 'planet_osm_roads'...
2023-08-19 22:50:47 [3] Creating osm_id index on table 'planet_osm_roads'...
2023-08-19 22:55:52 [3] Analyzing table 'planet_osm_roads'...
2023-08-19 22:57:47 [3] Done task in 7674389ms.
2023-08-19 22:57:47 [3] Starting task...
2023-08-19 22:57:47 [3] Done task in 1ms.
2023-08-19 22:57:47 [3] Starting task...
2023-08-19 22:57:47 [3] Building index on table 'planet_osm_ways'
2023-08-20 10:48:18 [3] Done task in 42630605ms.
2023-08-20 10:48:18 [0] Done postprocessing on table 'planet_osm_ways' in 42630s (11h 50m 30s)
```
At 20:49 starts processing `roads`. ~2h later it's creating 2 indexes and analyzing it. Launches 2 tasks, but finishes one first (22:57), a long one? Which one? And then another, but one seems to still be running.
And suddenly it's creating indexes on `ways`. This seems to finish the next day and it's properly logged.
I also see the messages about 'Done postprocessing' for `nodes` (22:57), `ways` and `rels` (next day, 10:48), but not for `roads`. I __do__ get all 4 `All postprocessing' messages.
Could tasks be more verbose about what exactly they're doing? I didn't try `--log-level=debug` because this is a 49h+ process and I really don't want to start it all over :)
--
Reply to this email directly or view it on GitHub:
https://github.com/openstreetmap/osm2pgsql/issues/2049
You are receiving this because you are subscribed to this thread.
Message ID: <openstreetmap/osm2pgsql/issues/2049 at github.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/tile-serving/attachments/20230822/4a674e51/attachment.htm>
More information about the Tile-serving
mailing list