[openstreetmap/openstreetmap-website] Backfill enhanced changeset stats (PR #6347)
Tom Hughes
notifications at github.com
Tue Aug 26 23:24:54 UTC 2025
tomhughes left a comment (openstreetmap/openstreetmap-website#6347)
It would have been better if I'd tested that the results were correct though, which they aren't :-(
You can get rid of the nested grouping but you do still need the union, like this:
```sql
WITH changes AS (
SELECT
nodes.changeset_id,
CASE WHEN nodes.version = 1 THEN 1 ELSE 0 END AS num_created_nodes,
CASE WHEN nodes.version > 1 AND nodes.visible THEN 1 ELSE 0 END AS num_modified_nodes,
CASE WHEN nodes.version > 1 AND NOT nodes.visible THEN 1 ELSE 0 END AS num_deleted_nodes,
0 AS num_created_ways,
0 AS num_modified_ways,
0 AS num_deleted_ways,
0 AS num_created_relations,
0 AS num_modified_relations,
0 AS num_deleted_relations
FROM nodes
WHERE nodes.changeset_id = ANY($1::bigint[])
UNION ALL
SELECT
ways.changeset_id,
0 AS num_created_nodes,
0 AS num_modified_nodes,
0 AS num_deleted_nodes,
CASE WHEN ways.version = 1 THEN 1 ELSE 0 END AS num_created_ways,
CASE WHEN ways.version > 1 AND ways.visible THEN 1 ELSE 0 END AS num_modified_ways,
CASE WHEN ways.version > 1 AND NOT ways.visible THEN 1 ELSE 0 END AS num_deleted_ways,
0 AS num_created_relations,
0 AS num_modified_relations,
0 AS num_deleted_relations
FROM ways
WHERE ways.changeset_id = ANY($1::bigint[])
UNION ALL
SELECT
relations.changeset_id,
0 AS num_created_nodes,
0 AS num_modified_nodes,
0 AS num_deleted_nodes,
0 AS num_created_ways,
0 AS num_modified_ways,
0 AS num_deleted_ways,
CASE WHEN relations.version = 1 THEN 1 ELSE 0 END AS num_created_relations,
CASE WHEN relations.version > 1 AND relations.visible THEN 1 ELSE 0 END AS num_modified_relations,
CASE WHEN relations.version > 1 AND NOT relations.visible THEN 1 ELSE 0 END AS num_deleted_relations
FROM relations
WHERE relations.changeset_id = ANY($1::bigint[])
),
total AS (
SELECT
changes.changeset_id,
SUM(changes.num_created_nodes) AS num_created_nodes,
SUM(changes.num_modified_nodes) AS num_modified_nodes,
SUM(changes.num_deleted_nodes) AS num_deleted_nodes,
SUM(changes.num_created_ways) AS num_created_ways,
SUM(changes.num_modified_ways) AS num_modified_ways,
SUM(changes.num_deleted_ways) AS num_deleted_ways,
SUM(changes.num_created_relations) AS num_created_relations,
SUM(changes.num_modified_relations) AS num_modified_relations,
SUM(changes.num_deleted_relations) AS num_deleted_relations
FROM changes
GROUP BY changes.changeset_id
)
UPDATE changesets
SET num_created_nodes = total.num_created_nodes,
num_modified_nodes = total.num_modified_nodes,
num_deleted_nodes = total.num_deleted_nodes,
num_created_ways = total.num_created_ways,
num_modified_ways = total.num_modified_ways,
num_deleted_ways = total.num_deleted_ways,
num_created_relations = total.num_created_relations,
num_modified_relations = total.num_modified_relations,
num_deleted_relations = total.num_deleted_relations
FROM total
WHERE changesets.id = total.changeset_id
```
It turns out though that there doesn't seem to be much point as a naive version using subqueries is just as fast:
```sql
UPDATE changesets
SET num_created_nodes = (SELECT COUNT(*) FROM nodes WHERE changeset_id = changesets.id AND version = 1),
num_modified_nodes = (SELECT COUNT(*) FROM nodes WHERE changeset_id = changesets.id AND version > 1 AND visible),
num_deleted_nodes = (SELECT COUNT(*) FROM nodes WHERE changeset_id = changesets.id AND version > 1 AND NOT visible),
num_created_ways = (SELECT COUNT(*) FROM ways WHERE changeset_id = changesets.id AND version = 1),
num_modified_ways = (SELECT COUNT(*) FROM ways WHERE changeset_id = changesets.id AND version > 1 AND visible),
num_deleted_ways = (SELECT COUNT(*) FROM ways WHERE changeset_id = changesets.id AND version > 1 AND NOT visible),
num_created_relations = (SELECT COUNT(*) FROM relations WHERE changeset_id = changesets.id AND version = 1),
num_modified_relations = (SELECT COUNT(*) FROM relations WHERE changeset_id = changesets.id AND version > 1 AND visible),
num_deleted_relations = (SELECT COUNT(*) FROM relations WHERE changeset_id = changesets.id AND version > 1 AND NOT visible)
WHERE id = ANY($1::bigint[])
```
Both take about 5s to do the first 10000 changesets when the cache is hot for the indexes.
--
Reply to this email directly or view it on GitHub:
https://github.com/openstreetmap/openstreetmap-website/pull/6347#issuecomment-3226054386
You are receiving this because you are subscribed to this thread.
Message ID: <openstreetmap/openstreetmap-website/pull/6347/c3226054386 at github.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/rails-dev/attachments/20250826/a63d834a/attachment-0001.htm>
More information about the rails-dev
mailing list