[OSM-dev] intermediate spacial grid for postgis joins
sav123
sav123 at h-f.net
Fri Mar 30 19:41:09 UTC 2018
Hello,
It is a cartoo-css query. I don't try to understand it.
Analyzing the join, I deduced that it was difficult to optimize the
'ST_SetSRID' with the '&&'. It depends from the value of the proximity
argument , the units and the size of the required box.
Note that 1) the base tables already have geometry indexes, 2) the query is
trying to rely nodes to ways, something very local since a New York amenity
never intersects a Paris street ( unless if poetry matter ).
Adding the join with the 'local' grid consists only of splitting the surface
in small squares to apply the join inside the squares. It's allmost
equivalent to a set partition. The trick result is so remarkable ( even if
1/15 is far from the theoretical 1/256 ) that I wanted to share it ...
:)
Igael
-----Message d'origine-----
From: Darafei "Komяpa" Praliaskouski
Sent: Friday, March 30, 2018 1:52 PM
To: sav123
Cc: dev at openstreetmap.org
Subject: Re: [OSM-dev] intermediate spacial grid for postgis joins
Hi,
can you please share your query plans?
also, `distinct on` is usually a sign of not well thought logic, a lateral
join with a subquery with limit 1 can be much more performant than producing
extra rows in possibly wrong join order and then getting rid of them.
пт, 30 мар. 2018 г. в 14:46, sav123 <sav123 at h-f.net>:
Hello All,
context : postgres database, cartoocss
I was trying to build the exact table for the cartoo css layer
"turning_circle_casing".
It is a join between the tables line and point at high zoom.
sql from cartoo css authors embedded in a create table by a script :
CREATE TABLE IF NOT EXISTS turning_circle_casing_raw AS select * from
(SELECT DISTINCT ON (p.way)
p.way AS way, l.highway AS int_tc_type,
CASE WHEN l.service IN ('parking_aisle', 'drive-through', 'driveway')
THEN 'INT-minor'::text
ELSE 'INT-normal'::text
END AS int_tc_service,v.prio
FROM planet_osm_point p
JOIN planet_osm_line l ON ST_DWithin(p.way, l.way, 0.1)
JOIN (VALUES
('tertiary', 1),
('unclassified', 2),
('residential', 3),
('living_street', 4),
('service', 5)
) AS v (highway, prio)
ON v.highway=l.highway
WHERE p.highway = 'turning_circle'
OR p.highway = 'turning_loop'
) as creator WHERE ST_IsValid(way) ORDER BY
ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C";
1,597,250 selected rows, 28800 seconds to build , 50 seconds to index on
geometry col.
Then I produced an intermediate table to divide by 60 one of the tables :
CREATE TABLE IF NOT EXISTS planet_osm_point_casing AS ( select way from
planet_osm_point where highway in ('turning_circle','turning_loop') ) ;
CREATE INDEX IF NOT EXISTS planet_osm_point_casing_i ON
planet_osm_point_casing USING GIST (way) ;
ANALYZE planet_osm_point_casing;
3 or 4 minutes to get the new table with 1,631,000 record instead of
the original 103,755,000.
The request execution time was reduced to 25000 seconds. Still too much.
Then I built a regular grid table 256 x 256 from -20037508 to 20037508 on x
and on y
insert into local_join_grid values
('BOX3D(xmin ymin,xmax ymax)'::box3d), etc ...
looping on xmin ymin,xmax ymax to get 65536 records.
and I modified the above request like that :
CREATE TABLE IF NOT EXISTS turning_circle_casing_raw AS select * from
(
SELECT DISTINCT ON (p.way)
p.way AS way, l.highway AS int_tc_type,
CASE WHEN l.service IN ('parking_aisle', 'drive-through', 'driveway')
THEN 'INT-minor'::text
ELSE 'INT-normal'::text
END AS int_tc_service,v.prio
FROM local_join_grid b
JOIN planet_osm_point_casing p on p.way && ST_SetSRID(b.bbox, 3857)
JOIN planet_osm_line l ON l.way && ST_SetSRID(b.bbox, 3857) and
l.highway in ('tertiary',
'unclassified','residential','living_street','service') and
ST_DWithin(p.way, l.way, 0.1)
JOIN (VALUES
('tertiary', 1),
('unclassified', 2),
('residential', 3),
('living_street', 4),
('service', 5)
) AS v (highway, prio)
ON v.highway=l.highway
) as creator WHERE ST_IsValid(way) ORDER BY
ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C";
and I got the result in 1802 s, index in 43 s. It is 15 times faster.
I suppose that the optimal grid size depends of the hardware and the
postgresql configuration.
The comparison between the 2 results shows some rows missing. It is because
the grid which is not correct. It cuts streets.
I know 1 or 2 heavy heurisitics to build a good grid if it doesn't exist.
I could also start with existent boundaries.
Does the grid exist ? some ideas ?
:)
Igael
_______________________________________________
dev mailing list
dev at openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev
More information about the dev
mailing list