[OSM-dev] osm2pgsql tile expiry freaks me out

Frederik Ramm frederik at remote.org
Thu Nov 5 01:23:41 GMT 2009


Hi,

Jon Burgess wrote:
> I never quite managed to get around to using the osm2pgsql based expiry
> code on the main tile server. I still use the ruby scripts written by
> Matt

I might have to resort to them as well.

> I have discussed the expiry scripts with Matt a couple of times and he
> found that the osm2pgsql based approach tended to hit the DB quite hard.

I haven't measured this thoroughly, but on the machine where I do 
updates every 15 minutes, the updates normally took around 150 seconds, 
and with expiry switched on it's more like 250 seconds or so.

> He found that even though the osm2pgsql code should in theory produce
> more accurate results, the ruby scripts tended to work better overall. 

Matt's scripts don't handle relations, and this is probably the reason 
why they work well. My experiments have shown the following (for changes 
covering a three-hour interval):

...

      68440 | psql_out_relation (boundary) for 53134 expires 68440 tiles
      68440 | psql_out_relation (boundary) for 53136 expires 68440 tiles
      68440 | psql_out_relation for 53134 expires 68440 tiles
      68440 | psql_out_relation for 53136 expires 68440 tiles
      72248 | psql_out_relation for 45757 expires 72248 tiles
      81951 | psql_out_relation for 44882 expires 81951 tiles
      95256 | psql_out_relation for 276835 expires 95256 tiles
      95256 | psql_out_way (poly) for 23947173 expires 95256 tiles
      96010 | psql_out_relation for 7400 expires 96010 tiles
      96160 | psql_out_relation for 8648 expires 96160 tiles
     106239 | psql_out_relation for 44879 expires 106239 tiles
     132068 | psql_out_relation for 310887 expires 132068 tiles
     132068 | psql_out_relation for 310887 expires 132068 tiles
     161242 | psql_out_relation for 52411 expires 161242 tiles
     221445 | psql_out_relation for 62440 expires 221445 tiles
     228092 | psql_out_relation for 62417 expires 228092 tiles
     269451 | psql_out_relation (boundary) for 47667 expires 269451 tiles
     269451 | psql_out_relation (boundary) for 47667 expires 269451 tiles
     417795 | psql_out_relation (boundary) for 53134 expires 417795 tiles
     432478 | psql_out_way (poly) for 35421140 expires 432478 tiles
     830396 | psql_out_relation (boundary) for 47654 expires 830396 tiles
     830396 | psql_out_relation (boundary) for 47654 expires 830396 tiles
     881066 | psql_out_relation (boundary) for 44882 expires 881066 tiles
     998215 | psql_out_relation (boundary) for 45756 expires 998215 tiles
    1305480 | psql_out_relation for 73347 expires 1305480 tiles
    1680708 | psql_out_relation for 73340 expires 1680708 tiles
    2019020 | psql_out_relation (boundary) for 7400 expires 2019020 tiles
    2287500 | psql_out_relation (boundary) for 45757 expires 2287500 tiles
    2510272 | psql_out_relation (boundary) for 8648 expires 2510272 tiles
    4353265 | psql_out_relation (boundary) for 44879 expires 4353265 tiles
    6872596 | psql_out_relation (boundary) for 52411 expires 6872596 tiles


So there are relations, especially boundary relations, where a small 
change to the relation expires several million level-18 tiles. (The 
largest way, #35421140, a riverbank, expires nearly half a million.)

I suspect that at least the large results for relations are due to an 
inefficiency; probably the whole circumference of the relation is marked 
dirty if a little bit changes here or there, something which would not 
be necessary. (In theory, of course, a rendering rule depending on the 
polygon area could flick over and render a whole country pink instead of 
gray just because its area has changed minimally...)
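
To illustrate why marking a whole circumference dirty gets expensive, 
here is a minimal Python sketch that collects the zoom-18 tiles along a 
ring. This is not osm2pgsql's expiry code. The function names are made 
up for the example, the tile formula is the standard slippy-map one, and 
each segment is only sampled, so a tile that is merely clipped at a 
corner can be missed.

```python
import math

def lonlat_to_tile_frac(lon, lat, zoom):
    """Fractional slippy-map tile coordinates for a lon/lat in degrees."""
    n = 2 ** zoom
    x = (lon + 180.0) / 360.0 * n
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n
    return x, y

def tiles_along_segment(p0, p1, zoom, step=0.25):
    """Approximate the tiles touched by one segment by sampling it
    every `step` tile widths (hypothetical helper, sampling only)."""
    x0, y0 = lonlat_to_tile_frac(*p0, zoom)
    x1, y1 = lonlat_to_tile_frac(*p1, zoom)
    steps = max(1, math.ceil(max(abs(x1 - x0), abs(y1 - y0)) / step))
    tiles = set()
    for i in range(steps + 1):
        t = i / steps
        tiles.add((int(x0 + t * (x1 - x0)), int(y0 + t * (y1 - y0))))
    return tiles

def tiles_along_ring(points, zoom):
    """Union of the tiles along every segment of a list of (lon, lat)."""
    tiles = set()
    for a, b in zip(points, points[1:]):
        tiles |= tiles_along_segment(a, b, zoom)
    return tiles
```

The size of the returned set grows with the length of the outline 
measured in tile widths, which is why a long boundary reaches very large 
counts at zoom 18 even though only a small part of it changed.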

Also, if the geometry of a way changes (and not its tags), then I could 
probably compare the new geometry to the old one and expire only where 
they differ - at least if expiring the whole length of the way means 
half a million tiles or so.
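
The idea of comparing old and new geometry could look roughly like the 
following Python sketch. It is an assumption about how one might do it, 
not existing osm2pgsql behaviour: segments that appear in only one of 
the two versions count as changed, and only the tiles along those 
segments are expired. All names are hypothetical, and the tile lookup 
samples each segment rather than tracing it exactly.

```python
import math

def tile_xy(lon, lat, zoom):
    """Fractional slippy-map tile coordinates for a lon/lat in degrees."""
    n = 2 ** zoom
    x = (lon + 180.0) / 360.0 * n
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n
    return x, y

def segment_tiles(a, b, zoom, step=0.25):
    """Tiles touched by segment a-b, approximated by sampling."""
    x0, y0 = tile_xy(*a, zoom)
    x1, y1 = tile_xy(*b, zoom)
    steps = max(1, math.ceil(max(abs(x1 - x0), abs(y1 - y0)) / step))
    return {(int(x0 + (x1 - x0) * i / steps), int(y0 + (y1 - y0) * i / steps))
            for i in range(steps + 1)}

def segments(points):
    """Directed segments of a way given as a list of (lon, lat) tuples."""
    return set(zip(points, points[1:]))

def tiles_for_geometry_change(old_points, new_points, zoom):
    """Expire only along segments present in exactly one version."""
    changed = segments(old_points) ^ segments(new_points)
    tiles = set()
    for a, b in changed:
        tiles |= segment_tiles(a, b, zoom)
    return tiles
```

For example, moving only the last node of a way leaves the tiles along 
its untouched first segments out of the result:

```python
old = [(0.0, 0.0), (0.001, 0.0), (0.002, 0.0), (0.003, 0.0)]
new = [(0.0, 0.0), (0.001, 0.0), (0.002, 0.0), (0.003, 0.001)]
dirty = tiles_for_geometry_change(old, new, 18)
```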

But as for tagging changes, we're quickly getting into terrain where 
expiry and render rules intermingle. If someone changes the "source" tag 
on a very large polygon way, do I really need to expire half a million 
tiles? But what if the same way's landuse tag is changed? The high 
number of expired tiles we see at the moment is probably a bug or an 
inefficiency, but even with perfectly functioning software it is of 
course possible that e.g. a large boundary gets a new admin_level, and 
then expiring a very large number of tiles is actually required...
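
One way to picture the tag question is a filter that expires tiles only 
when a changed key is one the render rules actually look at. The Python 
sketch below assumes such a list of keys exists. The RENDERED_KEYS set 
is invented for illustration, and a real list would have to be derived 
from the stylesheet in use.

```python
# Hypothetical list of keys the render rules depend on (an assumption,
# not taken from any real stylesheet).
RENDERED_KEYS = frozenset({"landuse", "admin_level", "boundary", "name"})

def changed_keys(old_tags, new_tags):
    """Keys that were added, removed, or given a different value."""
    keys = set(old_tags) | set(new_tags)
    return {k for k in keys if old_tags.get(k) != new_tags.get(k)}

def tag_change_needs_expiry(old_tags, new_tags, rendered_keys=RENDERED_KEYS):
    """True if at least one changed key can affect rendering."""
    return bool(changed_keys(old_tags, new_tags) & rendered_keys)
```

With this filter a "source" edit on a large polygon expires nothing, 
while a landuse change or a new admin_level still expires every tile the 
object touches.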

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"



