[OSM-talk] Server slowness
Nick Hill
nick at nickhill.co.uk
Mon Jan 15 11:45:42 GMT 2007
I echo Richard's post.
But I would add the rider that the current OSM set-up is by no means optimised.
I estimate that organising the data on disc according to geographic location,
and partitioning the database, could improve node look-ups by an order of
magnitude.
Therefore, a direct comparison should implement partitioning.
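
As a rough illustration of what organising nodes by geographic location could
look like, here is a minimal Python sketch. It assumes a fixed lat/lon grid; the
1-degree cell size and the names cell_key and cluster_by_cell are hypothetical
and are not taken from the OSM code base. Nodes are sorted by grid cell, so
nodes that are close on the ground end up close together in storage order.

```python
import math

CELL_DEG = 1.0  # hypothetical cell size in degrees (an assumption for this sketch)


def cell_key(lat, lon, cell_deg=CELL_DEG):
    """Return an integer key for the grid cell containing (lat, lon)."""
    rows = int(round(180.0 / cell_deg))
    cols = int(round(360.0 / cell_deg))
    row = int(math.floor((lat + 90.0) / cell_deg))
    col = int(math.floor((lon + 180.0) / cell_deg))
    # Clamp the edge cases lat == 90 and lon == 180 into the last row/column.
    row = min(row, rows - 1)
    col = min(col, cols - 1)
    return row * cols + col


def cluster_by_cell(nodes, cell_deg=CELL_DEG):
    """Sort (node_id, lat, lon) tuples so nodes in the same cell are adjacent."""
    return sorted(nodes, key=lambda n: (cell_key(n[1], n[2], cell_deg), n[0]))


nodes = [
    (1, 51.5074, -0.1278),  # London
    (2, 48.8566, 2.3522),   # Paris
    (3, 51.5007, -0.1246),  # London, close to node 1
    (4, 52.5200, 13.4050),  # Berlin
]
ordered = cluster_by_cell(nodes)
print([n[0] for n in ordered])  # [2, 1, 3, 4]
```

The same key could also serve as a partitioning column, so that a bounding-box
query only touches the partitions whose cells overlap the box.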
Notwithstanding, I fully concur with previous posts that if the OSM data model
were shared with the rest of the open GIS community, many benefits would be
derived.
Everything else being equal, even if adopting a standard GIS model would cause a
slight degradation in performance, the cost would be worthwhile.
Other free and open GIS projects seem to be designed for problem spaces
different to OSM's: not in terms of data description, but in terms of mass
distribution of GIS data. The design is by no means economical. When I tested
the MySQL implementation, a node consisted of a 64-byte bounding box. When I
queried areas for nodes, performance was low and I/O was high.
This led me to conclude that the R-tree lookup path is not as optimised as the
B-tree one, B-tree being much more mature and having many optimisations which
R-tree doesn't have. (At this point, I can imagine many people who know about
the problem domains R-tree and B-tree indexes are supposed to solve pointing
out to me that an R-tree is more appropriate for geo data. I agree. But that is
not my contention.)
I therefore contend:
1) The data types for PostGIS are uneconomical. I contend that point data types
using 1/4 of the storage can perform adequately, with +/- 5mm global accuracy.
2) R-tree indexes, although theoretically close to ideal for the geo problem
domain, have problems with their implementation. Look-ups on R-tree appear to
be much more fragmented than look-ups on B-tree, resulting in lots of costly
disk seeks, or requiring the index to be cached in RAM.
(Both above issues are soluble).
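
To show where a figure of around +/- 5mm can come from, here is a minimal
Python sketch. It assumes each coordinate is stored as a signed 32-bit integer
spanning the full 360 degrees; the post does not specify an encoding, so this
is only one encoding that stays within the quoted accuracy. The names encode
and decode and the circumference constant are assumptions for illustration.

```python
import struct

SCALE = 2**32 / 360.0       # integer steps per degree (assumed encoding)
EQUATOR_M = 40_075_017.0    # approximate equatorial circumference in metres


def encode(deg):
    """Map degrees in [-180, 180] to a signed 32-bit integer."""
    value = int(round(deg * SCALE))
    # Clamp so that +180 does not overflow the signed 32-bit range.
    return max(-2**31, min(2**31 - 1, value))


def decode(value):
    """Map a signed 32-bit integer back to degrees."""
    return value / SCALE


step_m = EQUATOR_M / 2**32   # ground length of one integer step at the equator
max_error_m = step_m / 2     # rounding is off by at most half a step

lat, lon = 51.5074, -0.1278
packed = struct.pack("<ii", encode(lat), encode(lon))

print(len(packed))                      # 8 (bytes per point)
print(round(max_error_m * 1000, 2))     # 4.67 (millimetres)
```

The worst case is at the equator, where a degree of longitude is longest; away
from the equator the longitude error on the ground shrinks.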
I also contend that
3) B-tree look-ups on lat/lon are theoretically inefficient. Only one of lat or
lon is used for an index range look-up. The other is resolved through a brute
force search. However, the look-up on the first column is extremely efficient.
In practice, the brute force search over the records narrowed by the first
column is actually performed quickly, with few additional disk seeks. I don't
have an explanation for its apparent speed apart from the maturity of B-tree
and widespread efforts to counteract the shortcomings of B-tree with clever
optimisations for 2-column arrangements. Ideally, we need to remove that brute
force search without introducing unacceptable overheads, because it does raise
scalability concerns.
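
The two-stage look-up described in point 3 can be sketched with Python's
built-in sqlite3 module, whose indexes are B-trees. This is not the OSM schema:
the table, the index name and the synthetic 100 x 100 grid of integer
coordinates are assumptions for illustration, and how a given database engine
actually executes the query depends on its planner. The first count shows how
many rows the range on the leading column alone leaves to be checked; the
second query applies both conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, lat INTEGER, lon INTEGER)"
)
# Composite B-tree index: ordered by lat first, then by lon within equal lat.
conn.execute("CREATE INDEX nodes_lat_lon ON nodes (lat, lon)")

# Synthetic data: one node at every point of a 100 x 100 integer grid.
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [(i, i % 100, i // 100) for i in range(10000)],
)

LAT_RANGE = (10, 19)
LON_RANGE = (20, 29)

# Rows selected by the range on the leading column alone.
candidates = conn.execute(
    "SELECT COUNT(*) FROM nodes WHERE lat BETWEEN ? AND ?", LAT_RANGE
).fetchone()[0]

# Rows that also satisfy the condition on the second column.
matches = conn.execute(
    "SELECT id FROM nodes WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?",
    LAT_RANGE + LON_RANGE,
).fetchall()

print(candidates)    # 1000
print(len(matches))  # 100
```

Here nine in ten of the candidate rows are discarded by the second condition,
and that ratio gets worse as the table grows wider in the unindexed direction,
which is the scalability concern raised above.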
In summary, the PostGIS system appears to have a lot going for it, and I feel
there are opportunities being lost with OSM not sharing the same data format
with other free GIS initiatives. At the same time, my tests using MySQL have
shown that many of the theoretical performance benefits of the PostGIS system
are just that - theoretical. I genuinely look forward to being proven wrong on
this. I also think that the theory and practice of PostGIS can be brought
closer together with further development and refinement. If OSM used PostGIS,
that could help development of PostGIS. On the other hand, if OSM used PostGIS,
it may delay or prevent better systems developed through OSM seeing the light
of day.
Richard Fairhurst wrote:
> Quoting Martin Spott <Martin.Spott at mgras.net>:
>
>> You probably should have a closer look at PostGIS, especially at the
>> capabilities regarding geospatial queries, and you're likely to be
>> pleased. PostGIS' strength lies in much more than just serving as data
>> exchange and storage
>
> Most of the OSM data is available (planet.osm), as is all of the
> source. Many OSM developers are busy with other parts of the project
> at the moment, so if you can provide some benchmarks to show that OSM
> really does run faster on a PostGIS setup, and show what changes you
> made to achieve this, we'd be all ears.
>
> cheers
> Richard