[OSM-dev] OSM API Performance (Was: OSM Database Infrastructure)

Christopher Schmidt crschmidt at metacarta.com
Sat Jan 19 23:41:07 GMT 2008


On Tue, Jan 15, 2008 at 12:53:08AM +0000, Tom Hughes wrote:
> I'm not aware of any significant performance issues with the API at
> the moment anyway, but maybe you've seen something I haven't?

This isn't particularly relevant to the rest of this thread, but to
answer the question: yes, absolutely -- and I don't think it can be
solved by a faster database.

I'm working on creating interfaces that display OSM data as you drag the
map around. The interface requests data from OSM "on the fly" -- so
every time the map stops moving, the old features are removed, and 
a new request is sent out. 
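For reference, the per-move request is just a bounding-box call against
the API's /map endpoint. A minimal sketch of the URL construction (the
host and API version are as deployed today; the bbox values are
arbitrary examples, not from a real session):

```python
# Sketch: build the /map bounding-box URL that gets fetched on every
# map move. bbox order is left,bottom,right,top (lon,lat,lon,lat).
# The coordinates below are made-up examples.

def map_url(left, bottom, right, top,
            base="https://api.openstreetmap.org/api/0.6/map"):
    """Return the /map request URL for the given bounding box."""
    return f"{base}?bbox={left},{bottom},{right},{top}"

url = map_url(-0.15, 51.45, 0.0, 51.55)
```

Every pan or zoom ends with one such request, so the per-request lag
described below is paid on every map move.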

For small areas -- on the scale of 4-8 z16 tiles, or up to approximately
0.15 degrees in each direction -- this works to a certain extent.

Beyond that, however, responses tend to be slow enough that the
interface stops feeling interactive: 5-6 seconds is typical, 9-10
seconds isn't unusual, and only occasionally does a response come back
in 2-3 seconds.

Note that all of these times are *before* the download starts -- once it
starts, the transfer takes practically no time at all (as I would
expect, since I think all the work is done before the download starts).

The output files I'm working with are in the range of 300k-400k
(uncompressed). 

I don't know what makes this process slow. By comparison, fetching a
similarly sized area from osm2pgsql via FeatureServer's PostGIS
interface typically incurs only about 1 second of lag. The wild
variation in times -- which doesn't seem to be tied to anything being in
a local cache, since a 3 s request can be followed immediately by a 9 s
request for the exact same area -- makes me wonder whether the slowness
is tied not to the data being requested but to a backlog on the Rails
daemons... in which case, simply throwing more hardware at the problem
might fix it?

In the past, I think we had talked about the fact that the /map request
is resource intensive compared to most of the other Rails calls, and
that some of it is memory- and CPU-intensive and might be a small chunk
worth optimizing. But given that requests sometimes do return reasonably
quickly, that may be premature optimization if all the wait time is
spent waiting for an available Rails daemon.

How many Rails daemons are there for the non-t@h API? Is there a way to
see how long the queue for a Rails daemon is? And, more importantly than
either of those: is this explanation even plausible?

It's naturally true that if there is a limit on the number of clients
talking to the database at once, there is a peak utilization the
database can't get past. But even with a completely capable database
setup (which I agree exists now), other things may be limiting the
performance of the site -- and the development of tools that could exist
if the API were faster (which then leads to a vicious cycle where
performance goes down again... :))

Interested to hear any feedback that anyone can offer on this.

Regards,
-- 
Christopher Schmidt
MetaCarta
