[OSM-dev] Server Timeouts - JOSM and the forthcoming big uploads via the API

Tom Hughes tom at compton.nu
Mon Aug 13 09:43:01 BST 2007


In message <!&!AAAAAAAAAAAuAAAAAAAAAOKaD4mR3JBOrEpRon92nMgBANp/H2q5kHFIvKMsnZiQaZAAAAABxJAAABAAAABPdDGpgpNOS6dsAo952QZhAQAAAAA=@blueyonder.co.uk>
        Andy Robinson <Andy_J_Robinson at blueyonder.co.uk> wrote:

> While uploading TIGER data last week it became apparent that when the API
> is very quiet (i.e. I am just about the only person uploading data), the
> server times out on JOSM uploads as regularly as clockwork at around 250
> nodes. At busier times the upload might be slower, but you may get past
> the 250-node mark because someone else suffers the timeout on that
> particular call rather than you.
>
> From the discussion it appears the original memory-leak process shutdown
> was removed when the rails port went live, so the timeouts are perhaps
> limits on the query queue to stop it from running into problems?
>
> It would be useful to have this issue confirmed and properly documented
> so that solutions can be discussed. It's very painful in JOSM having to
> attend to major uploads, and I'm specifically thinking of any users
> wishing to use JOSM for TIGER county imports. The forthcoming upload of
> all the AND data will also incur a significant delay each time a timeout
> response is awaited and the upload restarted, whether manually or
> automatically.

I can confirm that, as far as I am aware, there is absolutely nothing on
the server that happens every 250 requests.

I believe that currently each of the three daemons that process API
queries will restart after handling 10000 requests.

> A couple of things mentioned at the meet were perhaps running API
> requests through a proxy at the client end, or running squid at the
> server end; however these were only brief comments, since the exact
> reason was not understood by those discussing it.

I have no idea what you think running squid as a reverse proxy at the
server end is going to achieve. Very little of our content is cacheable
and essentially no API requests are cacheable.

> All this leads on to the impact that a lot of AND and TIGER data is
> going to have on the platform. We know from last time around that the
> TIGER data swamped everything very quickly if left to run even at a
> 1-second insertion cycle, and hence it generally ran at a 3-second
> insertion cycle. This time we also have the AND data to consider, plus
> the fact that once the TIGER import starts it's likely to be done by
> lots of JOSM uploaders rather than a single import script.

I was talking to one of the Dutch guys about the AND data at the AGM and
they would like to load the data directly on the server rather than doing
it with API calls.

That seemed sensible to me - my only preference was that it be done
through the rails object model rather than via direct SQL queries.

Maybe what we need is a Ruby program that loads an OSM file and inserts
it into the database via the rails object model?
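
Something like this is roughly what I have in mind - just a sketch, not
tested, and the model and attribute names (Node, latitude, longitude, the
tags assignment) are assumptions that would need checking against the
actual rails port schema. It only handles nodes; ways and relations would
follow the same pattern.

#!/usr/bin/env ruby
# Sketch of a bulk loader: parse an .osm file and create records through
# the Rails models rather than with raw SQL, so the normal validations
# still run. Assumes the script lives in the rails app root.
require File.join(File.dirname(__FILE__), 'config', 'environment')
require 'rexml/document'
require 'rexml/streamlistener'

class OsmLoader
  include REXML::StreamListener

  def tag_start(name, attrs)
    case name
    when 'node'
      # Buffer the node so its tags can be attached before saving.
      # Node/latitude/longitude are assumed model and column names.
      @node = Node.new(:latitude  => attrs['lat'].to_f,
                       :longitude => attrs['lon'].to_f,
                       :visible   => true)
      @tags = {}
    when 'tag'
      @tags[attrs['k']] = attrs['v'] if @node
    end
  end

  def tag_end(name)
    if name == 'node' && @node
      @node.tags = @tags   # assumed tag-assignment interface
      @node.save!          # runs the usual ActiveRecord validations
      @node = nil
    end
  end
end

# Usage: ruby load_osm.rb file.osm
REXML::Document.parse_stream(File.new(ARGV[0]), OsmLoader.new)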

Tom

-- 
Tom Hughes (tom at compton.nu)
http://www.compton.nu/



