[Openstreetmap] Wiki down?

Lars Aronsson lars at aronsson.se
Fri Jan 6 19:15:27 GMT 2006


David Sheldon wrote:
> On Fri, Jan 06, 2006 at 08:27:43AM -0800, Jo Walsh wrote:
> > Have you considered something like "sharding", like the MMORPGs have
> > to do when they get overfull? And synchronise the shards every N
> > hours. I wonder how complex this could wind up. A hard one, how much
> > complexity is necessary complexity? :)
> 
> Just a thought, and this might be what you are referring to as sharding,
> but could we redirect based on the latitude/longitude, and
> serve/store all the American (continent) points/map segments in the US,
> and the European points/map segments in the UK.

It is way too early in the project for anything like this.  
Slicing the application over multiple servers will increase 
complexity tremendously, and we don't want to spend our time 
designing complex replication.  Instead we need to keep complexity 
down, so the project can grow at constant cost, both in the amount 
of data and in the number of users (editors and viewers).

My guess is that OSM currently never sees more than 10 
simultaneous map editors.  In fact I believe that I'm often alone 
with the server when I'm editing maps, and it's still slow. To get 
useful maps, we need a lot of map data, and that will require many 
times more contributors than we have now.  I would say a 
successful project needs a system that can handle a few thousand 
simultaneous editors.  Think Wikipedia scale.

If OSM currently experiences an increase from 10 to 20 users, and 
if that required slicing over 2 servers (10 users per server), 
then the increase from 1000 to 2000 users would require 200 
servers, which is simply unrealistic.  Who would pay for that?  Not me!
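
The arithmetic behind that figure can be sketched as follows.  The 
function name and the fixed users-per-server ratio are illustrative 
assumptions, not measurements of the OSM server.

```python
import math


def servers_needed(users: int, users_per_server: int) -> int:
    """Servers required if capacity only grows by adding machines.

    Assumes (hypothetically) that each server handles a fixed number
    of simultaneous users, as in the 10-to-20-users example.
    """
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    return math.ceil(users / users_per_server)


# At 10 users per server, 20 users need 2 servers and 2000 need 200.
print(servers_needed(20, 10), servers_needed(2000, 10))
```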

If HTTP requests (for map tiles or whatever) arrive at the server 
at a rate of X per second, then the server had better spend less 
than 1/X second on each request, so that it is free to receive and 
process the next request as it arrives.  The response time must be 
pushed down.  This should be our focus.  Trim away the fat.
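
As a rough illustration of the 1/X budget, here is a minimal sketch 
for a server that handles one request at a time.  The function names 
are made up for this example, and the rates are not measured values.

```python
def max_service_time(arrival_rate_per_s: float) -> float:
    """Longest average time per request a one-at-a-time server can afford."""
    if arrival_rate_per_s <= 0:
        raise ValueError("arrival rate must be positive")
    return 1.0 / arrival_rate_per_s


def utilisation(arrival_rate_per_s: float, service_time_s: float) -> float:
    """Fraction of each second the server is busy.

    At 1.0 or above, requests arrive faster than they are finished
    and the backlog keeps growing.
    """
    return arrival_rate_per_s * service_time_s


# Hypothetical load: 10 requests per second leaves 0.1 s per request.
budget = max_service_time(10)
overloaded = utilisation(10, 0.3) >= 1.0
print(budget, overloaded)
```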

I think that the green Landsat images should be served separately 
from the white lines or yellow dots.  The green images can then be 
cached indefinitely or served from another server or turned off. 
White lines and yellow dots should be cached as little as 
possible, so that updates are reflected with minimum delay. White 
lines can be cached (two days or so) for non-logged in users, but 
active (logged in) map editors need to see updates immediately.
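
One way to express this policy is through the HTTP Cache-Control 
header.  This is only a sketch: the layer names "landsat", "lines" 
and "dots" are hypothetical labels, one year stands in for 
"indefinitely", and it assumes the server can tell logged-in 
requests apart (for example by serving them from separate URLs).

```python
TWO_DAYS_S = 2 * 24 * 60 * 60
ONE_YEAR_S = 365 * 24 * 60 * 60  # assumption: stand-in for "indefinitely"


def cache_control(layer: str, logged_in: bool) -> str:
    """Return a Cache-Control header value for one map layer."""
    if layer == "landsat":
        # Background imagery rarely changes, so cache it for a long time.
        return f"public, max-age={ONE_YEAR_S}"
    if layer == "lines" and not logged_in:
        # Anonymous viewers can tolerate line data that is two days old.
        return f"public, max-age={TWO_DAYS_S}"
    # Editors, and every other layer, must revalidate before reuse.
    return "no-cache"
```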

I also guess that things would become a little faster with larger 
tiles, but this could be marginal.

But most of the slowness right now probably comes from some 
trivial inefficiency (string buffer copying? XML libraries?  ODBC 
bandwidth?) that can become 10 or 100 times faster with the right 
analysis.

The trick now is to set a target for the HTTP response time, say 
0.1 seconds, and let the server software issue a log message 
whenever a request takes much longer, say 0.3 seconds.  Zero 
tolerance on slow requests is the way to go.  Not only do slow 
requests annoy the user, but they also block the server (which we, 
for all practical purposes, can assume is single-threaded), so it 
cannot handle other requests.
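
A minimal sketch of such a slow-request log, written in Python 
purely for illustration (it says nothing about what language the 
OSM server uses).  The decorator name and logger name are 
assumptions; the 0.1 and 0.3 second values come from the text above.

```python
import logging
import time
from functools import wraps

TARGET_S = 0.1          # target response time
SLOW_THRESHOLD_S = 0.3  # log any request slower than this

log = logging.getLogger("slow_requests")


def log_if_slow(handler):
    """Wrap a request handler and log a warning when it runs too long."""
    @wraps(handler)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return handler(*args, **kwargs)
        finally:
            # Measured even when the handler raises an exception.
            elapsed = time.perf_counter() - start
            if elapsed > SLOW_THRESHOLD_S:
                log.warning("%s took %.3f s (target %.1f s)",
                            handler.__name__, elapsed, TARGET_S)
    return wrapper
```

Any handler decorated with log_if_slow then leaves a log line each 
time it exceeds the threshold, which gives a list of requests to 
profile first.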


-- 
  Lars Aronsson (lars at aronsson.se)
  Aronsson Datateknik - http://aronsson.se



