[Openstreetmap] Re: Server Performance

Mon Jan 2 12:14:50 GMT 2006

Hmm, this mail was meant to be a short reply, but I've kept coming back 
to it over the last day or so and gone off at a couple of tangents; I've 
tried to break it down into digestible chunks.

SteveC wrote:
> Server load, believe me, is my number one thing to fix right now. It's
> swapping and can't really handle the roughly 50,000 hits a day it gets
> (see the stats wiki page).

Ouch - one request every 1.6 seconds or so, assuming even distribution 
(if only!). How does the ram usage break down? Is there anything in 
particular causing the ram hit? I presume tile generation and/or caching 
- for both slippy map and applet - is a big part of it, particularly 
given the "fetching unnecessary tiles" behaviour described recently.

Pane size, tile size:

Currently, tiles are 256x128px, and the viewing pane is 700x500px (to 
fit on an 800x600 screen?), which works out to 2.73x3.90 tiles; we 
always fetch 4x5 tiles (=1024x640px), though sometimes (around 33% of 
the time) this means we unnecessarily fetch a row or column or both. If 
we changed the pane to 768x512px (3x4 tiles), we would gain around 12% 
screen real-estate, still guarantee coverage with 4x5 tiles, and improve 
utilisation of the tiles fetched from 53% to 60%; the number of cases 
where tiles are fetched unnecessarily becomes vanishingly small (around 
  1% of cases), reducing the subjective "I'm waiting for it to finish 
doing stuff I can't see" factor, and, as a bonus, reducing requests 
generated by further scrolling (intuitively, a user who wants to see 
something that's currently 10px outside the pane is likely to drag by 50 
or 100px to give a "comfort zone", with a high likelihood of generating 
tile requests - but if what he wants is already within the pane with 
little or no margin, he won't scroll at all).

If it's not feasible to change the viewing pane size, should we consider 
changing the tile size to something that fits the viewing pane better, 
even if it's not in powers of two? Or is this likely to cause us massive 
pain elsewhere?

Supertiles:

Would it be feasible for the server to combine requests for contiguous 
tiles into a super-tile operation (perhaps on receiving an "I'm about to 
ask for these 20 tiles" notification from the client), then splitting 
that into the requested tiles and putting these into the cache ready for 
the "proper" request?

This would significantly reduce (by a factor of 20, assuming no caching) 
the number of node/segment select operations to generate a given group 
of tiles, while the total number of rows returned would reduce slightly, 
because any segments that cross a tile boundary will be returned once 
instead of twice - as an added bonus, reducing (but not eliminating) the 
artifacts we currently see where segments are shown in tiles containing 
their end nodes, but go AWOL in intermediate tiles.

False negatives in segment selection:

Presumably, this happens because the current tile generation mechanism 
selects only nodes within the tile (something like SELECT node WHERE 
node.x >= tile.minx AND node.x <= tile.maxx AND node.y >= tile.miny AND 
node.y <= tile.maxy), and then selects all segments referencing one or 
more of those nodes, not allowing for segments that have both ends 
outside the tile but still intersect it. (I'm conveniently ignoring edge 
effects at +/-180°).

Reducing false negatives:

Intuitively, something like SELECT node WHERE node.x >= tile.minx-slop 
AND node.x <= tile.maxx+slop AND node.y >= tile.miny-slop AND node.y <= 
tile.maxy+slop will reduce the number of false negatives, but this is 
sloppy and will return a (potentially large) number of false positives.

Eliminating false negatives:

The "correct" way of identifying segments that cross our tile would 
presumably be to work out the line equation for each segment, and use 
simultaneous equations to fill in the tile edge values to each line 
equation for each segment, then check to see where the endpoints of the 
segment are in relation to the tile. This would, however, be CPU-bound, 
and gives no possibility of an indexable query.

Eliminating false negatives efficiently:

Taking this approach a little further, a little geometry lets us 
conclude that if both ends of a segment fall beyond the same edge of a 
tile, the tile and segment won't intersect; we could use this, but it 
means our select making twice as many comparisons as it currently does, 
something like (forgive my pseudo-SQL): SELECT segments WHERE NOT ( 
(segment.node1.x < tile.minx AND segment.node2.x < tile.minx) OR 
(segment.node1.x > tile.maxx AND segment.node2.x > tile.maxx) OR 
(segment.node1.y < tile.miny AND segment.node2.x < tile.miny) OR 
(segment.node1.y > tile.maxy AND segment.node2.x > tile.maxy) ).

However, unlike the previous model, this is indexable (indeed, we're 
really only interested in comparing the greater of node[1,2].x with 
minx, so we could store that in advance, halving query complexity at the 
cost of a much messier table, something like SELECT segments WHERE NOT ( 
segment.rightnode.x < tile.minx OR segment.leftnode.x > tile.maxx OR 
segment.topnode.y < tile.miny OR segment.botnode.y > tile.maxy).

This query will result in a few (I think, surprisingly few) false 
positives, but is far more efficient overall than checking every segment 
for intersection with the tile. False positives are less of a concern 
than false negatives in this case.

Reducing server load:

I can see that doubling the complexity of a select on an already 
overloaded server could be an issue; however, if we were to combine this 
with the super-tiling approach, we'd still be roughly an order of 
magnitude better off than we are now, though our code complexity would 
have increased a bit.

Other ramblings:

How does RAM and CPU usage on the server break down between map viewing, 
editing, wiki, trac, mail, other?

How much space does bat have for more RAM, and would this be a good 
target for a funding drive?

I have a dual Xeon with 1G of ram and around 80G of disk free, running 
Ubuntu Hoary, that's currently doing very little other than generating 
heat, but it's at the wrong end of a very asymmetric (2M/128kbit) DSL 
link. Is theres any processing, compilation, development work, whatever, 
that could usefully be offloaded to that?

Apologies again for the length of this mail!

Cheers,

Christian / cjb

http://www.cjb.ie/