[Openstreetmap] Re: Server Performance
Christian van den Bosch
cjb at cjb.ie
Mon Jan 2 12:14:50 GMT 2006
Hmm, this mail was meant to be a short reply, but I've kept coming back
to it over the last day or so and gone off at a couple of tangents; I've
tried to break it down into digestible chunks.
SteveC wrote:
> Server load, believe me, is my number one thing to fix right now. It's
> swapping and can't really handle the roughly 50,000 hits a day it gets
> (see the stats wiki page).
Ouch - one request every 1.6 seconds or so, assuming even distribution
(if only!). How does the ram usage break down? Is there anything in
particular causing the ram hit? I presume tile generation and/or caching
- for both slippy map and applet - is a big part of it, particularly
given the "fetching unnecessary tiles" behaviour described recently.
Pane size, tile size:
Currently, tiles are 256x128px, and the viewing pane is 700x500px (to
fit on an 800x600 screen?), which works out to 2.73x3.90 tiles; we
always fetch 4x5 tiles (=1024x640px), though sometimes (around 33% of
the time) this means we unnecessarily fetch a row or column or both. If
we changed the pane to 768x512px (3x4 tiles), we would gain around 12%
screen real-estate, still guarantee coverage with 4x5 tiles, and improve
utilisation of the tiles fetched from 53% to 60%; the number of cases
where tiles are fetched unnecessarily becomes vanishingly small (around
1% of cases), reducing the subjective "I'm waiting for it to finish
doing stuff I can't see" factor, and, as a bonus, reducing requests
generated by further scrolling (intuitively, a user who wants to see
something that's currently 10px outside the pane is likely to drag by 50
or 100px to give a "comfort zone", with a high likelihood of generating
tile requests - but if what he wants is already within the pane with
little or no margin, he won't scroll at all).
If it's not feasible to change the viewing pane size, should we consider
changing the tile size to something that fits the viewing pane better,
even if it's not in powers of two? Or is this likely to cause us massive
pain elsewhere?
Supertiles:
Would it be feasible for the server to combine requests for contiguous
tiles into a super-tile operation (perhaps on receiving an "I'm about to
ask for these 20 tiles" notification from the client), then splitting
that into the requested tiles and putting these into the cache ready for
the "proper" request?
This would significantly reduce (by a factor of 20, assuming no caching)
the number of node/segment select operations to generate a given group
of tiles, while the total number of rows returned would reduce slightly,
because any segments that cross a tile boundary will be returned once
instead of twice - as an added bonus, reducing (but not eliminating) the
artifacts we currently see where segments are shown in tiles containing
their end nodes, but go AWOL in intermediate tiles.
False negatives in segment selection:
Presumably, this happens because the current tile generation mechanism
selects only nodes within the tile (something like SELECT node WHERE
node.x >= tile.minx AND node.x <= tile.maxx AND node.y >= tile.miny AND
node.y <= tile.maxy), and then selects all segments referencing one or
more of those nodes, not allowing for segments that have both ends
outside the tile but still intersect it. (I'm conveniently ignoring edge
effects at +/-180°).
Reducing false negatives:
Intuitively, something like SELECT node WHERE node.x >= tile.minx-slop
AND node.x <= tile.maxx+slop AND node.y >= tile.miny-slop AND node.y <=
tile.maxy+slop will reduce the number of false negatives, but this is
sloppy and will return a (potentially large) number of false positives.
Eliminating false negatives:
The "correct" way of identifying segments that cross our tile would
presumably be to work out the line equation for each segment, and use
simultaneous equations to fill in the tile edge values to each line
equation for each segment, then check to see where the endpoints of the
segment are in relation to the tile. This would, however, be CPU-bound,
and gives no possibility of an indexable query.
Eliminating false negatives efficiently:
Taking this approach a little further, a little geometry lets us
conclude that if both ends of a segment fall beyond the same edge of a
tile, the tile and segment won't intersect; we could use this, but it
means our select making twice as many comparisons as it currently does,
something like (forgive my pseudo-SQL): SELECT segments WHERE NOT (
(segment.node1.x < tile.minx AND segment.node2.x < tile.minx) OR
(segment.node1.x > tile.maxx AND segment.node2.x > tile.maxx) OR
(segment.node1.y < tile.miny AND segment.node2.x < tile.miny) OR
(segment.node1.y > tile.maxy AND segment.node2.x > tile.maxy) ).
However, unlike the previous model, this is indexable (indeed, we're
really only interested in comparing the greater of node[1,2].x with
minx, so we could store that in advance, halving query complexity at the
cost of a much messier table, something like SELECT segments WHERE NOT (
segment.rightnode.x < tile.minx OR segment.leftnode.x > tile.maxx OR
segment.topnode.y < tile.miny OR segment.botnode.y > tile.maxy).
This query will result in a few (I think, surprisingly few) false
positives, but is far more efficient overall than checking every segment
for intersection with the tile. False positives are less of a concern
than false negatives in this case.
Reducing server load:
I can see that doubling the complexity of a select on an already
overloaded server could be an issue; however, if we were to combine this
with the super-tiling approach, we'd still be roughly an order of
magnitude better off than we are now, though our code complexity would
have increased a bit.
Other ramblings:
How does RAM and CPU usage on the server break down between map viewing,
editing, wiki, trac, mail, other?
How much space does bat have for more RAM, and would this be a good
target for a funding drive?
I have a dual Xeon with 1G of ram and around 80G of disk free, running
Ubuntu Hoary, that's currently doing very little other than generating
heat, but it's at the wrong end of a very asymmetric (2M/128kbit) DSL
link. Is theres any processing, compilation, development work, whatever,
that could usefully be offloaded to that?
Apologies again for the length of this mail!
Cheers,
Christian / cjb
http://www.cjb.ie/
More information about the talk
mailing list