[OSM-dev] scaling
Steve Singer
ssinger_pg at sympatico.ca
Mon Jan 10 04:07:53 GMT 2011
On Sun, 9 Jan 2011, Frederik Ramm wrote:
> If OSMF board wants to realistically construct a scenario that has the number
> of new users, and the amount of edits and traffic, multiplied by 100 in the
> frame of a few months, then I'm sure that TWG can provide ideas how to get
> there. But you would *have* to find someone who creates that scenario for the
> social side and looks into the issues I've brought up above or else we'll be
> fucked on a level that's much harder to fix than a slow database.
I think Frederik is right in that someone needs to define some scenarios
of growth that the technical solutions can be evaluated against. Different
aspects of OSM might grow at different rates and different scaling solutions
will work under different scenarios.
Some of the questions that these scenarios should try to address:
* How might the size of the database grow (# of nodes, ways, relations) over
time? What assumptions about the geographic distribution of the data can we
make? Are there any assumptions about the types/tags of nodes/ways/relations
that can be made?
* How do we expect consumption of OSM data to grow? OSM provides different
access methods (tiles, XAPI, API, minute diffs, planet dumps, name/tag
searches ...). Scaling growth in read load is an easier problem than scaling
write load, but we shouldn't ignore it.
* What do edits tend to look like? What percentage of edits would only
involve changing tags versus adding/changing the location of nodes? Do we
see most edits (by edits I mean a set of changes that get uploaded in a
single request and applied atomically) confined to small geographic areas?
I'm sure there are more questions but before we can start solving the
problem we need to define it.
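As a rough illustration of how the "small geographic areas" question could
be measured, here is a minimal sketch. It assumes each edit is available as
a list of (lat, lon) pairs for the nodes it touches; the function names, the
0.1 degree threshold and the sample data are all made up for the example and
are not taken from any OSM tool or from real data.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (lat, lon) in degrees


def bbox_extent_deg(points: Iterable[Point]) -> Tuple[float, float]:
    """Return (lat_span, lon_span) of the bounding box around the points.

    Simplification: ignores wrap-around at the +/-180 degree meridian.
    """
    pts = list(points)
    lats = [lat for lat, _ in pts]
    lons = [lon for _, lon in pts]
    return (max(lats) - min(lats), max(lons) - min(lons))


def fraction_local(edits: List[List[Point]], max_extent_deg: float = 0.1) -> float:
    """Fraction of edits whose bounding box fits within max_extent_deg
    in both directions. The threshold is an arbitrary assumption."""
    if not edits:
        return 0.0
    local = 0
    for points in edits:
        lat_span, lon_span = bbox_extent_deg(points)
        if lat_span <= max_extent_deg and lon_span <= max_extent_deg:
            local += 1
    return local / len(edits)


# Hypothetical sample: one edit within a town, one spanning a continent.
sample_edits = [
    [(49.00, 8.39), (49.01, 8.40)],
    [(40.0, -100.0), (45.0, -80.0)],
]
print(fraction_local(sample_edits))
```

Running something like this over a sample of real uploads would give one
number to plug into a scenario, but the right threshold is itself something
the scenario would have to define.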
> (Indeed one way of scaling would be to "devolve" the central database - have
> several of them for different regions. That would make it more difficult for
> people to write bots that wreak havoc with the world-wide dataset but I
> wouldn't consider that a loss ;)
Bots could easily be written to edit/wreak havoc on all the databases.
That's a different technical/social problem than scaling.
Do we know how much server load is the result of bots/scripts versus
interactive editing applications? I ask this because the
performance/latency requirements of an interactive application are different
from those of a bot, and because there might be technical solutions for
improving the efficiency of handling changes from a bot that won't work for
interactive applications.
> First of all, stop talking of the kind of explosive growth you're talking
> about. Set yourselves a realistic growth scenario - I'd say something like
> "number of contributors & data size double every year" or so. Sit down with
> TWG and find out how long the traditional approach will hold out, and how and
> when to scale, then make plans for that. I'm sure TWG will be mature enough
> to say what they can do and what they can't, and where they will need input &
> concepts from the project at large. Find suitable ways to solicit such.
Again Frederik is right here. It is important to find a solution for the
problems we realistically will have in the next few years, and the first step
is to define that growth.
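To put the two growth rates from this thread side by side, here is a small
sketch of the arithmetic. It only assumes plain compound growth; the function
names are made up for the example and no real OSM figures are used.

```python
import math


def projected_size(initial: float, yearly_factor: float, years: float) -> float:
    """Size after `years` of compound growth at `yearly_factor` per year."""
    return initial * yearly_factor ** years


def years_to_reach(multiple: float, yearly_factor: float) -> float:
    """Years of compound growth needed to grow by `multiple`."""
    return math.log(multiple) / math.log(yearly_factor)


# "Doubles every year": relative size after 1..5 years, starting from 1.0.
for year in range(1, 6):
    print(year, projected_size(1.0, 2.0, year))

# How long yearly doubling takes to reach the 100x scenario.
print(round(years_to_reach(100.0, 2.0), 2))
```

Under yearly doubling it takes between six and seven years to reach the 100x
that the other scenario assumes within a few months, which is why the two
scenarios call for very different planning.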
>
> Bye
> Frederik
>
> --
> Frederik Ramm ## eMail frederik at remote.org ## N49°00'09" E008°23'33"
Steve Singer