[OSM-dev] Alternative tile webserver needed?

Mon Apr 30 06:04:18 BST 2007

On 30/04/07, Lars Aronsson <lars at aronsson.se> wrote:
>
> Stefan de Konink wrote:
>
> > I'm sorry but I don't see the relation between complexity and
> > least cost routing. It is just a sign that people are 1) just
> > lucky they don't pay any bills 2) didn't think about scalability.
> >
> > Isn't it an interesting idea to have different servers for
> > different continents on the *same* location? I still *strongly*
> > believe that splitting the 'higher zoomlevels' to local servers
> > because they are more used locally. It saves a BUNCH of
> > diskspace per server and probably makes the operation less
> > complex per server.
>
> To clarify, I only speak my own opinion here.  I'm just a mapper.
>
> You can design a system split between two computers, still having
> them in the same room. Today OSM has a classic two-tier solution,
> using a "vertical" distribution of labour between one computer as
> web front-end and the other as database back-end.  But you can
> also do the split horizontally, serving the western hemisphere
> from one computer and the eastern hemisphere from the other
> computer.  Or 64 computers, each serving a slice of the world. The
> horizontal split makes the system more complex.  This can
> certainly be managed, but it's often easier and more economic to
> just buy a stronger, single server.

I'm from the multiple servers > single server camp... To me, it's better to
have multiple cheaper/smaller servers in one location than a single big
server. If your key issue is performance, splitting the load over more than
one server gives you more growth potential, by adding more boxes, than
restricting yourself to a single box, which at a certain point becomes very
expensive to upgrade and still leaves you with a very real limit on how far
you can go.

If the split just involves mirroring the data/service, you also get the
potential for a more resilient service. If instead each server takes only
part of the role of a single server, e.g. part of the tileset, you still get
the performance benefits and some data stays available if a box fails, but
not all of it; in return, the per-machine requirements (storage etc.) and
the overhead of backend processes transferring data around are lower than
with mirroring. If going beyond 2 servers, you can combine both: e.g. with
the tileset split into 3 chunks across 3 servers, one could hold chunks 1
and 2, one could hold 2 and 3, and one could hold 3 and 1. That way you have
the potential for distributed load, you don't need each machine to deal with
the entire dataset, and you can survive a single server failing.
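
As a rough sketch of the overlapping layout described above (3 chunks on 3
servers, each chunk held twice), something like the following Python would
do. The function names and the 1-based numbering are made up for
illustration; this is not a description of anything in the current OSM setup.

```python
def ring_placement(num_servers, copies=2):
    """Hypothetical helper: server s holds chunk s plus the next
    (copies - 1) chunks, wrapping around at the end."""
    placement = {}
    for server in range(1, num_servers + 1):
        placement[server] = [
            (server - 1 + k) % num_servers + 1 for k in range(copies)
        ]
    return placement


def chunks_still_served(placement, failed_server):
    """Return the set of chunks held by at least one surviving server."""
    served = set()
    for server, chunks in placement.items():
        if server != failed_server:
            served.update(chunks)
    return served


placement = ring_placement(3)
for server, chunks in placement.items():
    print(f"server {server} holds chunks {chunks}")
```

With 3 servers this gives server 1 chunks 1 and 2, server 2 chunks 2 and 3,
and server 3 chunks 3 and 1, so losing any single server still leaves every
chunk on one of the remaining boxes.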

You are right, multiple servers is more complex than a single server, but
there are a lot of potential benefits that can justify that extra
complexity. And for the work I do, a single server is just not sufficient as
it leaves us with a single point of failure, and from that point of view,
geographic separation of servers removes other single points of failure from
a solution too, i.e. power, connectivity, environment, building and all the
components of those...

I'll admit that some of the arguments may not apply so easily to this
project, things like the fact that buying a small stack of cheap servers
costs the same as or less than buying one big server doesn't make a
difference when all you have is one cheap server and a very tight budget...

> But then, you can compare any of these scenarios with distributing
> the servers to different physical locations.  That was what I
> thought you suggested and I protested against.  This ultimately
> requires the coordination of people taking care of the different
> installations, including synchronized holidays.  And when anything
> fails, they would start to blame each other instead of focusing on
> finding the problem in their own server. What would the benefit
> be?  Shorter ping times?  That's not our problem.  Lower Internet
> bills?  I don't think that's our problem either.  Is it?  The
> discussion was started because the tile server had a full disk.
> That's a problem that can be solved for 100 euro by just buying a
> new disk.  We're probably just waiting for the stores to open.
>
> Any of these exercises can be fun and interesting in their own
> right, but *that* is not the purpose of the OSM project.

As I mentioned above, geographic distribution can be good for resilience,
but, most of the time I've dealt with splitting/mirroring a service across
locations geographically it's been for performance reasons, and that's
typically to bring the service closer to the users, which reduces latency,
reduces the distance the traffic has to travel (so less congestion) and lets
the service be reached over fewer low-bandwidth links, improving
responsiveness etc... Yep, it's easier to manage if you own all the
infrastructure and have a single team/group that manages it all... For OSM,
there has already been a discussion about having more people to help with
the system admin, there was a comment that having people in different
timezones would be beneficial, and such a geographically dispersed team that
work together on the OSM infrastructure as a whole, rather than just their
own little bits, assuming that different members of this team also provided
hardware, could work I think...

You say shorter ping times isn't our problem and that you don't think
internet bills is our problem... For a service with a very good hosting set
up in the UK like we have with OSM, across Europe you'll see bloody good
performance and a very low latency, it's true there is stacks of
connectivity and low costs for providers to provide that... but, looking
globally that's not the case... Even from the USA where there's also stacks
of connectivity managed well, you'll be lucky to see ping latency below
100ms, to simplify this massively, assuming the servers are very responsive
and the data transferred is small, for something that does lots of requests,
e.g. JOSM uploads or to a lesser degree use of slippy maps, that'll still be
up to 10 times slower than if you only had a ping latency of 10ms to the
servers... but, on the plus side from the USA, the connectivity is so good
that packet loss will typically be negligible and there will probably be
plenty of bandwidth so that won't restrict things further... But, look
further afield, from India, you'll do well to get below 200ms latency, so
that's 20 times slower, from Australia, you'll be lucky to get below 300ms
latency, so that becomes 30 times slower, connectivity from Australia is
much better these days than it once was... from China, on a good day you'll
be looking at 300 to 600ms latency, probably very variable, with a good deal
of packet loss on routes to US/Europe... without the impact of the packet
loss, that's 30 to 60 times slower...
Lower internet bills isn't a problem directly for us I'd guess, but, for
people/businesses giving hosting, international bandwidth probably costs them
extra and we don't want to put additional cost on these guys... For
international providers that due to cost can't achieve the bandwidth
excesses and quality of Europe/US connectivity, we don't deal with them
directly, but, people using them will find the service a lot less usable
because of the worse connectivity...
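
To put numbers on the simplified latency argument above: if a client makes
its requests one after another and each costs one round trip, the total time
scales directly with the ping time. A minimal sketch in Python, using the
rough latencies quoted above; the figure of 500 requests is an arbitrary
example, not a measurement, and server time and packet loss are ignored as in
the text.

```python
BASELINE_RTT_MS = 10  # the 10ms comparison point used above

# Rough round-trip figures quoted above, in milliseconds.
RTT_MS = {
    "USA": 100,
    "India": 200,
    "Australia": 300,
    "China (low end)": 300,
    "China (high end)": 600,
}


def slowdown(rtt_ms, baseline_ms=BASELINE_RTT_MS):
    """How many times slower a latency-bound exchange is than the baseline."""
    return rtt_ms / baseline_ms


def total_seconds(num_requests, rtt_ms):
    """Time for num_requests sequential requests, one round trip each."""
    return num_requests * rtt_ms / 1000.0


NUM_REQUESTS = 500  # arbitrary example size, an assumption for illustration

for place, rtt in RTT_MS.items():
    print(f"{place}: {slowdown(rtt):.0f}x slower, "
          f"{total_seconds(NUM_REQUESTS, rtt):.0f}s "
          f"for {NUM_REQUESTS} sequential requests")
```

Under those assumptions, 500 sequential requests take 5 seconds at 10ms but
100 seconds at 200ms, which is where the "20 times slower" comes from.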

Anyway... That was a bit long... our little project here has a lot to do
with very finite resources, right now, more disk space as you said will fix
the immediate issue and probably allow some performance improvements by
spreading the load over two disks in one box... But, it's always good to
think ahead, to see what problems could come in the future and to try and
spend a little time understanding that and considering that when doing
things now, yep going out and spending a bucket load of cash, that we don't
have, on lots of servers in lots of datacenters around the world and paying
for people to manage that as their job isn't something that's likely to be
needed any time soon... one day, it could quite possibly come to that... /me
puts away crystal ball...

d