[OSM-talk] OpenStreetMap Pledge

Lars Aronsson lars at aronsson.se
Thu Jun 1 02:58:53 BST 2006


Tom Carden wrote:

> The reason openstreetmap.org is slower is because it takes all the
> changes into account and shows the current data (which Nick's code
> doesn't do, and which your Mapserver implementation couldn't and
> wouldn't need to do).  I believe it will be possible in the future to
> optimise the database so that queries for "most recent" (which will
> grow faster than writes) become a lot faster, but I've not studied the
> database set-up in detail yet.

There may be something to learn from Wikipedia here.  Their 
internal representation of editing history has changed over time.  
Initially there was no need to care or worry, because there was so 
little data.  That has changed, and there are now 1.1 million 
articles (in the English version) and many of them have several 
thousand previous versions, resulting from daily updates and edit 
wars.  The latest ideas are implemented in the database schema for 
version 1.5 of the MediaWiki software.
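
To make the idea concrete, here is a minimal sketch of keeping full 
history while still answering "most recent" queries cheaply.  It 
assumes a much simplified schema, loosely modelled on the 
page/revision split in MediaWiki 1.5 (which, as far as I know, also 
keeps the text itself in a separate table; that is left out here).  
The function names and the use of SQLite are my own assumptions for 
illustration, not anything taken from MediaWiki or OSM code.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a simplified page/revision schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE revision (
            rev_id   INTEGER PRIMARY KEY AUTOINCREMENT,
            rev_page INTEGER NOT NULL,   -- which page this version belongs to
            rev_text TEXT NOT NULL       -- simplified: text stored inline
        );
        CREATE TABLE page (
            page_id     INTEGER PRIMARY KEY,
            page_title  TEXT NOT NULL UNIQUE,
            page_latest INTEGER          -- rev_id of the current version
        );
    """)
    return conn


def save_edit(conn, title, text):
    """Append a new revision and move the page's 'latest' pointer to it."""
    row = conn.execute(
        "SELECT page_id FROM page WHERE page_title = ?", (title,)
    ).fetchone()
    if row is None:
        page_id = conn.execute(
            "INSERT INTO page (page_title) VALUES (?)", (title,)
        ).lastrowid
    else:
        page_id = row[0]
    rev_id = conn.execute(
        "INSERT INTO revision (rev_page, rev_text) VALUES (?, ?)",
        (page_id, text),
    ).lastrowid
    conn.execute(
        "UPDATE page SET page_latest = ? WHERE page_id = ?", (rev_id, page_id)
    )
    return rev_id


def current_text(conn, title):
    """Fetch the current version by following the pointer, not scanning history."""
    row = conn.execute(
        "SELECT r.rev_text FROM page p "
        "JOIN revision r ON r.rev_id = p.page_latest "
        "WHERE p.page_title = ?",
        (title,),
    ).fetchone()
    return row[0] if row else None


def history_length(conn, title):
    """Count how many versions are stored for a page."""
    row = conn.execute(
        "SELECT COUNT(*) FROM revision r "
        "JOIN page p ON p.page_id = r.rev_page "
        "WHERE p.page_title = ?",
        (title,),
    ).fetchone()
    return row[0]
```

The point of the design is that a read of the current state costs 
two primary-key lookups regardless of how many old versions have 
piled up, while the old versions stay available for anyone who 
wants the history.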

The most recent dump of the English Wikipedia is from May 18,
http://download.wikimedia.org/enwiki/20060518/

All pages, current version only, take 2.0 GB compressed.

All pages, all versions included, take 33 GB compressed.

Current versions of encyclopedic articles only, not including user 
pages or discussion pages, take only 1.3 GB compressed.

Compared to this, OSM is still very, very small.  But it's also 
very, very young and still tripping over its own feet as it tries 
to walk.

> Is this like, "I have discovered a way to render OpenStreetMap 
> data quickly and correctly using Mapserver, but the margin is 
> too small to write it here"?

Tom, we have too much of this negative attitude already.  Here's 
someone who knows MapServer better than the rest of us together. 
Let's encourage him.  He wants to help.  It may be true that 
traditional GIS setups (PostGIS + MapServer) don't deal with 
editing history in the wiki fashion needed in OSM, but there may 
be solutions to that.


-- 
  Lars Aronsson (lars at aronsson.se)
  Aronsson Datateknik - http://aronsson.se



