[OSM-talk] On how to store geospatial data !?;

Martin Spott Martin.Spott at mgras.net
Sun Jan 14 18:57:28 GMT 2007


Nick Hill wrote:

> It appears you have had experience at storing and retrieving large amounts of 
> geodata in databases. If that is the case, I am interested in your opinion and 
> experience in storing and retrieving data. The types of storage formats and
> engines you have used.

The current size of the 'raw' database of the mentioned repository is
about 10 GByte. I expect the size to quadruple once we import the whole
TIGER road network.

Actually I didn't try to invent my own storage formats at all.
I simply decided to use what is considered 'standard' in the
OpenSource GIS world and to leave tuning the software to the people
who are familiar with it, who have long experience and therefore
simply know better  :-)
The contents of the current repository are still almost a one-man show
and I'm simply unable to do everything myself ....

> Do you typically store polygons as an ordered list of nodes or as a
> geo data type?

Without exception, everything that is meant to represent a geometry is
actually stored using the geometry data type. Otherwise I wouldn't be
able to run these nice geospatial queries and I would risk losing
compatibility with the common GIS world.
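
To illustrate the difference, here is a minimal Python sketch (not the
setup used for the repository described here). It takes a polygon given
as WKT text, one of the common text forms of a geometry value, and
answers a point-in-polygon question about it. The helper names
parse_wkt_polygon and contains are made up for this example, and the
parser only handles a 2D polygon with a single outer ring; a spatial
database offers this kind of query natively.

```python
import re


def parse_wkt_polygon(wkt):
    """Parse 'POLYGON((x y, x y, ...))' with one outer ring and no holes.

    Hypothetical helper for illustration only; real WKT readers also
    handle holes, Z/M values and other geometry types.
    """
    match = re.fullmatch(r"\s*POLYGON\s*\(\(\s*(.*?)\s*\)\)\s*", wkt,
                         re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("expected POLYGON((x y, ...))")
    ring = []
    for pair in match.group(1).split(","):
        x, y = (float(value) for value in pair.split())
        ring.append((x, y))
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("ring must be closed and have at least 4 points")
    return ring


def contains(ring, x, y):
    """Ray-casting test: is the point (x, y) inside the closed ring?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through y matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# A 1x1 degree square, written as a geometry value rather than as a
# list of separately stored nodes.
square = parse_wkt_polygon("POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))")
point_inside = contains(square, 0.5, 0.5)
point_outside = contains(square, 1.5, 0.5)
```

With an ordered list of nodes, the application has to reassemble the
ring itself before it can ask such a question; with a geometry value
the shape travels as one unit that other GIS tools understand as well.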

> Have you had experience of the type of database load like osm, where many 
> asynchronous queries for semi-random areas need to be served?

No, I don't have numbers to compare with the OSM database. If you
specify the load on the OSM database in a portable manner then I'd try
to retrieve comparable numbers.

I did run some tests when I started setting up the MapServer, back
then on a machine equipped with a 440 MHz CPU and 256 MByte RAM. I came
to the conclusion that approx. 95 to 98 % of the time that it takes to
display a MapServer page is required for rendering the bitmap on the
server and for displaying the result in the browser.
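
As a rough back-of-the-envelope check of what these percentages imply,
the following Python sketch computes how much faster a page could get
if data retrieval took no time at all. It assumes, as an upper bound,
that the whole remaining 2 to 5 % is spent on data retrieval; the
function name max_speedup is made up for this example.

```python
def max_speedup(retrieval_share):
    """Upper bound on the overall speedup if retrieval time dropped to zero.

    retrieval_share is the fraction of total page time spent on data
    retrieval (assumed here to be the full 2 to 5 % remainder).
    """
    if not 0.0 <= retrieval_share < 1.0:
        raise ValueError("retrieval_share must be in [0, 1)")
    return 1.0 / (1.0 - retrieval_share)


for share in (0.02, 0.05):
    print(f"retrieval share {share:.0%}: at most {max_speedup(share):.3f}x faster")
```

Under that assumption even a perfect storage backend would make a page
only about 2 to 5 percent faster, which is why the rendering side
dominates.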

A typical MapServer image from 'my' repository covers 1x1 degree, and
even on this terribly slow machine (that runs other services as well)
the delay for raw data retrieval under concurrent queries was
negligible - so I decided not to investigate any further  :-)

Cheers,
	Martin.
-- 
 Unix _IS_ user friendly - it's just selective about who its friends are !
--------------------------------------------------------------------------



