[OSM-dev] An alternative rendering-database setup and update strategy!

Stefan Keller sfkeller at gmail.com
Tue Dec 17 20:33:30 UTC 2013


Hi Sven

Can you explain liitle bit more the setup of the servers you have in mind?
You seem to have a potent machine which has a PostGIS database and uses
osm2pgsql to populate and update it periodically.
Then you want to attach a semi-potent server to that "master"?

Yours, Stefan


2013/11/25 Sven Geggus <lists at fuchsschwanzdomain.de>

> Hello,
>
> on my day-job I recently had to solve the Problem of setting up a Postgis
> database contaning a full-planet extract using one of those cheap 180GB SSD
> on a semi-potent machine (only 8GB of RAM).
>
> I first tried to use osm2pgsql for this purpose which is almost impossible
> for a couple of reasons:
>
> * I would have needed more than 180GB of disk space (at least during
> import)
> * The import would take a _very_ long time (several days rather than hours)
>
> So I had to look for a backup strategy!
>
> I found one which proved to be that good, that I would like to discuss it
> here as an alternative way for rendering-database setup.
>
> The first thing I discovered is the fact, that we currently use roughly
> twice
> the disk-space for intermediate tables (or file in case of flatnode) than
> for the processed data itself.
>
> Here is how this looks like on tile.openstreetmap.de:
>
>          relation          | total_size
> ---------------------------+------------
>  public.planet_osm_line    | 61 GB
>  public.planet_osm_polygon | 59 GB
>  public.planet_osm_point   | 10 GB
>  public.planet_osm_roads   |  8 GB
>
>  public.planet_osm_ways    | 174 GB
>  public.planet_osm_rels    |   4 GB
>
> + flatnode.dat 20 GB
>
> processed data: 61GB+59GB+10GB+8GB=138GB
> intermediate data: 20GB+174GB+4GB=202GB
>
> The most annoying part is the 174GB planet_osm_ways table.
>
> So what I did to solve my problem was using pg_dump for the processed data
> tables only and setting up my target database using pg_restore.
>
> Advantages:
> * Very fast data import even on machines where import using osm2pgsql would
>   be practically impossible
> * Decent size of database dump (32GB in case of a --hstore-match-only
> database
>   which is about the same size as a planet dump)
>
> Disadvantages:
> * Currently no update strategy available
> * Will need a "master" osm2pgsql database
> * Will "inherit" table scheme from "master" database
>
> Conclusion:
>
> IMO the disadvantages can be resolved or at least mitigated in the
> following
> way:
>
> * we generate a downloadable dump of the database on
>   tile.openstreetmap.de (e.g. weekly)
> * osm2pgsql needs to be patched to output the changes to the processed data
>   tables as SQL commands which can then be used to replicated the slave
> databases
> * We already use --hstore-match-only database format. So flexibility in
>   table-layout is not that much of a concern as views to hstore column can
> be used
>   for rendering instead of tables.
>
> I would like to hear your comments to this proposial.
>
> Regards
>
> Sven
>
> --
> Threading is a performance hack.
> (The Art of Unix Programming by Eric S. Raymond)
>
> /me is giggls at ircnet, http://sven.gegg.us/ on the Web
>
>
> --
> "Thinking of using NT for your critical apps?
>                                   Isn't there enough suffering in the
> world?"
>                    (Advertisement of Sun Microsystems in Wall Street
> Journal)
> /me is giggls at ircnet, http://sven.gegg.us/ on the Web
>
> _______________________________________________
> dev mailing list
> dev at openstreetmap.org
> https://lists.openstreetmap.org/listinfo/dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/dev/attachments/20131217/465ad0d8/attachment.html>


More information about the dev mailing list