[OSM-dev] Reducing osm2pgsql memory usage using a database method
Artem Pavlenko
artem at mapnik.org
Sat Mar 10 14:56:01 GMT 2007
Jon,
> I've just uploaded an experimental version of osm2pgsql which uses
> Postgresql database tables for the transient node and segment storage.
> This drops the memory usage from >1GB to ~60MB. On the downside, the
> import time has gone up from 20 to 100 minutes. I'm sure this can be
> improved though with some more database Mojo.
>
> For further details see SVN (utils/osm2pgsql/experimental/readme.txt)
> or
> http://trac.openstreetmap.org/browser/utils/osm2pgsql/experimental/
> readme.txt
>
Good stuff.
I'm working on new osm.xml and I have some ideas on how to improve
osm2psql output:
1. We can re-write osm2pgsql in c++ and take advantage of dynamic
structures e.g std::map, safe formatting and casting
boost::lexiacal_cast, boost::format and more.
2. At the moment there are a lot of redundant data in output tables.
Everything apart from geometries are dumped as 'TEXT' .
We can have a more flexible design where table structure, attribute
values are configurable (at compile time). Consider this for example:
To render highway features in correct order I want to have z_order
field in planet_osm_table calculated as follow:
int z_order ( osm_feature const& feat)
{
int layer = 0; //default
try
{
layer = boost::lexical_cast<int>(feat['layer']);
}
catch (boost::bad_lexical_cast & )
{
// layer tag has got lots of junk!!!
}
int highway_z = 0; // 0..9
std::string highway = feat['highway']
if ( highway == 'motorway' || highway == 'motorway_link')
{
highway_z = 9;
}
else if (...) {}
....
bool bridge = false;
try {
bridge = boost::lexical_cast<bool>(feat['bridge']);
catch (...) {}
return 10 * ( layer + bridge?1:0) + highway_z ;
}
Also I want to have consistent numeric feature_type calculated
differently depending on tags/values. This will make rendering more
efficient and will bring some (needed) sanity to styles in osm.xml.
3. Also we can abstract 'output writing' to have multiple back-ends :
mysql, sqlite , shapefiles etc .
What do you think?
Artem.
PS. I'm running mapnik , postgresql , osm2psql on mac os x now and I
get substantial performance improvements mainly from faster disk i/o.
I wonder if we should look into finding optimal filesystem for
postgresql. I'm using ext3 on linux.
>
> Jon
>
>
>
> _______________________________________________
> dev mailing list
> dev at openstreetmap.org
> http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev
>
More information about the dev
mailing list