[OSM-dev] osmtools first release

Sun Apr 27 15:13:10 BST 2008

Hello all

I'm proud to announce the first release of osmtools, a fast embedded
osm-data database. This is an alpha version, meant only to show what's
possible and to check where development should go. But it already
features an fcgi server which might be somewhat usable for the export
tab.
From the readme:

Introduction
------------

Osmtools is an embedded database for osm-data together with some
programs that work on the database. The goal is to provide an easy way
to write fast osm-applications, independent of the database size.
The data model is a bit different from normal osm data because ways do
not consist of an ordered list of node-ids, which have to be looked up
in order to work on the data. Instead, ways directly consist of a
number of points. This should speed up, for example, rendering to svg
or routing by a huge amount.
Overhead is also reduced, as the completely redundant id<->point
assignment is removed.
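
A minimal sketch of the difference between the two data models, written
in Python for brevity. The names RefWay, PointWay and resolve are made up
for illustration and are not part of osmtools; the real on-disk layout
is not shown here.

```python
from dataclasses import dataclass, field

Point = tuple[float, float]  # (lat, lon); the order is an assumption


@dataclass
class RefWay:
    """Way as in a normal osm file: an ordered list of node ids."""
    node_ids: list[int]
    tags: dict[str, str] = field(default_factory=dict)


@dataclass
class PointWay:
    """Way as described above: the points are stored in the way itself."""
    points: list[Point]
    tags: dict[str, str] = field(default_factory=dict)


def resolve(way: RefWay, nodes: dict[int, Point]) -> PointWay:
    """Look up every node id once, so readers never need the id table."""
    return PointWay([nodes[i] for i in way.node_ids], dict(way.tags))


# Hypothetical sample data.
nodes = {1: (52.0, 13.0), 2: (52.1, 13.1), 3: (52.2, 13.0)}
way = resolve(RefWay([1, 2, 3], {"highway": "residential"}), nodes)
print(way.points)
```

After resolve() a renderer or router can walk way.points directly,
without a per-node lookup.
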
Data is laid out spatially on disk to speed up access to
bounding-boxes. Caching is left to the OS, which works quite well, at
least on small datasets.
The database does not require much memory. Uncompressed, the database
folder takes a bit more disk space than the compressed osm-file.
Currently the only way to access the database is an iterator which
iterates through the data specified by a bounding box and a tag
filter. The filter is a number of key-value pairs where key and value
are regular expressions as understood by pcre (Perl Compatible Regular
Expressions).
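
The idea of an iterator driven by a bounding box and a tag filter can be
sketched as follows. This is not the osmtools API: the function names
and the in-memory element list are hypothetical, Python's re module
stands in for pcre, and the matching rules (every key/value pattern pair
must be matched by at least one tag; an element counts as inside the
bbox if any of its points is) are assumptions.

```python
import re
from typing import Iterable, Iterator

Point = tuple[float, float]                # (lon, lat), assumed order
BBox = tuple[float, float, float, float]   # (left, down, right, upper)
Element = tuple[list[Point], dict[str, str]]


def in_bbox(point: Point, bbox: BBox) -> bool:
    x, y = point
    left, down, right, upper = bbox
    return left <= x <= right and down <= y <= upper


def tags_match(tags: dict[str, str], filters) -> bool:
    # Assumed semantics: each (key, value) pattern pair needs one matching tag.
    return all(
        any(k_re.search(k) and v_re.search(v) for k, v in tags.items())
        for k_re, v_re in filters
    )


def iterate(elements: Iterable[Element], bbox: BBox,
            filters: list[tuple[str, str]]) -> Iterator[Element]:
    """Yield the elements that lie in bbox and pass the tag filter."""
    compiled = [(re.compile(k), re.compile(v)) for k, v in filters]
    for points, tags in elements:
        if any(in_bbox(p, bbox) for p in points) and tags_match(tags, compiled):
            yield points, tags


# Hypothetical sample data: a node is an element with a single point.
elements = [
    ([(13.40, 52.52)], {"amenity": "cafe"}),
    ([(13.39, 52.51), (13.41, 52.53)], {"highway": "residential"}),
    ([(2.35, 48.85)], {"amenity": "cafe"}),
]
bbox = (13.0, 52.0, 14.0, 53.0)
hits = list(iterate(elements, bbox, [("^amenity$", "caf")]))
print(hits)
```

An empty filter list lets every element inside the bbox through.
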

Current limitations:
- single threaded (slow import)
- no spatial requests (other than bbox)

Compilation
-----------
To compile you need cmake and the pcre and fcgi development files.
In the osmtools folder do

cmake .
make

and you should get the executables osm2db, osm_server and planet_stat.

Applications
------------
- osm2db

  usage: osm2db [options] input output-directory

  where options may be:
  -s key val : strips the key-value pair from all nodes

  osm2db will import the input osm file into a database which will
reside in the output directory, which must already exist. While
importing, stats are printed every 2 seconds.
  As all nodes that the ways consist of need to be looked up and the
import is single threaded at the moment, the import will get very slow
when the index doesn't fit into memory. Don't try to import a planet
dump if you don't have at least 4, or better 8, gigs of memory.

- planet_stat

  usage: planet_stat [options] osm-folder
  options:
  -f key val : filter input with regular expressions
  -v : print information about the ways/nodes

  Simply counts the ways and nodes matching the filter. Additionally
prints information like tags and type.

- osm_server

  osm_server is an fcgi-application that serves osm-data. Data can be
filtered by regular expressions and a bbox.
  The server filters and loads data on the fly, which makes it
relatively fast: around 15MB/s when cached or 4MB/s from disk, running
on my notebook (sempron 2800+).
  The disk speed is for a single connection (worst case); multiple
concurrent downloads speed it up due to caching and IO-scheduling.
  The server is single threaded, which means that for each concurrent
connection a separate server process has to be started.
  Also note that the IDs of way nodes are changed due to the internal
data-model.

  The path for the database is read from the OSM_DB_PATH environment
variable.

  example fcgi configuration (lighttpd):

  fastcgi.server = ( "map" => ( "localhost" =>
                    ( "host" => "127.0.0.1",
                      "port" => 1080,
                      "min-procs" => 1,
                      "max-procs" => 50,
                      "check-local" => "disable",
                      "bin-path" => "/path_to_osmtools/osm_server",
                      "bin-environment" => ("OSM_DB_PATH" => "/path_to_osmtools_database/")
                 )))

  Please note that lighttpd will always have processes running, as
adaptive spawning is not currently supported.

  with this configuration requests may look like this:
  .../map?bbox=<left>,<down>,<right>,<upper>,key=<expression>,val=<expression>
  key and val may be repeated to expand the filter.
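
As an illustration of the request format above, here is a small Python
helper that assembles such a request string. The function name
map_request is made up, the ".../map" prefix is kept as a placeholder
for wherever the server is mounted, and no URL-escaping is done because
the readme does not say whether the server decodes it; treat it as a
sketch, not as a client that ships with osmtools.

```python
def map_request(base, bbox, filters=()):
    """Build a request in the form shown above.

    base    -- the '.../map' endpoint (placeholder, depends on your setup)
    bbox    -- (left, down, right, upper)
    filters -- sequence of (key_expression, val_expression) pairs
    """
    parts = ["bbox=" + ",".join(str(c) for c in bbox)]
    for key, val in filters:
        # key and val are repeated once per filter pair
        parts.append("key=" + key)
        parts.append("val=" + val)
    # The readme's example separates all parts with commas.
    return base + "?" + ",".join(parts)


url = map_request(".../map", (13.3, 52.4, 13.5, 52.6),
                  [("highway", "primary|secondary")])
print(url)
# .../map?bbox=13.3,52.4,13.5,52.6,key=highway,val=primary|secondary
```
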