[osmosis-dev] keyValueList option, and latest node versions

Brett Henderson brett at bretth.com
Mon Aug 2 14:05:48 BST 2010


Hi Simon,

Sorry for the really slow reply on this.  I was kinda hoping somebody else
with more knowledge would step in and answer your questions.  I'm not very
familiar with the various filtering tasks ...

Some comments below.

On Wed, Jul 7, 2010 at 3:41 AM, Simon Nuttall <simon.nuttall at gmail.com> wrote:

> We use osmosis to merge geofabrik's UK and Ireland extracts to fill a
> database from which we build the CycleStreets.net routing structures.
>
> osmosis
>  -v 100
>  --read-xml data/osm/downloads/great_britain.osm enableDateParsing=no
>  --sort-0.6 type="TypeThenId"
>  --read-xml data/osm/downloads/ireland.osm enableDateParsing=no
>  --sort-0.6 type="TypeThenId"
>  --merge --write-apidb dbType=mysql populateCurrentTables=no
> host=localhost database=britainOSM user=import password=xxx
> validateSchemaVersion=no
>
> That currently takes about 1.5 hours to run, building a 4.5 GB database.
>
> 1. I have just discovered the keyValueList option.
>
> I could use this as a long list of  e.g.
>
> "highway.residential,highway.unclassified,highway.primary,...cycleway.opposite,access.yes...,foot...,bicycle..."
> pairs to filter the ways. I'd estimate there would be about 50 pairs
> in the list.
>
> My question here is would that speed up or slow down the osmosis?
>

If you're writing all output to a database I suspect it would speed the
process up.  The --node-key-value and --way-key-value tasks are both very
simple in implementation and do a HashMap lookup to check whether each
node/way should be included.  It could possibly be made more efficient
through the use of a Patricia-Trie index or somesuch, but I suspect it will
be fairly fast anyway.  You may have to write out node and way data to temp
files and then merge them together afterwards, because each --xxxx-key-value
task only passes through the type of data it cares about.  These tasks don't
support wildcards.
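
To illustrate the kind of lookup described above, here is a minimal Python
sketch.  It is not the Osmosis implementation; the function names and the
sample ways are made up for illustration, and it only assumes the
"key.value,key.value" list format from the question.

```python
# Hypothetical sketch of key.value filtering; not Osmosis code.

def parse_key_value_list(spec):
    """Turn "highway.residential,cycleway.opposite" into a set of (key, value) pairs."""
    pairs = set()
    for item in spec.split(","):
        key, _, value = item.strip().partition(".")
        if key and value:
            pairs.add((key, value))
    return pairs


def way_matches(tags, allowed):
    """Return True if any tag on the way is in the allowed set (one hash lookup per tag)."""
    return any((k, v) in allowed for k, v in tags.items())


allowed = parse_key_value_list("highway.residential,highway.primary,cycleway.opposite")

# Made-up sample data standing in for ways read from an extract.
ways = [
    {"id": 1, "tags": {"highway": "residential", "name": "Mill Road"}},
    {"id": 2, "tags": {"building": "yes"}},
    {"id": 3, "tags": {"highway": "service", "cycleway": "opposite"}},
]

kept = [w["id"] for w in ways if way_matches(w["tags"], allowed)]
print(kept)
```

Because each check is a hash lookup, the cost per way depends on the number
of tags on that way rather than on the length of the list, so a list of
about 50 pairs should not be noticeably slower than a list of 5.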

You may wish to check out the --tag-filter task which is much more
flexible.  I believe you'll be able to use wildcards as you require, and it
can also operate entirely using a single stream rather than requiring the
use of temporary files as per the --xxxx-key-value tasks.  I have no idea
how well it performs, but being able to use wildcards should help somewhat.
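
As a rough picture of what wildcard filtering on a single stream could look
like, here is a small Python sketch.  It does not reproduce the --tag-filter
syntax or implementation; the pattern format, function names and sample
entities are assumptions made for illustration only.

```python
# Hypothetical sketch of wildcard tag filtering on one stream; not Osmosis code.
from fnmatch import fnmatchcase


def tag_matches(tags, patterns):
    """Return True if any (key, value_pattern) pair matches a tag, e.g. ("cycleway", "*")."""
    for key, value_pattern in patterns:
        value = tags.get(key)
        if value is not None and fnmatchcase(value, value_pattern):
            return True
    return False


def filter_stream(entities, patterns):
    """Pass non-way entities through untouched and keep only matching ways."""
    for entity in entities:
        if entity["type"] != "way" or tag_matches(entity["tags"], patterns):
            yield entity


patterns = [("cycleway", "*"), ("highway", "residential")]

# Made-up sample stream mixing nodes and ways.
stream = [
    {"type": "node", "id": 10, "tags": {}},
    {"type": "way", "id": 1, "tags": {"cycleway": "lane"}},
    {"type": "way", "id": 2, "tags": {"highway": "motorway"}},
]

result = [(e["type"], e["id"]) for e in filter_stream(stream, patterns)]
print(result)
```

The point of the sketch is that nodes and ways travel through the same
filter in one pass, so no temporary files are needed to recombine them.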


>
> Is there any scope for wildcards in that list, e.g. cycleway.* ?
>

Yes, using the --tag-filter task.


>
>
> 2.  Duplicate ways / multiple versions of nodes and ways.
>
> The processing we do in CycleStreets after this osmosis detects
> hundreds of duplicate ways and a few tens of duplicate nodes.
>
> Are there osmosis options for just catching the latest versions of the
> nodes and ways?
> (I have to say I don't really understand this area very well.)
>

I suspect your input files also have these duplicates.  Is that the case?

There is no task for fixing files that have multiple versions, but there is
a trick for using the --simplify-change task which I'll have to dig up ...
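
In the meantime, the idea of keeping only the latest version of each entity
can be sketched in Python as follows.  This is not the --simplify-change
trick and not Osmosis code; the dictionary layout of the entities and the
function name are assumptions made for illustration.

```python
# Hypothetical sketch of "keep only the latest version"; not Osmosis code.

# Sort order matching the TypeThenId ordering used in the command above.
TYPE_ORDER = {"node": 0, "way": 1, "relation": 2}


def latest_versions(entities):
    """Keep the highest-version entity for each (type, id), sorted by type then id."""
    latest = {}
    for entity in entities:
        key = (entity["type"], entity["id"])
        if key not in latest or entity["version"] > latest[key]["version"]:
            latest[key] = entity
    return sorted(latest.values(), key=lambda e: (TYPE_ORDER[e["type"]], e["id"]))


# Made-up sample data containing duplicate versions of a way and a node.
entities = [
    {"type": "way", "id": 5, "version": 2},
    {"type": "node", "id": 7, "version": 1},
    {"type": "way", "id": 5, "version": 3},
    {"type": "node", "id": 7, "version": 4},
    {"type": "node", "id": 2, "version": 1},
]

deduplicated = [(e["type"], e["id"], e["version"]) for e in latest_versions(entities)]
print(deduplicated)
```

This holds one entry per entity in memory, which is fine for a sketch but
may not scale to a full country extract without a sorted, streaming
approach.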

Brett